Elasticsearch Cluster setup
Installing and configuring an Elasticsearch cluster for production use.
Elasticsearch is one of the most popular full-text search engines, used for real-time distributed search and analysis of data.
In my previous post, we saw “How to install and configure Elasticsearch in single node”. A single node is fine for development or for a small data set, but it is not recommended once you have a large amount of data. If you are dealing with a large data set and want zero downtime, I prefer clustering: you add more nodes, install Elasticsearch on each of them, and configure them as data nodes.
Adding a node to an Elasticsearch cluster is very simple compared to other clustering applications on the market at present. You can add resources as your data grows, which is horizontal scaling.
To form an Elasticsearch cluster you need at least 3 nodes; the maximum depends on your purpose and data.
Why do I say 3 is the minimum? See the role each node is going to play:
Node 1 – Make it a client node, which does not store any data.
Node 2 – Make it a master node; it can still store data.
Node 3 – Make it a data node; if the master node fails, the data node becomes master and serves the client.
In this tutorial I am going to use a 5-node cluster on 5 “Red Hat Enterprise Linux Server release 6.4 (Santiago)” machines. The article applies to any Linux system, because I am using the TAR package to install and configure.
First download and extract Elasticsearch from the official site, or use the commands below to download it with wget and extract it.
# version 2.3.3
wget https://download.elastic.co/elasticsearch/release/org/elasticsearch/distribution/tar/elasticsearch/2.3.3/elasticsearch-2.3.3.tar.gz
tar -xvf elasticsearch-2.3.3.tar.gz

# general form: replace {VERSION} with the release you want
wget https://download.elastic.co/elasticsearch/release/org/elasticsearch/distribution/tar/elasticsearch/{VERSION}/elasticsearch-{VERSION}.tar.gz
tar -xvf elasticsearch-{VERSION}.tar.gz
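If you script the download for more than one version, the URL pattern above can be filled in programmatically. The following is only a minimal Python sketch: the function names `tarball_url` and `download_and_extract` are my own, and it assumes the 2.x download path shown above is valid for the version you pass in.

```python
import tarfile
import urllib.request

# URL pattern copied from the wget command above; {v} is the version string.
URL_PATTERN = (
    "https://download.elastic.co/elasticsearch/release/org/elasticsearch/"
    "distribution/tar/elasticsearch/{v}/elasticsearch-{v}.tar.gz"
)


def tarball_url(version):
    """Return the download URL for a version string such as "2.3.3"."""
    return URL_PATTERN.format(v=version)


def download_and_extract(version, dest="."):
    """Download the tarball to a temporary file and unpack it into dest.

    Hypothetical helper: it does the same job as the wget and tar commands.
    """
    archive, _headers = urllib.request.urlretrieve(tarball_url(version))
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(dest)


if __name__ == "__main__":
    # Only prints the URL; call download_and_extract("2.3.3") to fetch it.
    print(tarball_url("2.3.3"))
```

Running the same helper on every machine keeps all nodes on the same version.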
Once you have downloaded and extracted it, let's configure it to form a cluster.
Note that I am going to use 1 client node, 1 master node and 4 data nodes. The master node also serves as a data node, which is how 5 machines cover all the roles. In a production cluster, if you want a dedicated master node, you can disable the data role on it so that it serves only as a master.
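Note: you can make more than one node master-eligible, but only one node acts as master at a time. Be careful with this, because master-eligible nodes that lose contact with each other can end up in a split-brain state, with two masters in the same cluster. Elasticsearch has a built-in election policy that elects a new master node if the current master becomes unavailable.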
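A common guard against split brain in Elasticsearch 2.x is the `discovery.zen.minimum_master_nodes` setting, which I do not configure anywhere in this post. It is usually set to a majority of the master-eligible nodes. The arithmetic is sketched below in Python. The function name is my own, and the count of 4 assumes that every non-client node in this setup keeps `node.master` at its default of true.

```python
def minimum_master_nodes(master_eligible):
    """Return a majority (quorum) of master-eligible nodes: floor(n / 2) + 1."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


if __name__ == "__main__":
    # Assumption: the 4 non-client nodes in this post are all master-eligible.
    n = 4
    print("discovery.zen.minimum_master_nodes: %d" % minimum_master_nodes(n))
```

With 4 master-eligible nodes the quorum is 3, so a partition containing only 1 or 2 of them cannot elect its own master.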
All you need to edit is a single configuration file: ./config/elasticsearch.yml
Configuring Elasticsearch Cluster
Cluster name
First let's name our cluster. The name should be unique to this cluster.
Uncomment the following line and name your cluster, as I did below. Use the same cluster name on all nodes in your cluster.
cluster.name: Production-KL
Node name
This setting is unique to each node: each node must have a different name so that it can be identified. If you leave it blank, Elasticsearch assigns a random name.
I have listed all 5 names below, numbered 1 to 5. You can also use the hostname.
node.name: KL_ES_Node_1
node.name: KL_ES_Node_2
node.name: KL_ES_Node_3
node.name: KL_ES_Node_4
node.name: KL_ES_Node_5
or
node.name: ${HOSTNAME}
Once you have added the node name, add the config below in the same node section. It sets whether the node is a client, master or data node.
I am going to make Node 1 the client and Node 5 the master. Nodes 2 to 4, together with the master node, are the data nodes.
Add the following on your client node to make it a client.
Elasticsearch cluster – Client node setup
node.client: true
node.data: false
From the above config, we have made 1 node a client, which can be used as the point of contact to POST or GET data.
Note: you can’t make your client node a data node. If you set node.data: true on it, you will get the exception below on startup. This is not supported in Elasticsearch.
Exception in Elasticsearch cluster setup – when a client node is enabled as a data node.
Exception in thread "main" java.lang.IllegalStateException: node is not configured to store local location
    at org.elasticsearch.env.NodeEnvironment.nodeDataPaths(NodeEnvironment.java:633)
    at org.elasticsearch.gateway.GatewayMetaState.ensureNoPre019ShardState(GatewayMetaState.java:242)
    at org.elasticsearch.gateway.GatewayMetaState.<init>(GatewayMetaState.java:77)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at <<<guice>>>
    at org.elasticsearch.node.Node.<init>(Node.java:213)
    at org.elasticsearch.node.Node.<init>(Node.java:140)
    at org.elasticsearch.node.NodeBuilder.build(NodeBuilder.java:143)
    at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:178)
    at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:270)
    at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:35)
Note: In Elasticsearch, the client node acts as a load balancer: it receives requests and forwards them to the nodes that hold the data.
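To catch the client-plus-data mistake before starting a node, the role settings can be checked up front. The following is a rough Python sketch, not an Elasticsearch feature: the function name `check_roles` is my own, and it only knows about the one conflict described above.

```python
def check_roles(settings):
    """Return a list of problems found in a node's role settings.

    settings maps keys such as "node.client" to booleans, as they would
    appear in elasticsearch.yml. An empty list means no conflict was found.
    """
    problems = []
    if settings.get("node.client") and settings.get("node.data"):
        problems.append(
            "node.client: true cannot be combined with node.data: true"
        )
    return problems


if __name__ == "__main__":
    bad = {"node.client": True, "node.data": True}
    for problem in check_roles(bad):
        print("config error:", problem)
```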
Elasticsearch cluster – Data node setup
node.client: false
node.data: true
A data node holds data and performs data operations such as CRUD, search and aggregations.
Elasticsearch cluster – Master node setup
node.master: true
node.data: true
You can set node.data: true to store data on the master node as well, but not on the client node. If you want a dedicated master node, set node.data: false.
The master node is the one that creates and deletes indices and controls the cluster. If your master node goes down, Elasticsearch automatically elects a new master node and continues to serve client queries.
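The role settings from the three sections above can be kept in one place and rendered per node. This is a minimal Python sketch under my own naming: the role labels and the `render_role` function are hypothetical, while the setting names and values are the ones used in this post.

```python
# Role settings as used in this post (Elasticsearch 2.x setting names).
# The dictionary keys are labels chosen for this sketch only.
ROLE_SETTINGS = {
    "client": {"node.client": True, "node.data": False},
    "data": {"node.client": False, "node.data": True},
    "master+data": {"node.master": True, "node.data": True},
    "dedicated-master": {"node.master": True, "node.data": False},
}


def render_role(role):
    """Return the elasticsearch.yml lines for one of the roles above."""
    settings = ROLE_SETTINGS[role]
    return "\n".join(
        "%s: %s" % (key, str(value).lower()) for key, value in settings.items()
    )


if __name__ == "__main__":
    for role in ROLE_SETTINGS:
        print("# " + role)
        print(render_role(role))
```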
Elasticsearch cluster Network section
Set network.host to the node's IP. Set the parameter below on the respective nodes.
# client node (Node 1)
network.host: 10.1.14.10
# data nodes (Nodes 2 to 4)
network.host: 10.1.14.11
network.host: 10.1.14.12
network.host: 10.1.14.13
# master node + data node (Node 5)
network.host: 10.1.14.14
Elasticsearch cluster – Zen Discovery
This setting lets each node discover all the other nodes in the cluster using unicast ping discovery, with multicast disabled.
Set this parameter on every node in the cluster. Instead of IPs, you can also use hostnames.
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["10.1.14.10", "10.1.14.11", "10.1.14.12", "10.1.14.13", "10.1.14.14"]
# instead of IPs, you can enter hostnames
discovery.zen.ping.unicast.hosts: ["elasticsearch-client-node", "elasticsearch-master-node", "elasticsearch-data-node1", "elasticsearch-data-node2", "elasticsearch-data-node3"]
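Because the host list must be identical on every node, it helps to generate these two lines from a single list. A small Python sketch follows; the function name `render_discovery` is my own and the IP range is the one used in this post.

```python
import json


def render_discovery(hosts):
    """Return the two Zen discovery lines for elasticsearch.yml."""
    return "\n".join([
        "discovery.zen.ping.multicast.enabled: false",
        # json.dumps yields ["a", "b"], which is also a valid YAML flow list.
        "discovery.zen.ping.unicast.hosts: %s" % json.dumps(list(hosts)),
    ])


if __name__ == "__main__":
    # 10.1.14.10 through 10.1.14.14, the five nodes of this cluster.
    print(render_discovery("10.1.14.%d" % i for i in range(10, 15)))
```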
That’s it, you are done: you have successfully configured the Elasticsearch cluster. If you want to view and manage the cluster in a UI, install the plugin below on the client node using the default plugin manager.
Installing it on the client node alone is enough.
./bin/plugin install mobz/elasticsearch-head
After making all the configuration changes across the cluster, start Elasticsearch on all nodes in the order below.
- Master node
- Data nodes or Client node
Start the master node first; otherwise Elasticsearch will elect a master on its own from the nodes that are already running.
Once everything has started, check that the cluster is up and that all the nodes have joined. Issue the query below in a REST client or in a browser.
http://10.1.14.10:9200/_cluster/state?pretty=true
or
http://elasticsearch-client-node:9200/_cluster/state?pretty=true
You will get output similar to the one below.
Elasticsearch Cluster state
{
  "cluster_name" : "Production-KL",
  "version" : 18,
  "state_uuid" : "UKRFnQ2IQwiGSGCACDmzbA",
  "master_node" : "3FZi6C3OSWm3n8waqCGesg",
  "blocks" : { },
  "nodes" : {
    "t61dJNrERLqTZ_M0Bet-nw" : {
      "name" : "KL_ES_Node_2",
      "transport_address" : "10.1.14.11:9300",
      "attributes" : { }
    },
    "0PvMd_UmTmCS1A9uuCZtLg" : {
      "name" : "KL_ES_Node_1",
      "transport_address" : "10.1.14.10:9300",
      "attributes" : {
        "client" : "true",
        "data" : "false"
      }
    },
    "Xrczz3luQnuxQywsk9bAQQ" : {
      "name" : "KL_ES_Node_4",
      "transport_address" : "10.1.14.13:9300",
      "attributes" : { }
    },
    "xpElIrA_RdOHedAfDJ-oDg" : {
      "name" : "KL_ES_Node_3",
      "transport_address" : "10.1.14.12:9300",
      "attributes" : { }
    },
    "3FZi6C3OSWm3n8waqCGesg" : {
      "name" : "KL_ES_Node_5",
      "transport_address" : "10.1.14.14:9300",
      "attributes" : { }
    }
  },
  "metadata" : {
    "cluster_uuid" : "kEnRpUIkSQGNTUNna_iUCA",
    "templates" : { },
    "indices" : { }
  },
  "routing_table" : {
    "indices" : { }
  },
  "routing_nodes" : {
    "unassigned" : [ ],
    "nodes" : {
      "xpElIrA_RdOHedAfDJ-oDg" : [ ],
      "t61dJNrERLqTZ_M0Bet-nw" : [ ],
      "3FZi6C3OSWm3n8waqCGesg" : [ ],
      "Xrczz3luQnuxQywsk9bAQQ" : [ ]
    }
  }
}
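Instead of reading this output by eye, you can script the check. The sketch below is a minimal Python example: `fetch_cluster_state` and `summarize` are names I made up, and `SAMPLE` is a trimmed copy of the output above so that the parsing part runs without a cluster.

```python
import json
import urllib.request


def fetch_cluster_state(base_url):
    """GET <base_url>/_cluster/state and return the parsed JSON."""
    with urllib.request.urlopen(base_url + "/_cluster/state") as resp:
        return json.loads(resp.read().decode("utf-8"))


def summarize(state):
    """Return (master node name, sorted node names) from a cluster state."""
    nodes = state["nodes"]
    master = nodes[state["master_node"]]["name"]
    return master, sorted(node["name"] for node in nodes.values())


# Trimmed sample of the output above, kept inline so the sketch runs offline.
SAMPLE = {
    "master_node": "3FZi6C3OSWm3n8waqCGesg",
    "nodes": {
        "0PvMd_UmTmCS1A9uuCZtLg": {"name": "KL_ES_Node_1"},
        "3FZi6C3OSWm3n8waqCGesg": {"name": "KL_ES_Node_5"},
    },
}

if __name__ == "__main__":
    # Against a live cluster: summarize(fetch_cluster_state("http://10.1.14.10:9200"))
    print(summarize(SAMPLE))
```

If the list returned by `summarize` is shorter than the number of nodes you started, one of them has not joined the cluster.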
Elasticsearch cluster nodes and details
http://10.1.14.10:9200/_nodes/process?pretty=true
[OR]
http://elasticsearch-client-node:port/_nodes/process?pretty=true
Output in JSON format below:
{
  "cluster_name" : "Production-KL",
  "nodes" : {
    "UEUjWn0OSreVQUH-2H_lIg" : {
      "name" : "KL_ES_Node_5",
      "transport_address" : "10.1.14.14:9300",
      "host" : "10.1.14.14",
      "ip" : "10.1.14.14",
      "version" : "2.3.3",
      "build" : "218bdf1",
      "http_address" : "10.1.14.14:9200",
      "attributes" : {
        "master" : "true"
      },
      "process" : {
        "refresh_interval_in_millis" : 1000,
        "id" : 1025,
        "mlockall" : false
      }
    },
    "te7wSKECQrWTa8QESLf-NA" : {
      "name" : "sa-dpoc26",
      "transport_address" : "10.1.14.10:9300",
      "host" : "10.1.14.10",
      "ip" : "10.1.14.10",
      "version" : "2.3.3",
      "build" : "218bdf1",
      "http_address" : "10.1.14.10:9200",
      "attributes" : {
        "client" : "true",
        "data" : "false"
      },
      "process" : {
        "refresh_interval_in_millis" : 1000,
        "id" : 26264,
        "mlockall" : false
      }
    },
    "lJa4Zt9JQP6uXP_aMNpUkA" : {
      "name" : "KL_ES_Node_2",
      "transport_address" : "10.1.14.11:9300",
      "host" : "10.1.14.11",
      "ip" : "10.1.14.11",
      "version" : "2.3.3",
      "build" : "218bdf1",
      "http_address" : "10.1.14.11:9200",
      "process" : {
        "refresh_interval_in_millis" : 1000,
        "id" : 13700,
        "mlockall" : false
      }
    },
    "2nDwSHxYQXC0qhZC3DBE-A" : {
      "name" : "KL_ES_Node_3",
      "transport_address" : "10.1.14.12:9300",
      "host" : "10.1.14.12",
      "ip" : "10.1.14.12",
      "version" : "2.3.3",
      "build" : "218bdf1",
      "http_address" : "10.1.14.12:9200",
      "process" : {
        "refresh_interval_in_millis" : 1000,
        "id" : 21446,
        "mlockall" : false
      }
    },
    "5rMQOsUARcm9nopOB3MHDA" : {
      "name" : "KL_ES_Node_4",
      "transport_address" : "10.1.14.13:9300",
      "host" : "10.1.14.13",
      "ip" : "10.1.14.13",
      "version" : "2.3.3",
      "build" : "218bdf1",
      "http_address" : "10.1.14.13:9200",
      "process" : {
        "refresh_interval_in_millis" : 1000,
        "id" : 16491,
        "mlockall" : false
      }
    }
  }
}
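The "attributes" object in this output tells you which role each node was configured with. The Python sketch below shows one way to read it. The function names and role labels are my own, and it assumes that a missing key means the 2.x default, which is a master-eligible node that holds data.

```python
def classify(attributes):
    """Guess a node's role from the "attributes" object of a _nodes response.

    Values arrive as strings ("true" or "false"). Missing keys are treated
    as the defaults assumed here: master-eligible and data-holding.
    """
    if attributes.get("client") == "true":
        return "client"
    master = attributes.get("master", "true") == "true"
    data = attributes.get("data", "true") == "true"
    if master and data:
        return "master-eligible data node"
    if master:
        return "dedicated master"
    if data:
        return "data node"
    # Neither master-eligible nor data-holding: behaves like a client.
    return "client"


def roles_by_name(nodes_response):
    """Map each node name to its guessed role."""
    return {
        node["name"]: classify(node.get("attributes", {}))
        for node in nodes_response["nodes"].values()
    }
```

Passing the parsed output above to `roles_by_name` labels the node at 10.1.14.10 as a client and the remaining four as master-eligible data nodes.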
Elasticsearch GUI Plugin
Once you have installed the GUI plugin on the client node, you can manage and monitor the Elasticsearch cluster from the browser.
http://10.1.14.10:9200/_plugin/head/
[OR]
http://elasticsearch-client-node:port/_plugin/head/
Once you open this URL in the browser, you can see the cluster information. See the figure below.

Elasticsearch cluster head plugin