Scaling Down an Elasticsearch Cluster

Elasticsearch must be resilient to the failures of individual nodes. It achieves this resilience by considering cluster-state updates to be successful after a quorum of nodes have accepted them. A quorum is a carefully-chosen subset of the master-eligible nodes in a cluster.

Quorums must be carefully chosen so that the cluster cannot elect two independent masters that make inconsistent decisions, which would ultimately lead to data loss. See [1] for more detail.

Preparations before scaling down

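  • Back up your cluster so that you have something to restore if things go wrong down the line.
  • Make sure every index has at least one replica, so that a copy of each shard survives when a node is removed. For example, for the twitter index:
curl -X PUT "localhost:9200/twitter/_settings?pretty" -H 'Content-Type: application/json' -d'
{
  "index" : {
    "number_of_replicas" : 1
  }
}
'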
  • Re-balance the cluster gracefully before you start scaling down.
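
One common way to re-balance gracefully is to exclude the node you plan to remove from shard allocation, so that Elasticsearch relocates its shards to the remaining nodes before the node goes away. The sketch below makes several assumptions that are not in the text above: the helper names exclude_body and drain_node, the ES_URL variable, and the example IP are hypothetical. Only the cluster.routing.allocation.exclude._ip setting itself is a standard Elasticsearch setting.

```shell
# Assumption: the cluster is reachable here; override ES_URL if it is not.
ES_URL="${ES_URL:-localhost:9200}"

# Build the settings body that excludes one node IP from shard allocation.
exclude_body() {
  printf '{"transient":{"cluster.routing.allocation.exclude._ip":"%s"}}' "$1"
}

# Ask the cluster to move all shards off the node with the given IP.
drain_node() {
  curl -s -X PUT "$ES_URL/_cluster/settings?pretty" \
    -H 'Content-Type: application/json' \
    -d "$(exclude_body "$1")"
}
```

For example, drain_node 172.18.0.3 (a placeholder IP) would start moving shards off that node. Once _cat/shards no longer lists any shard on it, the node can be shut down. The setting is transient here, so it does not survive a full cluster restart, and setting it back to null removes the exclusion.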

Health

curl -X GET "localhost:9200/_cluster/health?pretty"

Expected Output

{
"cluster_name" : "\"es-data-cluster\"",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 1,
"number_of_data_nodes" : 1,
"active_primary_shards" : 0,
"active_shards" : 0,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 100.0
}

Shards

curl -X GET "localhost:9200/_cat/shards"

Expected Output

twitter 2 p STARTED 0   0b 172.18.0.2 es-node
twitter 1 p STARTED 0   0b 172.18.0.2 es-node
twitter 0 p STARTED 0 230b 172.18.0.2 es-node

When the cluster status is green and all shards are STARTED, you are good to go with scaling down.
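
The two checks above can be combined into a single pre-flight test. This is only a sketch under stated assumptions: the function names and the ES_URL variable are hypothetical, the status is extracted with sed rather than a real JSON parser, and the shard state is assumed to be the fourth column of the default _cat/shards output, as in the sample above.

```shell
# Assumption: the cluster is reachable here; override ES_URL if it is not.
ES_URL="${ES_URL:-localhost:9200}"

# Extract the "status" value from _cluster/health JSON read on stdin.
health_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p'
}

# Succeed only if no line of _cat/shards output on stdin has a
# state (4th column) other than STARTED.
all_started() {
  ! awk '$4 != "STARTED"' | grep -q .
}

# Succeed only when health is green and every shard is STARTED.
ready_to_scale_down() {
  status=$(curl -s "$ES_URL/_cluster/health" | health_status)
  [ "$status" = "green" ] || return 1
  curl -s "$ES_URL/_cat/shards" | all_started
}
```

A call such as ready_to_scale_down && echo "safe to remove a node" would then gate the next step.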

Steps to scale down

  • Remove one data node. The cluster will go into the yellow state. Now observe the following:

If the logs say "Marking shards as stale", the shard copies that lived on the removed node are no longer available for assignment and will be removed. Elasticsearch's built-in allocation then starts re-balancing the shards.

curl -X GET "localhost:9200/_cluster/allocation/explain?pretty"

This command will provide explanations for shard allocations in the cluster in detail.
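
Without a request body, the allocation explain API picks an arbitrary unassigned shard, and it returns an error if there are none. To ask about one specific shard, you can pass a body naming the index, shard number, and whether you mean the primary. The sketch below assumes the twitter index from earlier; the helper names and the ES_URL variable are hypothetical.

```shell
# Assumption: the cluster is reachable here; override ES_URL if it is not.
ES_URL="${ES_URL:-localhost:9200}"

# Build the request body: index name, shard number, primary (true/false).
explain_body() {
  printf '{"index":"%s","shard":%s,"primary":%s}' "$1" "$2" "$3"
}

# Ask why the given shard copy is (or is not) allocated where it is.
explain_shard() {
  curl -s -X GET "$ES_URL/_cluster/allocation/explain?pretty" \
    -H 'Content-Type: application/json' \
    -d "$(explain_body "$1" "$2" "$3")"
}
```

For example, explain_shard twitter 0 false asks about the replica copy of shard 0 of the twitter index.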

  • Wait for green. Once the status is green, the cluster has replicated the lost shards.

If the cluster health is red instead, at least one primary shard is unassigned. In that case you need to focus on the unassigned shards.
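
Waiting for green and finding unassigned shards can both be scripted. This is a rough sketch, not a definitive procedure: the function names and the ES_URL variable are hypothetical, while wait_for_status, timeout, and the unassigned.reason column are standard parameters of the health and _cat/shards APIs.

```shell
# Assumption: the cluster is reachable here; override ES_URL if it is not.
ES_URL="${ES_URL:-localhost:9200}"

# Block until the cluster is green or the timeout (default 60s) expires.
# Succeeds only if the returned status is green.
wait_for_green() {
  curl -s "$ES_URL/_cluster/health?wait_for_status=green&timeout=${1:-60s}" \
    | grep -Eq '"status"[[:space:]]*:[[:space:]]*"green"'
}

# Keep only the lines of _cat/shards output whose state column is UNASSIGNED.
only_unassigned() {
  awk '$4 == "UNASSIGNED"'
}

# List unassigned shards together with the reason they are unassigned.
list_unassigned() {
  curl -s "$ES_URL/_cat/shards?h=index,shard,prirep,state,unassigned.reason" \
    | only_unassigned
}
```

A typical use would be wait_for_green 120s || list_unassigned, which prints the shards that still need attention if the cluster does not turn green in time.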

Reference

[1]: https://www.elastic.co/guide/en/elasticsearch/reference/7.0/modules-discovery-quorums.html
[2]: https://www.elastic.co/blog/a-new-era-for-cluster-coordination-in-elasticsearch
[3]: https://www.elastic.co/guide/en/elasticsearch/reference/current/disk-allocator.html

[4]: https://blog.mapillary.com/tech/2017/01/12/scaling-down-an-elasticsearch-cluster.html
