Optimal Shard Placement in a Petabyte Scale Elasticsearch Cluster
The number of shards on each node, and tries to balance the number of shards per node evenly across the clusterThe high and low disk watermarks. Elasticsearch considers the available disk space on a node before deciding whether to allocate new shards to that node or to actively relocate shards away from that node. A nodes that has reached the low watermark (i.e 80% disk used) is not allowed receive any more shards.
A node that has reached the high watermark (i.e 90%) will start to actively move shards away from it. The high and low disk watermarks. Elasticsearch considers the available disk space on a node before deciding whether to allocate new shards to that node or to actively relocate shards away from that node.
A nodes that has reached the low watermark (i.e 80% disk used) is not allowed receive any more shards. A node that has reached the high watermark (i.e 90%) will start to actively move shards away from it. New indices created and old indices dropping.
Disk watermark triggers due to indexing and other shard movements. Elasticsearch randomly deciding that a node has too few/too many shards compared to the cluster average. Hardware and OS-level failures causing new AWS instances to spin up and join the cluster.
With 500+ nodes this happens several times a week on average. New nodes added, almost every week because of normal data growth.
Source: meltwater.com