Elasticsearch Health Status Red: Lessons Learned the Hard Way
Hello! I’m Soumil Nitin Shah, a software and hardware developer based in New York City. I completed my Bachelor's in Electronic Engineering and a double Master's in Computer and Electrical Engineering. I develop Python-based cross-platform desktop applications, web pages, software, REST APIs, databases, and much more.
In this article I will share my personal experience working with Elasticsearch and the lessons I learned the hard way. I have been working with Elasticsearch for the past 6 months, and much of that time I was dealing with a red health status caused by problems such as failed or unassigned shards. Here I will share how you can debug the problem and get the cluster healthy again.
Step 1: Identify the Problem:
curl -XGET 'localhost:9200/_cluster/allocation/explain?pretty'
Let's look at a sample response:
{
  "index" : "testing",
  "shard" : 0,
  "primary" : false,
  "current_state" : "unassigned",
  "unassigned_info" : {
    "reason" : "INDEX_CREATED",
    "at" : "2018-04-09T21:48:23.293Z",
    "last_allocation_status" : "no_attempt"
  },
  "can_allocate" : "no",
  "allocate_explanation" : "cannot allocate because allocation is not permitted to any of the nodes",
  "node_allocation_decisions" : [
    {
      "node_id" : "XXXXXXXXX",
      "node_name" : "XXXXXXXX",
      "transport_address" : "127.0.0.1:9300",
      "node_decision" : "no",
      "weight_ranking" : 1,
      "deciders" : [
        {
          "decider" : "same_shard",
          "decision" : "NO",
          "explanation" : "the shard cannot be allocated to the same node on which a copy of the shard already exists"
        }
      ]
    }
  ]
}
In this case, the API clearly explains why the replica shard remains unassigned: “the shard cannot be allocated to the same node on which a copy of the shard already exists”.
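When the response is long, it can help to pull out only the deciders that said NO. The following is a minimal Python sketch, not an official client: the function name `blocking_deciders` is made up for illustration, and the embedded JSON is a trimmed copy of the sample response above.

```python
import json


def blocking_deciders(explain):
    """Collect (node_name, decider, explanation) for each decider that answered NO."""
    blocked = []
    for node in explain.get("node_allocation_decisions", []):
        for decider in node.get("deciders", []):
            if decider.get("decision") == "NO":
                blocked.append((
                    node.get("node_name"),
                    decider.get("decider"),
                    decider.get("explanation"),
                ))
    return blocked


# Trimmed copy of the sample response shown above.
sample = json.loads("""
{
  "index": "testing",
  "shard": 0,
  "primary": false,
  "current_state": "unassigned",
  "can_allocate": "no",
  "node_allocation_decisions": [
    {
      "node_name": "XXXXXXXX",
      "node_decision": "no",
      "deciders": [
        {
          "decider": "same_shard",
          "decision": "NO",
          "explanation": "the shard cannot be allocated to the same node on which a copy of the shard already exists"
        }
      ]
    }
  ]
}
""")

kind = "primary" if sample["primary"] else "replica"
print(f"{sample['index']} shard {sample['shard']} ({kind}) is {sample['current_state']}")
for node_name, decider, explanation in blocking_deciders(sample):
    print(f"  {node_name}: {decider}: {explanation}")
```

In practice you would feed the function the parsed JSON returned by the allocation explain call instead of the hard-coded sample.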
If it looks like the unassigned shards belong to an index you thought you deleted already, or an outdated index that you don’t need anymore, then you can delete the index to restore your cluster status to green.
The following are common reasons for unassigned shards:
* Reason 1: Shard allocation is purposefully delayed
* Reason 2: Too many shards, not enough nodes
* Reason 3: You need to re-enable shard allocation
* Reason 4: Shard data no longer exists in the cluster
* Reason 5: Low disk watermark
* Reason 6: Multiple Elasticsearch versions
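To see which of these reasons applies, it helps to list every unassigned shard together with the reason Elasticsearch recorded for it. The sketch below assumes the `_cat/shards` API with `format=json` and the `unassigned.reason` column, and reuses the `localhost:9200` address from the curl call earlier. The function names and the sample rows are made up for illustration.

```python
import json
import urllib.request
from collections import Counter

# Columns requested from _cat/shards; "unassigned.reason" is empty for assigned shards.
CAT_SHARDS_PATH = "/_cat/shards?format=json&h=index,shard,prirep,state,unassigned.reason"


def fetch_cat_shards(base_url="http://localhost:9200"):
    """Fetch one row per shard copy as a list of dicts (needs a reachable cluster)."""
    with urllib.request.urlopen(base_url + CAT_SHARDS_PATH) as resp:
        return json.load(resp)


def unassigned_shards(rows):
    """Keep only the rows whose state is UNASSIGNED."""
    return [row for row in rows if row.get("state") == "UNASSIGNED"]


def count_by_reason(rows):
    """Count unassigned shard copies per recorded reason."""
    return dict(Counter(row.get("unassigned.reason") for row in unassigned_shards(rows)))


# Made-up rows in the shape _cat/shards returns, so the helpers can be tried offline.
sample_rows = [
    {"index": "testing", "shard": "0", "prirep": "p", "state": "STARTED", "unassigned.reason": None},
    {"index": "testing", "shard": "0", "prirep": "r", "state": "UNASSIGNED", "unassigned.reason": "INDEX_CREATED"},
    {"index": "logs", "shard": "1", "prirep": "r", "state": "UNASSIGNED", "unassigned.reason": "NODE_LEFT"},
]

for row in unassigned_shards(sample_rows):
    print(row["index"], row["shard"], row["prirep"], row["unassigned.reason"])
print(count_by_reason(sample_rows))
```

Against a live cluster, `unassigned_shards(fetch_cat_shards())` would replace the sample rows.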
Solutions
After spending many hours on the internet and reading a lot of Stack Overflow posts, here are some things you can try to get the cluster back to green:
1. Clear the cache
POST /<indexname>/_cache/clear
2. Increase max allocation retries
PUT <indexname>/_settings { "index.allocation.max_retries" : 10 }
3. Delete all scroll contexts
DELETE /_search/scroll/_all
4. Increase the delayed allocation timeout
PUT /<indexname>/_settings?pretty { "settings": { "index.unassigned.node_left.delayed_timeout": "10m" } }
5. Increase the replica count to 1, wait for some time, and change it back to 0
PUT <indexname>/_settings { "index.number_of_replicas":1 }
PUT <indexname>/_settings { "index.number_of_replicas":0 }
6. If you manage the cluster yourself rather than using AWS, try rerouting. Be aware that allocate_empty_primary assigns an empty shard, so any data that shard previously held is lost.
POST /_cluster/reroute?pretty { "commands" : [ { "allocate_empty_primary" : { "index" : "constant-updates", "shard" : 0, "node" : "<nodename>", "accept_data_loss" : true } } ] }
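Because the reroute body is easy to get wrong by hand, here is a small Python sketch that builds it. It assumes the `node` field takes the name of a node in the cluster and that `accept_data_loss` is sent as a JSON boolean. The helper name `empty_primary_command` and the node name `node-1` are hypothetical.

```python
import json


def empty_primary_command(index, shard, node_name):
    """Build a _cluster/reroute body that allocates an EMPTY primary shard.

    Whatever the shard held before is discarded, which is why
    accept_data_loss has to be set explicitly.
    """
    if shard < 0:
        raise ValueError("shard number must be non-negative")
    return {
        "commands": [
            {
                "allocate_empty_primary": {
                    "index": index,
                    "shard": shard,
                    "node": node_name,  # a node name, not an index name
                    "accept_data_loss": True,
                }
            }
        ]
    }


# "node-1" is a placeholder; substitute a real node name from your cluster.
body = empty_primary_command("constant-updates", 0, "node-1")
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the request body of the reroute call.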
As a last resort, if nothing else works, delete the index or the cluster and start again.
Some suggestions that are always good to follow:
Shards: How many shards? Roughly 25 GB per shard is usually a good target, so if you have 250 GB of data, go for 10 shards, or up to 20.
Replicas: How many do you need? It is usually a good idea to have 1 replica in QA and 3 replicas in PROD. [Note: the total stored data size grows with each replica.]
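The arithmetic behind both suggestions can be sketched in a few lines of Python. This is only a rough estimate under the rule of thumb above: it assumes each replica is a full copy of the primary data and ignores compression and index overhead. The function names are made up for illustration.

```python
import math


def shard_count(data_gb, target_gb_per_shard=25.0):
    """Number of primary shards so that each holds at most about target_gb_per_shard."""
    if data_gb <= 0 or target_gb_per_shard <= 0:
        raise ValueError("sizes must be positive")
    return math.ceil(data_gb / target_gb_per_shard)


def total_storage_gb(primary_gb, replicas):
    """Primary data plus one full copy per replica."""
    if replicas < 0:
        raise ValueError("replicas must be non-negative")
    return primary_gb * (1 + replicas)


print(shard_count(250))          # 250 GB at ~25 GB per shard -> 10
print(total_storage_gb(250, 1))  # QA with 1 replica -> 500
print(total_storage_gb(250, 3))  # PROD with 3 replicas -> 1000
```

So the same 250 GB of data needs roughly 500 GB of disk with 1 replica and roughly 1000 GB with 3.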
Happy ELK
References :
- https://stackoverflow.com/questions/44383601/aws-elastic-search-forbidden-8-index-write-api-unable-to-write-to-index
- https://aws.amazon.com/premiumsupport/knowledge-center/elasticsearch-red-yellow-status/
- https://www.elastic.co/blog/red-elasticsearch-cluster-panic-no-longer
- https://stackoverflow.com/questions/48337264/elastic-search-cluster-is-shown-as-red-how-to-recover
- https://hellokangning.github.io/en/post/fixing-elasticsearch-with-red-status/