In a previous blog post I presented how the Log Management architecture fits in a NetEye cluster, and now I want to summarize my recent experiences to help you diagnose Elasticsearch health issues.
Elasticsearch provides a set of APIs that help you identify and debug a number of potential causes of cluster problems. However, NetEye Log Management is secured by the Search Guard Compliance Edition, which tracks access to log data and enforces security policies to fulfill the requirements of the GDPR. As a result, you must pass security-related parameters whenever you use the Elasticsearch API.
To make the API easier to use, we created a set of utilities that supply all the additional parameters, such as authentication certificates, needed to interact with the Elastic Stack secured by Search Guard. In particular, to access the Elasticsearch APIs I recommend using /usr/share/neteye/scripts/searchguard/sg_curl.sh, which is a wrapper around the curl utility.
Details on this and other related scripts can be found in NetEye itself by navigating to User Guide > Log Manager > Search Guard Helper Tools.
The quickest overview of an Elasticsearch cluster's status comes from the cluster health API, which you can query by running the following command:
# /usr/share/neteye/scripts/searchguard/sg_curl.sh -X GET "https://elasticsearch.neteyelocal:9200/_cluster/health?pretty"
If the status is yellow or red, your indices are not fully healthy. In particular, yellow means that all primary shards are allocated but at least one replica shard is not, while red means that at least one primary shard is not allocated. In other words, with a yellow status there is a real risk of losing data if a node fails, and with a red status some data is already unavailable.
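The interpretation above can be sketched as a small shell helper. This is only a minimal sketch: the function names `health_status` and `describe_status` are hypothetical, and the mapping to monitoring-plugin exit codes (0, 1, 2, 3) is an assumption rather than anything NetEye ships.

```shell
#!/bin/sh
# Sketch: extract the "status" field from a cluster health response read on stdin.
# Uses tr and sed only, so no JSON tool such as jq has to be installed.
health_status() {
    tr -d '\n' | sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p'
}

# Map the status to a short message and an exit code
# (assumed convention: 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN).
describe_status() {
    case "$1" in
        green)  echo "OK: all primary and replica shards are allocated"; return 0 ;;
        yellow) echo "WARNING: at least one replica shard is not allocated"; return 1 ;;
        red)    echo "CRITICAL: at least one primary shard is not allocated"; return 2 ;;
        *)      echo "UNKNOWN: could not read the cluster status"; return 3 ;;
    esac
}

# Usage on a NetEye node:
#   status=$(/usr/share/neteye/scripts/searchguard/sg_curl.sh -s -X GET \
#       "https://elasticsearch.neteyelocal:9200/_cluster/health" | health_status)
#   describe_status "$status"
```

Keeping the parsing and the interpretation in separate functions makes it possible to try the logic against a saved response before pointing it at a live cluster.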
Understanding the index and shard status helps to identify cluster problems.
When a cluster is not working well, the cause is often an index in poor health, which in turn is caused by one or more unallocated shards. You can list all indices together with their health status by running:
# /usr/share/neteye/scripts/searchguard/sg_curl.sh -s -X GET "https://elasticsearch.neteyelocal:9200/_cat/indices?v"
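On a cluster with many indices, it may help to narrow the listing down to the unhealthy ones. The sketch below assumes you request only the `health` and `index` columns through the `h` parameter of the cat indices API; the function name `unhealthy_indices` is hypothetical.

```shell
#!/bin/sh
# Sketch: keep only yellow and red indices from the output of
# "_cat/indices?h=health,index" read on stdin.
# Expected input: two whitespace-separated columns (health, index name).
unhealthy_indices() {
    awk '$1 == "yellow" || $1 == "red" { print $1, $2 }'
}

# Usage on a NetEye node:
#   /usr/share/neteye/scripts/searchguard/sg_curl.sh -s -X GET \
#       "https://elasticsearch.neteyelocal:9200/_cat/indices?h=health,index" | unhealthy_indices
```

Requesting explicit columns keeps the filter independent of the default column order of the cat API.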
If an index is marked with a “yellow” or “red” status, we can check the status of the index’s shards:
# /usr/share/neteye/scripts/searchguard/sg_curl.sh -s -X GET "https://elasticsearch.neteyelocal:9200/_cat/shards/<index in red or yellow>"
Shards contain the actual data in an Elasticsearch cluster, and can be relocated or replicated to different cluster nodes. If a shard cannot be allocated for any reason (e.g., not enough disk space, or one or more nodes are not connected), it remains in a state other than STARTED, such as UNASSIGNED.
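To see which shards are affected and why, one option is to ask the cat shards API for the `state` and `unassigned.reason` columns and filter out everything that is STARTED. The following is a minimal sketch under that assumption; the function name `problem_shards` and its output format are hypothetical.

```shell
#!/bin/sh
# Sketch: print shards whose state is not STARTED from the output of
# "_cat/shards?h=index,shard,prirep,state,unassigned.reason" read on stdin.
# Columns: index, shard number, p (primary) or r (replica), state, reason.
# The reason column is empty for assigned shards, so it is shown as "-".
problem_shards() {
    awk 'NF >= 4 && $4 != "STARTED" {
        reason = ($5 == "") ? "-" : $5
        printf "%s shard=%s type=%s state=%s reason=%s\n", $1, $2, $3, $4, reason
    }'
}

# Usage on a NetEye node:
#   /usr/share/neteye/scripts/searchguard/sg_curl.sh -s -X GET \
#       "https://elasticsearch.neteyelocal:9200/_cat/shards/<index in red or yellow>?h=index,shard,prirep,state,unassigned.reason" \
#       | problem_shards
#
# For more detail on an unassigned shard, Elasticsearch also provides the
# cluster allocation explain API:
#   /usr/share/neteye/scripts/searchguard/sg_curl.sh -s -X GET \
#       "https://elasticsearch.neteyelocal:9200/_cluster/allocation/explain?pretty"
```

Called without a request body, the allocation explain API reports on the first unassigned shard it finds, and returns an error if no shard is unassigned.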
More information can be found at User Guide > Troubleshooting > Elasticsearch is not functioning properly.