Fail-safe Logstash Pipelines

This article describes how to establish a fail-safe mechanism for Logstash pipelines.


Logstash is an open-source data collection engine with real-time pipelining capabilities. It can dynamically unify data from disparate sources and normalize it into destinations of your choice, cleansing the data for diverse downstream analytics and visualization use cases. However, Logstash does not natively support high availability.

Typically, many data pipelines are configured on a single Logstash server. If that server goes down for any reason, ingestion stops for every pipeline configured on it, and real-time data stops flowing into the dashboards. Depending on the outage duration, the impact can be significant, irrespective of whether the deployment is on-premises or in the cloud.

Even if you have multiple Logstash servers, each one handles its own pipeline ingestion. You cannot set up the same data pipeline on two Logstash servers, as that would lead to data duplication.

So, what is the solution to this problem? How can ingestion continue even if a Logstash server goes down? Let's see.


High-level steps:

1. Have another server (VM or node) running Logstash.

2. Use a cron job to sync the SQL tracking files (sql_last_value) between the two Logstash servers, if the pipeline .conf files use a JDBC input (a sample cron entry appears after this list).

3. Use Heartbeat to monitor the Logstash service status and forward the results to an Elasticsearch index (the heartbeat index); see the monitor sketch after this list.

4. Set up a watcher to detect the service-down status from the Elasticsearch index. The watcher checks the heartbeat index every 5 minutes (customizable). If, for any of the Logstash servers, the count of "service down" documents is more than 10, it calls the failover script on the respective server through a webhook action (see the watcher sketch after this list).

5. Use a script to start/stop Logstash and switch to the appropriate pipeline configuration files (a sketch appears after the pre-requisites).

6. Use a webhook to take predefined actions based on the watcher alerts.
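
For step 2, a minimal sketch of the sync job is shown below. It assumes the JDBC inputs write their tracking files to /etc/logstash/jdbc_last_run (set via last_run_metadata_path) and that the standby node is reachable over SSH as logstash-standby; the paths, user and host name are placeholders to adapt to your environment.

    # crontab entry on the active Logstash server (runs every 5 minutes).
    # Pushes the sql_last_value tracking files to the standby node so that,
    # after a failover, the JDBC inputs resume from the same position.
    */5 * * * * rsync -az /etc/logstash/jdbc_last_run/ logstash@logstash-standby:/etc/logstash/jdbc_last_run/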

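For step 3, a minimal heartbeat.yml sketch is shown below, with one HTTP monitor per Logstash node checking the monitoring API on port 9600. The host names logstash-node1, logstash-node2 and elasticsearch are assumptions.

    # heartbeat.yml on the monitoring (Kibana) server
    heartbeat.monitors:
      - type: http
        id: logstash-node1
        name: logstash-node1
        hosts: ["http://logstash-node1:9600"]
        schedule: '@every 10s'
      - type: http
        id: logstash-node2
        name: logstash-node2
        hosts: ["http://logstash-node2:9600"]
        schedule: '@every 10s'

    output.elasticsearch:
      hosts: ["http://elasticsearch:9200"]

With a 10-second schedule, a sustained outage produces roughly 30 "down" documents in a 5-minute window, comfortably above the threshold of 10 used by the watcher in step 4.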

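For step 4, a sketch of the watcher is shown below in Kibana Dev Tools syntax, one watch per monitored node. It assumes the Heartbeat data lands in heartbeat-* indices and that the standby node runs the webhook listener on port 8080 at a hypothetical /failover path; the 5-minute interval and the threshold of 10 match the steps above.

    PUT _watcher/watch/logstash_node1_down
    {
      "trigger": { "schedule": { "interval": "5m" } },
      "input": {
        "search": {
          "request": {
            "indices": [ "heartbeat-*" ],
            "body": {
              "size": 0,
              "query": {
                "bool": {
                  "filter": [
                    { "term":  { "monitor.id": "logstash-node1" } },
                    { "term":  { "monitor.status": "down" } },
                    { "range": { "@timestamp": { "gte": "now-5m" } } }
                  ]
                }
              }
            }
          }
        }
      },
      "condition": {
        "compare": { "ctx.payload.hits.total": { "gt": 10 } }
      },
      "actions": {
        "call_failover": {
          "webhook": {
            "scheme": "http",
            "host": "logstash-node2",
            "port": 8080,
            "method": "post",
            "path": "/failover",
            "body": "{ \"failed_node\": \"logstash-node1\" }"
          }
        }
      }
    }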
Workflow – Service failure detection, failover & fallback


Pre-requisites:

  1. The Heartbeat component, installed and configured on a remote server (the Kibana server in this case), should have connectivity to the Logstash server on port 9600. Enable port 9600 in the Logstash configuration file (logstash.yml) and restart the Logstash service. The webhook service should be running on port 8080.
  2. The user account running the script should have execute permission on the relevant directories and scripts.

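For step 5, a minimal failover script sketch is shown below. The service name, directory paths and the idea of keeping copies of the peer node's pipeline files in a staging directory are assumptions to adapt to your setup.

    #!/usr/bin/env bash
    # failover.sh - invoked by the webhook listener (port 8080) when the watcher
    # reports the peer Logstash node as down. Switches in the extra pipeline
    # configuration files and restarts the local Logstash service.
    set -euo pipefail

    PIPELINE_DIR=/etc/logstash/conf.d
    FAILOVER_DIR=/etc/logstash/failover-pipelines   # copies of the failed node's pipeline .conf files

    systemctl stop logstash
    cp "${FAILOVER_DIR}"/*.conf "${PIPELINE_DIR}/"
    systemctl start logstash

The fallback script is the mirror image: remove the peer node's pipeline files and restart Logstash once the failed node is back. The webhook listener on port 8080 can be any lightweight HTTP service that maps the watcher's POST to this script.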

Team: Jyoti Kulkarni


