Centralize your logs with ELK Stack
Centralizing logs with the ELK Stack is a powerful way to streamline and optimize the management of log data within an organization. The ELK Stack, comprised of Elasticsearch, Logstash, and Kibana, offers a comprehensive solution for collecting, parsing, storing, and visualizing logs from various sources in a centralized location. This setup not only enhances visibility into system behaviors and application performance but also facilitates troubleshooting, debugging, and proactive monitoring. Centralizing logs with the ELK Stack empowers businesses to gain valuable insights, detect anomalies, and make data-driven decisions more effectively.
In this article, we will set up an ELK stack to gather Nginx web service logs from a source node and visualize them.
The most efficient way to deploy the stack is to run each service on its own node, but in this article we will use a single node for the entire stack with the following specs:
RAM: 8GB
CPU: 8 vCPU (1 core per socket)
Storage: 200GB
OS: Ubuntu 22.04 LTS
So, let's configure the stack components one by one.
Elasticsearch:
curl -fsSL https://artifacts.elastic.co/GPG-KEY-elasticsearch |sudo gpg --dearmor -o /usr/share/keyrings/elastic.gpg
echo "deb [signed-by=/usr/share/keyrings/elastic.gpg] https://artifacts.elastic.co/packages/7.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-7.x.list
sudo apt update && sudo apt install elasticsearch
sudo systemctl start elasticsearch
curl -XGET "localhost:9200"
If you can reach the Elasticsearch endpoint, you should see output similar to the following:
Output
{
  "name" : "ELK-STACK",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "TOkdWnIlTNCEMzCzxYZcog",
  "version" : {
    "number" : "7.17.15",
    "build_flavor" : "default",
    "build_type" : "deb",
    "build_hash" : "0b8ecfb4378335f4689c4223d1f1115f16bef3ba",
    "build_date" : "2023-11-10T22:03:46.987399016Z",
    "build_snapshot" : false,
    "lucene_version" : "8.11.1",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}
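Optionally, enable the service at boot so it survives reboots:
sudo systemctl enable elasticsearch
For this single-node setup you may also want to confirm that Elasticsearch stays bound to localhost. Below is a minimal sketch of the relevant settings in /etc/elasticsearch/elasticsearch.yml (assumed values; the packaged defaults may already be fine for you, and a restart is needed after any change):
# /etc/elasticsearch/elasticsearch.yml
network.host: localhost
discovery.type: single-node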
Kibana:
sudo apt update && sudo apt install kibana
sudo systemctl start kibana && sudo systemctl status kibana
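The packaged defaults in /etc/kibana/kibana.yml already bind Kibana to port 5601 on localhost, which is what the reverse proxy below expects. A minimal sketch of the relevant keys with their default values, in case you need to adjust them:
# /etc/kibana/kibana.yml
server.port: 5601
server.host: "localhost"
You can also enable the service at boot with sudo systemctl enable kibana.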
Since Kibana is not exposed directly, we will put Nginx in front of it as a reverse proxy with basic authentication. Install Nginx if it is not already present (sudo apt install nginx), then create a server block for your domain:
sudo nano /etc/nginx/conf.d/your_domain.conf
server {
    listen 80;
    server_name your_domain;

    auth_basic "Restricted Access";
    auth_basic_user_file /etc/nginx/htpasswd.users;

    location / {
        proxy_pass http://localhost:5601;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_cache_bypass $http_upgrade;
    }
}
echo "kibanaadmin:`openssl passwd -apr1`" | sudo tee -a /etc/nginx/htpasswd.users
sudo nginx -t
sudo systemctl restart nginx
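A quick way to confirm that the proxy and the basic auth are in place (an illustrative check; your_domain is the same placeholder used above):
curl -I http://your_domain
curl -I -u kibanaadmin http://your_domain
The first request should come back with 401 Unauthorized; the second prompts for the password you set and should be proxied through to Kibana.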
To establish the connection with Elasticsearch, Kibana will prompt you with an enrollment token popup. You can generate a token with the elasticsearch-create-enrollment-token tool. Run the command below and paste the token into the popup box.
/usr/share/elasticsearch/bin/elasticsearch-create-enrollment-token -s kibana
After this step, Kibana will ask for a verification code to complete the secure configuration. Run the command below to get it:
/usr/share/kibana/bin/kibana-verification-code
If all steps complete successfully, Kibana will redirect you to its home page.
Logstash:
sudo apt update && sudo apt install logstash
After installing Logstash, you can move on to configuring it. Logstash’s configuration files reside in the /etc/logstash/conf.d directory. For more information on the configuration syntax, you can check out the configuration reference that Elastic provides. As you configure the file, it’s helpful to think of Logstash as a pipeline which takes in data at one end, processes it in one way or another, and sends it out to its destination (in this case, the destination being Elasticsearch). A Logstash pipeline has two required elements, input and output, and one optional element, filter. The input plugins consume data from a source, the filter plugins process the data, and the output plugins write the data to a destination.
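As a minimal illustration of that structure (just a sketch, not the pipeline we will build for Nginx), a config that reads events from stdin and prints them back to the console looks like this:
input {
  stdin { }
}
filter {
  # optional processing would go here
}
output {
  stdout { codec => rubydebug }
}
The stdin and stdout plugins and the rubydebug codec ship with Logstash, so this is a convenient way to experiment with filters before wiring in Beats and Elasticsearch.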
Before configuring Logstash, let's talk about Filebeat, which we will use in this article.
What is Filebeat and what is it used for?
Filebeat is a lightweight data shipping tool offered by Elastic, the same company behind Elasticsearch, Kibana, and Logstash (ELK Stack). Its primary purpose is to collect, parse, and forward log data or other structured data from various sources to Elasticsearch or Logstash for further processing, indexing, and visualization.
Here's a brief breakdown of Filebeat's functionality: it tails the log files you point it at and reads new lines as they appear, keeps a registry of how far each file has been read so data is not lost or duplicated across restarts, slows down shipping when Logstash or Elasticsearch cannot keep up, and forwards each event to the configured output for parsing and indexing.
In summary, Filebeat plays a crucial role in the ELK Stack ecosystem by facilitating the efficient and reliable movement of log data from multiple sources to the centralized Elasticsearch or Logstash for indexing and analysis, contributing to effective log management and monitoring.
So, let's install Filebeat and configure it to ship the logs we define.
sudo apt update && sudo apt install filebeat
The package should be installed and configured on each source node separately. The service configuration is located in '/etc/filebeat/filebeat.yml'.
First of all, comment out the Elasticsearch output section and uncomment the Logstash section as shown below. With this change, instead of sending logs directly to Elasticsearch, Filebeat will send them to the Logstash service, and Logstash will filter them according to the configuration we will create in the next steps.
#output.elasticsearch:
  # Array of hosts to connect to.
  #hosts: ["localhost:9200"]

  # Protocol - either `http` (default) or `https`.
  #protocol: "https"

  # Authentication credentials - either API key or username/password.
  #api_key: "id:api_key"
  #username: "elastic"
  #password: "changeme"

output.logstash:
  # The Logstash hosts
  hosts: ["ip_of_logstash:5044"]

  # Optional SSL. By default is off.
  # List of root certificates for HTTPS server verifications
  #ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]

  # Certificate for SSL client authentication
  #ssl.certificate: "/etc/pki/client/cert.pem"

  # Client Certificate Key
  #ssl.key: "/etc/pki/client/cert.key"
Now add the configuration below under the filebeat.inputs block to define which log files Filebeat ships to the Logstash service:
- type: log
  enabled: true
  paths:
    - /var/log/nginx/access.log
  fields:
    log_type: nginx-access
The log_type field will help us handle the log in the Logstash configuration.
After making these changes, save the file and start the Filebeat service on the node:
sudo systemctl start filebeat
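Optionally, Filebeat can check its own configuration and its connection to the configured output before you rely on it (these subcommands ship with Filebeat):
sudo filebeat test config
sudo filebeat test output
The output test will only succeed once Logstash is listening on port 5044, which we set up next.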
Now create a configuration file under the /etc/logstash/conf.d directory. In this article we will create a single config for each node that sends log data, so naming the file after the node's hostname is good practice. Be aware that every filename must end with the .conf extension so Logstash can read it.
Example config content:
input {
  beats {
    port => 5044
  }
}

filter {
  # Add any specific filters for processing if needed
  if [fields][log_type] == 'nginx-access' {
    grok {
      match => {
        "message" => "%{IPORHOST:client_ip} - %{DATA:user_ident} \[%{HTTPDATE:timestamp}\] \"%{WORD:http_method} %{DATA:request} HTTP/%{NUMBER:http_version}\" %{NUMBER:response_code:int} %{NUMBER:body_sent_bytes:int}"
      }
    }
    date {
      match => ["timestamp", "dd/MMM/yyyy:HH:mm:ss Z"]
      target => "@timestamp"
    }
    mutate {
      add_field => { "source_type" => "%{[fields][log_type]}" }
      rename => {
        "response_code" => "http_response_code"
        "client_ip" => "user_client_ip"
        "body_sent_bytes" => "bytes_sent"
      }
      convert => {
        "http_response_code" => "integer"
        "bytes_sent" => "integer"
      }
    }
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "hostname-%{+YYYY.MM.dd}"
  }
}
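Optionally, you can have Logstash validate the pipeline syntax before loading it; a hedged example assuming the default package paths:
sudo -u logstash /usr/share/logstash/bin/logstash --path.settings /etc/logstash --config.test_and_exit
If the check passes, the pipeline is safe to load.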
In this configuration, the input section determines which port the logs are collected on (based on the Filebeat configuration, the port is 5044), the filter block parses the Nginx access log into a form that is more readable and better suited for visualization, and the output section sends the filtered logs to the Elasticsearch service.
The index value in the output block tells Kibana which source the logs came from; it will differ for each node that sends logs. After saving the configuration, restart the services below.
sudo systemctl restart logstash elasticsearch
After the services restart successfully, check the status of the index we set in the configuration with the curl command below on the Elasticsearch node.
curl -XGET "localhost:9200/_cat/indices?v"
In the output, you should see the index name. If you can see it, you have successfully completed all of the steps.
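You can also pull back a sample document to confirm that the parsed fields arrived as expected (replace hostname-* with your own index pattern):
curl -XGET "localhost:9200/hostname-*/_search?pretty&size=1"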
So, let's visualize the logs with Kibana.
Access the dashboard in a browser. Go to the hamburger menu on the left side->Management->Stack Management.
In Stack Management, go to the Index Patterns page in the left sidebar. Create an index pattern, give it a name, and select '@timestamp' in the timestamp field dropdown. The index pattern name should match the index name from the Logstash configuration; since Logstash appends the date to the index name, use a wildcard to cover all dates, such as 'hostname-*'. You should see the matching index in the source list on the right side.
Once you have created the index pattern, go to the hamburger menu->Analytics->Discover. Select the index pattern in the selector at the top left, and all the logs will appear in the Discover view.
The left panel lists all the fields contained in the log data, and the chart at the top shows the number of logs over time. If you can see logs in this view, the data is ready to visualize. Go to the hamburger menu->Dashboard->Create dashboard->Create visualization and select the index pattern at the top left. With the help of the official tutorial video, you can visualize the data however you like.
Example visualized dashboard:
Centralizing logs with the ELK Stack presents a transformative approach to log management. By leveraging Elasticsearch, Logstash, and Kibana in unison, organizations can streamline the collection, parsing, storage, and visualization of logs from diverse sources. This centralized approach not only enhances visibility into system behaviors and application performance but also empowers proactive monitoring, troubleshooting, and data-driven decision-making. ELK Stack’s unified platform offers scalability, flexibility, and the ability to glean actionable insights from vast amounts of log data, making it an indispensable tool for modern businesses striving for enhanced operational efficiency and robust data analysis.