Closely Watched Kafka. Monitoring Apache Kafka in Business Process Management

Closely Watched Kafka. Monitoring Apache Kafka in Business Process Management


?? Wersja polska


"Somebody must have made a false accusation against Joseph K., for he was arrested one morning without having done anything wrong."?[1] ?This - one of the better-known opening lines of the novel - comes from Franz Kafka's "The Trial", radiating with absurdity and dystopia. While Kafka's sense of threat arises from vague danger and lack of clear rules - in another classic example, Orwell's "1984" - the dystopian nature of the world is achieved primarily through the existence of permanent and all-knowing surveillance of citizens.

As is often the case – what would be unbearable for people is necessary for smooth, efficient, and secure operation in systems built with code. In today's article, we will discuss how the team working on Flowee – our proprietary BPMN platform that enables easy creation, management, and execution of broadly defined business processes – uses one of the technical elements of this platform working 'in the background,' Apache Kafka. And also how to monitor Kafka ??

[2] If you want to try the monitoring methods, log in to GitHub. We have a repository with examples...If you want to try out monitoring methods, log in to GitHub. We have a?repository with examples...


Flowee is a platform designed to minimise the need to write code when deploying new processes for the client. It achieves this through advanced graphical editors of processes and forms. If you want to learn more about it, visit?THE WEBSITE.


1. What is Apache Kafka?

Apache Kafka Apache Kafka is a messaging system designed to handle large volumes of real-time data. This means that the primary goal of this platform is to enable the creation of systems for which the ability to scale quickly and asynchronously in perceived real-time by the user is crucial. Kafka was initially created by LinkedIn and is currently an open-source project managed by the Apache Software Foundation. It is widely used for building scalable data stream processing applications, log aggregation, telemetry systems, data integration – in architectures based on microservices and many more. As evidence of Kafka's reliability and scalability, in 2019, LinkedIn, which uses Kafka, reported that its infrastructure based on it handles 7 trillion messages per day! For those who get lost in the number of zeros - a trillion is otherwise a million million or a thousand billion. Impressive!

2. Apache Kafka we Flowee

Flowee is a distributed system consisting of many microservices. Each of them is responsible for a small, clearly defined fragment of the logic of the entire system. As in any such architecture, ensuring reliable and efficient communication is crucial. To achieve this goal, we chose Apache Kafka when creating Flowee.

3. Why is monitoring the Kafka cluster crucial?

Monitoring Apache Kafka in a production environment is crucial for many reasons, both in terms of ensuring the reliability and performance of the system and minimising downtime in case of problems. Among them, we can distinguish the following issues:

  • Performance and resource optimisation:

Monitoring helps understand how Kafka resources are utilised and where bottlenecks may occur. This allows for more efficient performance optimisation and resource management.

  • Detecting and resolving issues:

Early detection of problems, such as delays in processing, Kafka broker overload, or network issues, is crucial for maintaining the continuity of the system's operation. Rapid identification and resolution of problems minimises negative impact on business operations.?

  • Scalability:

Monitoring allows an understanding of usage patterns and system throughput, which is essential for scaling planning. This information helps make decisions about expanding or modifying infrastructure to handle increased load.

  • Security:

Monitoring can help detect unusual patterns of behavior that may indicate security issues. (np. ataki DDoS czy próby nieautoryzowanego dost?pu).?

  • Compliance with SLA (Service Level Agreement) requirements:

Many organisations must meet specific requirements regarding the availability and performance of their services. Monitoring helps ensure that Kafka services and applications using them meet these requirements.?

  • Data flow optimisation:

In production environments processing large volumes of data, it is essential for the data flow through Kafka to be as efficient as possible. Monitoring allows for identifying and eliminating bottlenecks in data flow.


Monitoring Kafka in a production environment is a matter of maintaining infrastructure efficiency and a strategic element of managing the performance, security, and operational costs of systems based on this technology.


4. How do we monitor Kafka in Flowee?

Monitoring in Flowee is realised using two tools: Prometheus and Grafana. The former collects current measurements from defined sources, such as individual microservices or Kafka brokers. Grafana, on the other hand, allows visualisation of Prometheus's measurements in the form of graphs.

Here is a repository with a ready-to-run docker-compose configuration demonstrating the operation of the configurations described in the article.

After cloning the repository to run a sample environment, use the command:, you should use the command:?

docker compose -f compose.yaml up -d        

After they are launched, two panels are available in the browser:?

Grafana:?https://localhost:3000/dashboards?

Kafka UI:?https://localhost:8080?(password admin:admin)?

4.1. Monitoring Kafka brokers

Individual components of the Kafka cluster, i.e., brokers, are monitored by installing the jmx_exporter library (https://github.com/prometheus/jmx_exporter) on them. This allows individual processes to expose current measurements of monitored parameters via an endpoint on port 8888. The measured values are then periodically retrieved by the Prometheus server.?

Sample parameters that can be monitored using jmx_exporter library include: the number of active brokers, the number of active partitions, under-replicated partitions, the number of messages per second, CPU load, the amount of data sent to and received from each broker, the amount of data stored on disk by individual brokers, statistics of message producers and consumers, statistics regarding internal data replication between Kafka brokers, and many more.?

The configuration of jmx_exporter in Kafka brokers involves running a Java process with a so-called Java agent (-javaagent switch):

KAFKA_OPTS: '-javaagent:/bitnami/kafka/libs/jmx_prometheus_javaagent-0.20.0.jar=8888:/bitnami/kafka/libs/kafka_broker.yml'        

While the Prometheus configuration reading values provided by jmx_exporter looks like this:?

- job_name: "kafka-broker"
        static_configs:
          - targets:
            - kafka-1:8888
            - kafka-2:8888
            - kafka-3:8888
        labels:
            env: "dev"
         relabel_configs:
            - source_labels: [__address__]
            target_label: hostname
            regex: '([^:]+)(:[0-9]+)?'
            replacement: '${1}'        


4.2. Monitoring message consumption delays by consumers

One of the basic parameters to monitor in systems built using Kafka is the delay in message consumption.? High latency indicates that consumers are not keeping up with processing messages produced to topics in Kafka. This could be a sign of performance issues in consuming applications and requires corrective actions – such as increasing the number of microservice instances (scale-out) or optimization of the implementation of components responsible for data processing.? To collect statistics regarding consumption delays, we use the kafka-lag-exporter application:?https://github.com/seglo/kafka-lag-exporter It is defined in the compose.yaml file: compose.yaml:?

kafka-lag-exporter:
        image: seglo/kafka-lag-exporter:0.8.2
        volumes:
            - ${PWD}/kafka-lag-exporter:/opt/docker/conf        

While in the kafka-lag-exporter/application.conf directory, there is a configuration for this application:?

kafka-lag-exporter {
  port = 9999
  client-group-id = "kafkaLagExporter"
  lookup-table-size = 120
  poll-interval = 60

  clusters = [
      {
      name = "dev-cluster"
      bootstrap-brokers = "kafka-1:29092,kafka-2:29092,kafka-3:29092"

      admin-client-properties = {
          client.id = "admin-client-id"
          security.protocol = "PLAINTEXT"
      }

      consumer-properties = {
          client.id = "consumer-client-id"
          security.protocol = "PLAINTEXT"
      }
  }
 ]
}        

The Grafana panel for kafka-lag-exporter is in the Kafka Lag Exporter.yaml?file. It is automatically loaded at the start of the sample environment and is available at: https://localhost:3000/dashboards?


4.3. Kafka UI

The last demonstrated tool is Kafka UI:?https://github.com/provectus/kafka-ui. It allows for checking the cluster's state, configuration and browsing and searching for messages stored in Kafka brokers. This is essential during development and incident analysis in the production environment.

In our demonstration environment, Kafka UI is available at:?https://localhost:8080?(login and password admin:admin)?

And that's it - you now know how to monitor Kafka closely, like Big Brother! ????


  • If you would like to try Flowee, where we use this technology to manage business processes, get in touch with us.

  • Or perhaps you are looking for experts in distributed architectures who can write code as efficiently as Franz Kafka wrote stories, letters, and novels. Contact us!.?



[1] Franz Kafka:?The Trial, Penguin Random House (German title:?Der Prozess, 1925). English translation:??Willa?and?Edwin Muir / Breon Mitchell /?David Wyllie /?Mike Mitchell /?Idris Parry /?Susanne Lück and Maureen Fitzgibbons.

[2] The name Apache Kafka truly comes from Franz Kafka. Jay Kreps liked the writer's work and believed that, since the system is optimised for writing, it should be named after someone who writes in natural language.


Author: Grzegorz Kubiak

Editing: Agata Krajewska , Weronika Dyl?g


要查看或添加评论,请登录

Finture的更多文章

社区洞察

其他会员也浏览了