Topic, Partition, Offset and Broker in Apache Kafka

Topic, Partition and Offset

  • A topic is a named place to store messages. Each topic is divided into partitions, and each message within a partition is identified by a unique, sequential offset.
  • When a producer sends a new message to a topic, Apache Kafka appends it to the end of a partition and assigns it the next offset in order.
  • A consumer that subscribes to a topic pulls messages from each partition in the same order in which they are stored.
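The append-and-read behaviour described above can be sketched with a small in-memory model. This is a toy illustration only, not Kafka's client API: the `Partition` and `Consumer` classes and their methods are hypothetical names made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    """Toy append-only log; a record's offset is its position in the log."""
    records: list = field(default_factory=list)

    def append(self, message: str) -> int:
        # New records always go to the end and receive the next offset.
        self.records.append(message)
        return len(self.records) - 1

    def read(self, offset: int, max_records: int = 10) -> list:
        # Return (offset, message) pairs starting at the requested offset.
        end = min(offset + max_records, len(self.records))
        return [(i, self.records[i]) for i in range(offset, end)]


@dataclass
class Consumer:
    """Toy consumer that remembers the next offset it wants to read."""
    partition: Partition
    position: int = 0

    def poll(self, max_records: int = 10) -> list:
        # Pull the next batch in stored order and advance the position.
        batch = self.partition.read(self.position, max_records)
        self.position += len(batch)
        return batch


log = Partition()
offsets = [log.append(m) for m in ["a", "b", "c"]]
consumer = Consumer(log)
first = consumer.poll(max_records=2)
second = consumer.poll(max_records=2)
print(offsets, first, second)
```

The consumer pulls data at its own pace and only tracks a position, which is why several consumers could read the same partition independently in this model.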

  • By default, Kafka retains messages for 7 days before deleting them. Deletion removes the oldest messages first, so the order of the remaining messages is unchanged.
  • A topic can have multiple partitions, which lets Kafka scale horizontally by distributing data across them. Producers and consumers from different applications can work with these partitions at the same time.
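As a rough sketch of the two points above, the toy model below spreads records across several partitions and then applies a time-based retention window. Everything here is an assumption for illustration: the `Topic` class is hypothetical, CRC32 stands in for whatever hash a real partitioner uses, and records are expired one by one, which is a simplification of how a real broker cleans up its log.

```python
import zlib


class Topic:
    """Toy topic: each partition is a list of (offset, timestamp, key, value)."""

    def __init__(self, num_partitions: int):
        self.partitions = [[] for _ in range(num_partitions)]
        self.next_offset = [0] * num_partitions

    def send(self, key: str, value: str, timestamp: float) -> tuple:
        # Stand-in partitioner (assumption): CRC32 of the key modulo the
        # partition count, so records with the same key share a partition.
        p = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        offset = self.next_offset[p]
        self.partitions[p].append((offset, timestamp, key, value))
        self.next_offset[p] += 1
        return p, offset

    def expire(self, now: float, retention_seconds: float) -> None:
        # Drop records older than the retention window. Survivors keep
        # their original offsets and their relative order.
        for p, log in enumerate(self.partitions):
            self.partitions[p] = [
                r for r in log if now - r[1] <= retention_seconds
            ]


SEVEN_DAYS = 7 * 24 * 60 * 60  # retention window in seconds

topic = Topic(num_partitions=3)
p1, o1 = topic.send("user-1", "login", timestamp=0)
p2, o2 = topic.send("user-1", "click", timestamp=SEVEN_DAYS)
p3, o3 = topic.send("user-1", "logout", timestamp=SEVEN_DAYS + 60)

topic.expire(now=SEVEN_DAYS + 120, retention_seconds=SEVEN_DAYS)
remaining = [(off, val) for off, _, _, val in topic.partitions[p1]]
print(remaining)
```

In this model the expired record's offset is never reused: a later `send` to the same partition continues from the next offset rather than filling the gap.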

Kafka broker

  • Messages are stored at offsets within partitions, which are in turn stored within topics.
  • These topics, along with their partitions and messages, are stored on disk and managed by servers. In Kafka, these servers are known as brokers and form a Kafka cluster.
  • Deploying everything on a single server creates a single point of failure. If that server fails, the entire system goes down.
  • That's why it's common practice to run Kafka as multiple brokers on separate servers, creating a replicated cluster that provides reliability and fault tolerance.
  • When replication is enabled, Kafka designates one copy of each partition, the "active copy", as the leader, and the other copies as "followers". When a producer writes new data to a topic, it is written to the leader first and then replicated to the followers.
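The leader and follower roles can be illustrated with a minimal simulation. This is a sketch under stated assumptions, not how a real cluster is implemented: the `ReplicatedPartition` class is hypothetical, the leader pushes copies to followers synchronously, and the new leader after a failure is simply the live broker with the lowest id.

```python
class ReplicatedPartition:
    """Toy partition with one leader replica and several followers."""

    def __init__(self, broker_ids: list):
        # One copy of the log per broker; the first broker starts as leader.
        self.logs = {b: [] for b in broker_ids}
        self.alive = set(broker_ids)
        self.leader = broker_ids[0]

    def produce(self, message: str) -> int:
        # Writes go to the leader first ...
        leader_log = self.logs[self.leader]
        leader_log.append(message)
        # ... and are then copied to every live follower.
        for broker in self.alive - {self.leader}:
            self.logs[broker].append(message)
        return len(leader_log) - 1

    def fail(self, broker_id: int) -> None:
        # Mark a broker as down; if it was the leader, promote a follower.
        self.alive.discard(broker_id)
        if broker_id == self.leader:
            if not self.alive:
                raise RuntimeError("no live replica left")
            self.leader = min(self.alive)  # assumption: lowest id wins


replicated = ReplicatedPartition([1, 2, 3])
replicated.produce("order-1")
replicated.fail(1)                       # the leader's broker goes down
offset = replicated.produce("order-2")   # writes continue on the new leader
print(replicated.leader, offset, replicated.logs)
```

Because the followers already hold a copy of the data, the partition stays writable after the original leader fails, which is the fault tolerance the bullets above describe.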

Summary

  • Apache Kafka utilizes topics with partitions to organize messages, each identified by a unique offset.
  • Producers add messages to topics, while consumers retrieve them in sequence from partitions.
  • By default, messages are retained for 7 days and then deleted.
  • Kafka's distributed architecture with multiple brokers provides fault tolerance: partitions are replicated, with one copy acting as the leader and the others as followers.
