In recent years, most of us have used Kafka for use cases such as message brokering, activity tracking, and event sourcing. Have you ever been curious about why so many companies, even very large ones, choose Kafka?
This series explores the motivations behind the development of Apache Kafka and how its design meets them. In this first article, we focus on Kafka's approach to data storage, caching, and transfer.
According to Apache, Kafka is designed as a unified platform for handling all the real-time data feeds a large company might have. This goal implies the following requirements:
- High throughput to support high-volume event streams such as real-time log aggregation.
- Graceful handling of large data backlogs to support periodic data loads from offline systems.
- Low-latency delivery to handle more traditional messaging use cases.
- Fault-tolerance guarantees in the presence of machine failures.
Kafka stores messages in append-only logs whose contents are immutable. This approach improves throughput and deals gracefully with large data backlogs.
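To make the append-only idea concrete, here is a minimal sketch in Java (not Kafka's actual segment format): records are length-prefixed and only ever written at the current end of the file, so writes stay strictly sequential and earlier bytes are never modified. The class name, file name, and record layout are illustrative assumptions.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Illustrative append-only log: every record is written at the current end of the
// file, so all disk access is sequential and previously written bytes never change.
public class AppendOnlyLog {

    private final FileChannel channel;

    public AppendOnlyLog(Path file) throws IOException {
        this.channel = FileChannel.open(file, StandardOpenOption.CREATE, StandardOpenOption.WRITE);
        channel.position(channel.size()); // start writing at the current end of the log
    }

    // Appending is O(1): no seeks and no tree rebalancing, just a write at the tail.
    // The returned byte offset can serve as the record's position in the log.
    public long append(byte[] payload) throws IOException {
        long offset = channel.position();
        ByteBuffer record = ByteBuffer.allocate(4 + payload.length);
        record.putInt(payload.length); // length prefix
        record.put(payload);           // immutable record body
        record.flip();
        while (record.hasRemaining()) {
            channel.write(record);
        }
        return offset;
    }

    public void close() throws IOException {
        channel.close();
    }

    public static void main(String[] args) throws IOException {
        AppendOnlyLog log = new AppendOnlyLog(Path.of("demo.log"));
        long offset = log.append("hello, log".getBytes(StandardCharsets.UTF_8));
        System.out.println("record appended at offset " + offset);
        log.close();
    }
}
```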
- Sequential Disk Access: Modern disks are extremely efficient at sequential reads and writes (hundreds of MB/s). Kafka uses a log-structured approach instead of random seeks. For example, a 7200rpm SATA RAID-5 array achieves about 600 MB/sec for sequential writes but only about 100 KB/sec for random writes, a difference of more than 6000x.
- Pagecache and Memory: Kafka writes data to a persistent log immediately and relies on the operating system's page cache rather than maintaining large in-memory caches. This ensures that data persists (even after a restart), avoids slow rebuilds of an in-process cache, and leaves cache coherency to the OS.
- Message Retention: Kafka’s persistent log design allows it to retain messages for extended periods (e.g., a week) instead of deleting them as soon as they are consumed (a topic configuration sketch follows this list).
- Simple Data Structures: Kafka’s append-only log, where appends are O(1), avoids the seek-heavy access patterns of more complex structures such as B-trees, so performance stays stable even as data volumes grow.
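As a concrete illustration of retention, the hedged sketch below uses Kafka's AdminClient to create a topic whose data is kept for roughly one week via the retention.ms topic config. The topic name, partition count, replication factor, and broker address are assumptions for the example, not recommendations.

```java
import java.util.Map;
import java.util.Properties;
import java.util.Set;
import java.util.concurrent.ExecutionException;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

// Sketch: create a topic whose messages are retained for about one week,
// regardless of whether consumers have already read them.
public class RetentionExample {
    public static void main(String[] args) throws ExecutionException, InterruptedException {
        Properties props = new Properties();
        // "localhost:9092" is an assumed broker address for this sketch.
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // "events", 3 partitions, and replication factor 1 are illustrative values.
            NewTopic topic = new NewTopic("events", 3, (short) 1)
                    // retention.ms: keep log segments for 7 days (in milliseconds)
                    .configs(Map.of("retention.ms", String.valueOf(7L * 24 * 60 * 60 * 1000)));
            admin.createTopics(Set.of(topic)).all().get();
        }
    }
}
```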
Kafka uses several techniques to speed up data transfer, such as batching, compression, a common binary message format, and, notably, zero-copy. Together these help Kafka achieve high throughput and low-latency delivery.
- Batching I/O: Grouping messages together into sets reduces the overhead of numerous small I/O operations and network roundtrips. It leads to larger network packets, larger sequential disk operations, and contiguous memory blocks (a producer configuration sketch follows this list).
- Minimizing Byte Copies: A standardized binary message format shared by producers, brokers, and consumers avoids unnecessary copying. Data chunks can be transferred without modification.
- Zero-Copy Optimization: Kafka transfers data directly from the OS page cache to a network socket, avoiding the extra copies through user-space and socket buffers, and the extra system calls, that a conventional read-then-send path would require (see the transfer sketch after this list).
- End-to-End Batch Compression: Kafka optimizes network bandwidth by supporting batch compression. Instead of compressing individual messages, Kafka compresses batches of messages together, improving compression ratios and reducing network load.
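The sketch below shows how producer-side batching and end-to-end batch compression are typically configured, using the standard batch.size, linger.ms, and compression.type producer settings. The topic name, broker address, and the specific tuning values are assumptions for illustration only.

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

// Sketch: a producer tuned for batching and end-to-end batch compression.
public class BatchingProducerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        // Batching: accumulate up to 64 KB per partition, or wait up to 20 ms,
        // before sending, so many small records travel in one request.
        props.put(ProducerConfig.BATCH_SIZE_CONFIG, 64 * 1024);
        props.put(ProducerConfig.LINGER_MS_CONFIG, 20);

        // Compression is applied to the whole batch, not to individual records,
        // which usually gives much better ratios on similar messages.
        props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            for (int i = 0; i < 1_000; i++) {
                producer.send(new ProducerRecord<>("events", "key-" + i, "value-" + i));
            }
            producer.flush();
        }
    }
}
```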
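And here is a minimal sketch of the zero-copy idea itself: on the JVM it is exposed through FileChannel.transferTo(), which on Linux maps to sendfile() and lets the kernel move bytes from the page cache straight to a socket. The file name and destination address are placeholders; this illustrates the mechanism, not Kafka's internal code.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.FileChannel;
import java.nio.channels.SocketChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Sketch: zero-copy style transfer of a log segment to a socket using
// FileChannel.transferTo(). File and destination are illustrative assumptions.
public class ZeroCopyTransferExample {
    public static void main(String[] args) throws IOException {
        Path segment = Path.of("demo.log");
        try (FileChannel file = FileChannel.open(segment, StandardOpenOption.READ);
             SocketChannel socket = SocketChannel.open(new InetSocketAddress("localhost", 9999))) {

            long position = 0;
            long remaining = file.size();
            // transferTo() asks the kernel to move bytes from the page cache to the
            // socket directly, without copying them through a user-space buffer.
            while (remaining > 0) {
                long sent = file.transferTo(position, remaining, socket);
                position += sent;
                remaining -= sent;
            }
        }
    }
}
```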
Kafka's design prioritizes high throughput, persistence, low latency, and fault tolerance. Its log-centric architecture and sequential disk access let it make the most of modern disks' read, write, and caching behavior. Combined with data transfer techniques like zero-copy and batch compression, Kafka can handle enormous streams of events efficiently and reliably.
In the next article, we will explore Kafka's solutions for message brokers, distribution, replication, and fault tolerance.