What is Apache Kafka?

Omar Ismail

Senior Software Engineer @ Digitinary | Java 8 Certified | Spring & Spring Boot | AWS | Microservices | RESTful APIs & Integrations | FinTech | Open Banking | Digital Payments and Transformation

Published: May 3, 2023

Reference: https://www.qlik.com/us/streaming-data/apache-kafka

What is Apache Kafka?

Apache Kafka is an open-source distributed event streaming platform which is optimized for ingesting and transforming real-time streaming data. By combining messaging, storage, and stream processing, it allows you to store and analyze historical and real-time data.

Why Kafka Matters

The platform is typically used to build real-time streaming data pipelines that support streaming analytics and mission-critical use cases with guaranteed ordering within a partition, no message loss, and exactly-once processing.

Apache Kafka is massively scalable because it allows data to be distributed across multiple servers, and it’s extremely fast because it decouples data streams, which results in low latency. It can also distribute and replicate partitions across many servers, which protects against server failure.

How Apache Kafka Works

At a high level, Apache Kafka allows you to publish and subscribe to streams of records, store these streams in the order they were created, and process these streams in real time.
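
The publish, store-in-order, and read-back cycle described above can be pictured with a toy in-memory log. This is only a sketch under simplifying assumptions: `InMemoryLog`, `publish`, and `read_from` are hypothetical names invented for illustration, not part of any Kafka client, and nothing here is distributed or persisted.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryLog:
    """Toy append-only log: a stand-in for one stream of records, not Kafka itself."""
    records: list = field(default_factory=list)

    def publish(self, value):
        """Append a record and return its offset (its position in the log)."""
        self.records.append(value)
        return len(self.records) - 1

    def read_from(self, offset):
        """Return every record at or after `offset`, in the order it was written."""
        return self.records[offset:]


log = InMemoryLog()
for event in ["order-created", "order-paid", "order-shipped"]:
    log.publish(event)

# Two independent subscribers read the same stored stream from different positions.
analytics_view = log.read_from(0)
late_joiner_view = log.read_from(2)
```

Because the records stay in the log after they are read, a subscriber that starts late can still replay history from any offset it chooses.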

Now let’s dig a bit deeper.

Running on a horizontally scalable cluster of commodity servers, Apache Kafka ingests real-time data from multiple "producer" systems and applications (such as logging systems, monitoring systems, sensors, and IoT applications) and makes that data available at very low latency to multiple "consumer" systems and applications, such as real-time analytics.

Consumers can range from analytics platforms to applications that rely on real-time data processing. Examples include logistics or location-based micromarketing applications.

Here we define the key terms:

Producer: A client application that pushes events into topics.

Cluster: One or more servers (called brokers) running Apache Kafka.

Topic: The method used to categorize and durably store events. There are two types of topics: compacted and regular. Records in compacted topics do not expire based on time or space bounds. Instead, a newer message replaces older messages that have the same key, and Apache Kafka keeps the latest message for each key unless the user deletes it. For regular topics, records can be configured to expire, deleting old data to free storage space.

Partition: The mechanism used to distribute data across multiple storage servers (brokers). Messages are indexed and stored together with a timestamp, and are ordered by their position within the partition. Partitions are distributed across the nodes of a cluster and replicated to multiple servers so that Apache Kafka delivers message streams in a fault-tolerant manner.

Consumer: A client application that reads and processes the events from partitions. The Apache Kafka Streams API allows you to write Java applications that pull data from topics and write results back to Apache Kafka. External stream processing systems such as Apache Spark, Apache Apex, Apache Flink, Apache NiFi and Apache Storm can also be applied to these message streams.
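
To make the compacted-topic behavior in the Topic definition more concrete, here is a minimal sketch of the idea: keep only the newest record for each key. The `compact` function and the sample data are hypothetical, and treating a `None` value as a user-issued delete marker is an assumption of this sketch rather than a description of how the broker implements compaction.

```python
def compact(records):
    """Keep only the latest value per key, preserving the order of the surviving records.

    `records` is a list of (key, value) tuples in write order. A value of None
    is treated here as a user-issued delete marker (an assumption of this sketch).
    """
    latest_index = {}
    for index, (key, _value) in enumerate(records):
        latest_index[key] = index  # later writes overwrite earlier ones
    return [
        (key, value)
        for index, (key, value) in enumerate(records)
        if latest_index[key] == index and value is not None
    ]


topic = [
    ("user-1", "alice@old.example"),
    ("user-2", "bob@example.com"),
    ("user-1", "alice@new.example"),
    ("user-3", "carol@example.com"),
    ("user-3", None),  # delete marker for user-3
]
compacted = compact(topic)
```

After compaction, `user-1` keeps only its newest address, `user-2` is untouched, and `user-3` disappears because its latest record is a delete marker.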

Apache Kafka Benefits

Apache Kafka’s message broker system can sequentially and incrementally process a massive inflow of continuous data streams that are simultaneously produced by thousands of data sources with high throughput and durability. The data integration benefits are:

Scalability

By dividing a topic into multiple partitions, Apache Kafka provides load balancing over a pool of servers. This allows you to scale production clusters up or down to fit your needs and to spread clusters across geographic regions or availability zones.
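
One way to picture how a topic's records spread across partitions is to hash each record key to a partition index. The sketch below is illustrative only: `choose_partition` and `distribute` are hypothetical helpers, and MD5 is used purely as a stable stand-in hash, not as the hash function Kafka's own clients use.

```python
import hashlib
from collections import defaultdict


def choose_partition(key, num_partitions):
    """Map a record key to a partition index with a stable hash.

    MD5 is only a stand-in here; it is chosen because it gives the same
    result on every run, unlike Python's built-in hash() for strings.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions


def distribute(records, num_partitions):
    """Group (key, value) records into per-partition lists, keeping write order."""
    partitions = defaultdict(list)
    for key, value in records:
        partitions[choose_partition(key, num_partitions)].append((key, value))
    return dict(partitions)


events = [("sensor-a", 1), ("sensor-b", 7), ("sensor-a", 2), ("sensor-c", 4), ("sensor-a", 3)]
layout = distribute(events, num_partitions=3)
```

Every record with the same key lands in the same partition, so the readings for `sensor-a` stay in write order even though the topic as a whole is spread over several partitions.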

Speed

By decoupling data streams, Apache Kafka is able to deliver messages at network-limited throughput using a cluster of servers with extremely low latency (as low as 2 ms).

Durability

Apache Kafka makes the data highly fault-tolerant and durable in two main ways. First, it protects against server failure by distributing and replicating data streams across the servers of a fault-tolerant cluster. Second, it persists the messages to disk.
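
The replication idea can be sketched with a few toy brokers that each hold a copy of the same partition. This is a deliberately simplified model: `Broker`, `replicated_write`, and `read_after_failure` are hypothetical names, and the sketch leaves out leader election, acknowledgements, and disk persistence.

```python
class Broker:
    """Toy broker holding a copy of one partition's records."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self.records = []


def replicated_write(replicas, record):
    """Append `record` to every live replica and return how many copies were written."""
    written = 0
    for broker in replicas:
        if broker.alive:
            broker.records.append(record)
            written += 1
    return written


def read_after_failure(replicas):
    """Read the partition from the first live replica, or raise if none is left."""
    for broker in replicas:
        if broker.alive:
            return list(broker.records)
    raise RuntimeError("no live replica holds this partition")


replicas = [Broker("broker-1"), Broker("broker-2"), Broker("broker-3")]
for record in ["e1", "e2", "e3"]:
    replicated_write(replicas, record)

replicas[0].alive = False  # simulate a server failure
recovered = read_after_failure(replicas)
```

Because each record was copied to all three brokers, losing `broker-1` does not lose any data: the surviving replicas still hold the full sequence.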

Apache Kafka Challenges

While Apache Kafka can be a powerful addition to enterprise data management infrastructures, it also poses some challenges. Two key challenges are as follows:

1) The need for IT teams to work with yet another set of APIs. There are five main types of APIs:

Admin APIs allow for managing brokers, topics, and other objects.
Producer APIs allow applications to publish streams of records to a topic.
Consumer APIs allow applications to subscribe to topics and to process their streams of records.
Connector APIs automate the addition of applications or data systems to existing topics.
Streams APIs allow apps to act as stream processors, transforming input streams and producing the results to one or more output topics.
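
The stream-processor role described for the Streams APIs (read an input stream, transform it, write to an output topic) can be illustrated without any Kafka dependency. In this sketch plain Python lists stand in for topics, and `run_stream_processor` and `split_words` are hypothetical helpers, not functions from the Kafka Streams library.

```python
def run_stream_processor(input_topic, transform, output_topic):
    """Read each record from `input_topic`, apply `transform`, append results to `output_topic`.

    `transform` returns a list, so one input record can yield zero, one, or many outputs.
    """
    for record in input_topic:
        output_topic.extend(transform(record))
    return output_topic


def split_words(line):
    """Example transform: lower-case a line of text and split it into words."""
    return line.lower().split()


lines_topic = ["Kafka stores events", "Consumers read events"]
words_topic = run_stream_processor(lines_topic, split_words, [])
```

The input list is left unchanged, which mirrors the way a stream processor writes its results to a different topic rather than modifying the stream it reads.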

2) Degrading the performance of source systems. Pulling real-time data from diverse source systems can degrade the performance of those systems if not implemented correctly. Many organizations find that coupling Qlik Replicate to Apache Kafka helps negate source performance issues and accelerates data streaming from a wide variety of heterogeneous databases, data warehouses, and big data platforms.
