
Kafka vs. Pulsar

Eran Shaham

Architect and a Big-Data leader at Ford Research Center Israel LTD

Published: March 28, 2019

Kafka has been around for a long time. Perhaps too long...

I came across an article (titled "Pub/sub messaging: Apache Kafka vs. Apache Pulsar") the other day, and I thought I would add a few words from my own experience to this battlefield.

So, let's get down to business...

Kafka pros:

- It's very mature, with rich and useful documentation.
- Because it has been around for so long, it has a large, mature community of active users.
- Kafka Streams.
- It seems simpler to operate in production: there are fewer components, since the broker node also provides storage.
- A form of transactions.
- Offsets are exposed, so you have flexibility in where you fetch messages from (yet you can't fetch a specific message).

Kafka cons:

- A consumer can't acknowledge a message from a different thread.
- No multitenancy.
- No robust multi-datacenter replication; it is, however, offered in Confluent Enterprise.
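
The threading limitation in the first point is usually worked around by letting worker threads only report what they have finished, while the single thread that owns the consumer does the actual commit. Below is a minimal sketch of that hand-off using only the JDK. The names `OffsetTracker`, `markDone` and `committable` are made up for illustration and are not part of the Kafka client; real code would pass the returned value to the consumer's commit call on the polling thread. It relies on the Kafka convention that the committed offset is the next offset to read.

```java
import java.util.concurrent.ConcurrentSkipListSet;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: collects offsets finished by workers for a single committing thread. */
final class OffsetTracker {
    private final ConcurrentSkipListSet<Long> done = new ConcurrentSkipListSet<>();
    private long nextToCommit; // touched only by the owner (polling) thread

    OffsetTracker(long firstOffset) {
        this.nextToCommit = firstOffset;
    }

    /** Safe to call from any worker thread. */
    void markDone(long offset) {
        done.add(offset);
    }

    /**
     * Call from the owner thread only. Advances past the contiguous run of
     * finished offsets and returns the next offset to read, which is the
     * value one would commit.
     */
    long committable() {
        while (done.remove(nextToCommit)) {
            nextToCommit++;
        }
        return nextToCommit;
    }

    public static void main(String[] args) throws InterruptedException {
        OffsetTracker tracker = new OffsetTracker(0);
        ExecutorService workers = Executors.newFixedThreadPool(4);
        for (long offset = 0; offset < 10; offset++) {
            final long o = offset;
            // "Processing" happens on a pool thread; only the result is reported back.
            workers.submit(() -> tracker.markDone(o));
        }
        workers.shutdown();
        workers.awaitTermination(5, TimeUnit.SECONDS);
        // Back on the owner thread: this is where the commit would happen.
        System.out.println("next offset to commit: " + tracker.committable());
    }
}
```

A gap in the finished offsets holds the commit position back, so a slow record is never skipped on restart, at the cost of possibly reprocessing the records after it.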

Pulsar pros:

- Feature-rich: persistent and non-persistent topics, multitenancy, ACLs, multi-datacenter replication, and more.
- A more flexible client API that includes CompletableFutures, fluent interfaces, and more.
- For those who work multi-threaded, the Java client components are thread-safe: a consumer can acknowledge messages from different threads.
- It seems a bit easier to use. In Kafka, the broker is dumb and the consumers do the job of structuring communications as they see fit. This flexibility comes at a price: the Kafka user has to understand how to make the pieces fit together.
- You can do things that are not easily done, or may be impossible, in Kafka, such as multi-tenancy (for security and isolation), resource management (topic throttling and quotas), and geo-replication.
- It has some features that Kafka currently lacks, like seeking to a particular message via MessageId (yet you are lacking offsets).
- Pulsar scales to millions of topics, whereas Kafka is limited by the way it structures data in ZooKeeper.
- Easier deployment: a standalone Pulsar starts its own local ZooKeeper, so there is no need to start it manually.
- It's written in Java; Kafka, on the other hand, is a mix of Scala and Java code.

Pulsar cons:

- In terms of documentation, the Java client has little to none.
- A small community, with plenty of room to grow.
- Though very useful (for instance, a MessageId can be stored outside Pulsar and used to roll back to a specific message), the MessageId concept is heavily tied to BookKeeper: a consumer cannot easily position itself on the topic, compared to a Kafka offset, which is a continuous sequence of numbers.
- A Reader cannot easily read the last message in the topic; it needs to go through all the messages to the end.
- At the moment, no transactions are offered.
- More complexity, as ZooKeeper, broker nodes, and BookKeeper are all involved.
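
To make the positioning point concrete, here is a simplified model of the difference, using only the JDK. It assumes a position is just a (ledger, entry) pair and that the entry count of every ledger is known up front. Both are simplifications, and `PositionDemo`, `LedgerPosition`, `rewind` and `rewindOffset` are hypothetical names, not Pulsar or Kafka APIs. With a continuous offset, "go back n messages" is a subtraction. With ledger-based positions, the same move has to walk the ledger boundaries.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

final class PositionDemo {
    /** Hypothetical stand-in for a ledger-based message position. */
    static final class LedgerPosition {
        final long ledgerId;
        final long entryId; // entry ids restart at 0 in every ledger

        LedgerPosition(long ledgerId, long entryId) {
            this.ledgerId = ledgerId;
            this.entryId = entryId;
        }

        @Override
        public String toString() {
            return ledgerId + ":" + entryId;
        }
    }

    /** Kafka-style rewind: plain arithmetic on a continuous sequence. */
    static long rewindOffset(long offset, long n) {
        return Math.max(0, offset - n);
    }

    /**
     * Ledger-style rewind: needs the entry count of each earlier ledger
     * (ledgerId -> number of entries), which the position alone does not carry.
     */
    static LedgerPosition rewind(LedgerPosition from, long n,
                                 NavigableMap<Long, Long> entriesPerLedger) {
        long ledger = from.ledgerId;
        long entry = from.entryId;
        while (n > entry) {
            Long previous = entriesPerLedger.lowerKey(ledger);
            if (previous == null) {
                return new LedgerPosition(ledger, 0); // clamp at the start of the topic
            }
            n -= entry + 1; // step onto the last entry of the previous ledger
            ledger = previous;
            entry = entriesPerLedger.get(previous) - 1;
        }
        return new LedgerPosition(ledger, entry - n);
    }

    public static void main(String[] args) {
        NavigableMap<Long, Long> sizes = new TreeMap<>();
        sizes.put(3L, 5L); // ledger 3 holds entries 0..4
        sizes.put(7L, 4L); // ledger 7 holds entries 0..3
        System.out.println("offset rewind: " + rewindOffset(42, 3));
        System.out.println("ledger rewind: " + rewind(new LedgerPosition(7, 1), 3, sizes));
    }
}
```

Note that ledger ids need not be consecutive (3 and 7 here), which is another reason the pair cannot be treated as a single running number.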

What you take out of this is up to you... My thinking is to give Pulsar a fair chance on my next project.

Anthony Davis

Staff Customer Success Technical Architect at Confluent

5 years ago

No multi-tenancy in Kafka? Not sure I follow. Data center replication? How about stretch clusters and MirrorMaker? I think you were only considering Confluent Replicator?
