Building a robust streaming platform is complex and entails many design decisions. One critical decision is whether to put an API abstraction over the streaming technology (Kafka, Pulsar, Redpanda, etc.). The disadvantages are straightforward: one more abstraction, one more network hop, and a dependency on another team. The advantages are less obvious. Here are some key reasons an API abstraction over Kafka is critical, especially for large-scale streaming platforms.
Note: I use "Kafka" and "streaming technology" as interchangeable terms.
- Abstraction over streaming technology: Enterprises are often stuck with a particular streaming technology because migrating clients to a newer technology, or to a newer version of the same technology, is costly and disruptive to operations. The switching cost (stickiness) should be weighed against the strategic optionality that easy switching provides.
- Abstraction over topic location: As Kafka usage grows in an organization, one Kafka cluster often needs to be split into several isolated, federated Kafka environments. With an abstraction layer, consumers incur no cost when Kafka topics are split, reorganized, or redistributed across the federated clusters.
- Re-use of critical engineering assets: Many client engineering teams will argue that connecting to the streaming technology (Kafka) is relatively easy and that having no abstraction gives clients more control. This argument holds in smaller companies but breaks down in large enterprises, where hundreds of publishers have varying skill sets. The central streaming team ends up spending an enormous amount of time helping clients connect to Kafka and make their connections resilient to node failures. A single API abstraction centralizes that effort into one resilient, reusable asset to which all client teams can contribute. Without it, the cost of handholding many engineering teams is a tremendous burden.
- Adoption: An API also makes the streaming technology language-independent. Publishers can write in any language, including clients that cannot include an SDK or library, for example JavaScript apps.
- Resiliency: An API can increase end-to-end availability by switching to a secondary Kafka cluster when the primary Kafka cluster is down.
- Schema registration and validation: Every event consumer expects a valid schema, data, and format. Valid in this context means published according to the specification defined by the publisher. Consumers should not have to verify the schema, data, and format themselves. With an API abstraction, the schema and format can be validated at publication time (more on how in later writings).
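
To make the topic-location, resiliency, and schema-validation points concrete, here is a minimal sketch of what such a publishing layer could look like. It is an illustration under stated assumptions, not a reference design: `PublishGateway`, `Route`, and `InMemoryCluster` are hypothetical names, the in-memory cluster stands in for a real Kafka client, and the "schema" is only a map of required field names to Python types.

```python
from dataclasses import dataclass, field


class ClusterUnavailable(Exception):
    """Raised when a cluster cannot accept writes."""


@dataclass
class InMemoryCluster:
    """Hypothetical stand-in for a real Kafka/Pulsar producer client."""
    name: str
    available: bool = True
    log: list = field(default_factory=list)

    def send(self, topic: str, event: dict) -> None:
        if not self.available:
            raise ClusterUnavailable(self.name)
        self.log.append((topic, event))


@dataclass
class Route:
    """Physical placement and schema of one logical topic (assumed shape)."""
    primary: InMemoryCluster
    secondary: InMemoryCluster
    required_fields: dict  # field name -> expected Python type


class PublishGateway:
    """Publishers name a logical topic; the gateway decides where it goes."""

    def __init__(self, routes: dict):
        self._routes = routes  # logical topic name -> Route

    def publish(self, topic: str, event: dict) -> str:
        route = self._routes[topic]  # raises KeyError for unknown topics

        # Schema validation: reject bad events before they reach any cluster.
        for name, expected in route.required_fields.items():
            if not isinstance(event.get(name), expected):
                raise ValueError(
                    f"{topic}: field {name!r} must be {expected.__name__}"
                )

        # Resiliency: try the primary cluster first, then the secondary.
        for cluster in (route.primary, route.secondary):
            try:
                cluster.send(topic, event)
                return cluster.name  # tell the caller where the event landed
            except ClusterUnavailable:
                continue
        raise ClusterUnavailable(f"no cluster available for {topic}")


# Usage: the publisher never references a cluster directly.
east = InMemoryCluster("east")
west = InMemoryCluster("west")
gateway = PublishGateway(
    {"orders": Route(east, west, {"order_id": str, "amount": float})}
)
gateway.publish("orders", {"order_id": "A1", "amount": 9.5})  # goes to east
east.available = False
gateway.publish("orders", {"order_id": "A2", "amount": 3.0})  # fails over to west
```

Because publishers only know the logical topic name, moving "orders" to a different cluster means changing one `Route` entry in the gateway rather than reconfiguring every client. A real implementation would sit behind an HTTP or gRPC endpoint and would likely delegate validation to a schema registry.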
An API abstraction over Kafka is an excellent decision for scaling the streaming platform, driving its adoption, reducing cost, and preserving strategic optionality. If one searches the internet for the API abstraction pattern, one will find quite a few articles describing it as an anti-pattern. Here are some of the engineering teams that have adopted this pattern successfully.
Please let me know if I missed any pros or cons of having an API abstraction over Kafka.
Client Services @ National Stock Exchange IT
1 year ago: Seshu, thanks for sharing!
Transformation Technologist | Cloud Migration Expert | Strategic Partnering Leader | Author of "Building and Delivering Microservices on AWS"
2 years ago: What about the abstractions and benefits already provided by open-source APIs such as Spring for Kafka? Why bundle another abstraction that brings in another point of failure and additional compute cost for the organization? I have even seen organizations persist Kafka messages in databases because they believe their messages must not be lost. In my opinion, either that is not a correct use case for Kafka, or something is wrong with consumers that cannot process messages within the retention period.