On the peril of confusing Streams and Messages - Part 1
Over the last 6 months or so I've been part of many customer and partner workshops where the key topic was to review, re-engineer or to design from scratch a data and app architecture, cloud prime of course, with a particular twist towards data flow, event sourcing, and data integration.
Truly to my surprise, a common misconception revolves around message based architectures and event streaming and processing. Or at a different scale, the way one implements event-driven architectures.
I am not a purist. I don't fight religious wars. I am an outcome based buddy. To say, that I am not going to pontificate about the best way to design an event-driven architecture. I just want to humbly draw a few distinctions that I naively thought they were "established" but from my experience I gather they are often not.
The Problem
Let's get started with an example. There are a number of use cases implemented in the so-called "Smart City" context. For instance, smart waste management, smart car parking lot, smart community, etc. (everything is smart these days).
The diagram below is a very simplified perspective that illustrates just three categories of use cases.
The use cases above are all centred around a sort of "messaging" intelligence. This messaging construct can however be broken down in some key parts which are often overlooked. Actually,
I believe the root of confusion between streaming and messaging based architecture lies in the broad definition of "Message".
Admittedly, there are a few overlaps between messages and streaming, but the differences are more important than the commonalities!
Conceptual clarification
Hopefully no academic stuff but hey concepts have consequences! If you believe that requesting a purchase order to be processed is an event and handled accordingly, you may end up building a weak, sub-optimal architecture (to say the least). Still can't you see why? Read on.
Things happen and generate facts. These facts can take on three different shapes: Message, Discrete Event, Stream of Events.
Messages are facts that convey an expectation. As an analogy, think of messages as parts of a "conversation": they express an intent to get something done, and "the message is the content", namely they carry the payload e.g. the actual order document to be processed. Let's sum up a few nice properties:
- Payload included in the message i.e. the actual event is encoded
- Various type of messages exists such as "Command", "Request", "Update", "Query", etc but all expect some sort of acknowledge
- Message are transient, once consumed they disappear
On the other hand, Events are announcements of facts that have happened but have no expectations that the event will result in any action. They just report out what happened. In fact, they are very much time-aware, which proves to be one of the key traits here. In short:
- Events are facts, immutable, ordered, and lightweight
- They don't carry any expectations on how they will be used i.e. strict decoupling between producers and consumers (no contract among the parties)
- They persist and can be replayed (rewind/forward) like recorded on a traditional tape cassette
- There are two categories of events, Discrete, and Continuous.
Discrete events are things like an alarm that goes off, or the fact that a person has entered the building.
Continuous events are raised continually (stream) such as the tracking of shoppers in store, or telemetry such as pollution level, temperature, humidity, and light, to assures the operation of streetlights for instance, etc.
Needless to say, Messages, Discrete Events, and Streaming of Events necessitate different technologies and architectural patterns.*
Let's take a closer look.
Key Enabling Technologies
Fundamentally, messages and events are handled by a broker however very different in nature.
A traditional message broker is a essentially a queue implemented on a bunch of servers optimised for handling message streams. They are not designed for long-term storage but for transient messages. In fact, after a message has been consumed (upon an acknowledgement), the broker delete it (so keeping the size of the queue manageable in memory).
A few consequence of this paradigm:
- Consumers are generally asynchronous: upon message delivery to the broker, producer's wait is over i.e. it does not wait for the message to be processed.
- Keeping messages in memory by default and only write them to disk if the queue grows too large: such systems are fast when queues are short and become much slower when they start writing to disk.
- New consumers only starts receiving messages sent after the time they were registered; any prior messages are already gone and cannot be recovered.
- Particular care needs to be taken to delete queues whose consumers have been unregistered, on the pain of accumulating unnecessary messages and taking away memory from consumers that are still active.
This is a traditional view of message brokers encapsulated in standards such as JMS and AMQP, and implemented by for example the likes of ActiveMQ, IBM MQ, Google Cloud Pub/Sub, and naturally Azure Service Bus.
What if we could combine the low-latency notification facilities of messaging with the durable storage approach of classical databases? Please enter partitioned logs.
I call it the revenge of THE LOG! Neglected, kicked about, ridiculed - this little thing called log has run the full gauntlet. But the time came when the world suddenly longed for it. It is now "doing streams" as easy as falling off a log (pun intended ;-)).
A broker implemented as a log is not just an alternative implementation technique but a whole new way to stream data. In fact, a log is an append-only data structure to store immutable events. A producer sends events to the log which are appended at the end of it, and the consumers read them from the last one.
The log is on disk and behaves as a circular buffer, so old messages can get deleted - though the disk space is quite large
Essential to log is the concept of partitions. It is basically a mechanism to have multiple independent logs with their own data events hosted on different machines. This enables right off the bat higher scalability. In addition, partitions can be grouped together defining a "context" through topics, carrying events of the same type. Note that within a partition messages are ordered but across partitions the order is not guaranteed.
The following picture borrowed by (**) illustrates this paradigm even further:
A few implications worth mentioning:
- No event gets deleted (within the retention period defined). Thus the broker does not need to track acknowledgments for every single message— just the offset which is a monotonically increased sequence number.
- If a consumer node fails, another node in the consumer group is assigned the failed consumer’s partitions, and it starts consuming messages at the last recorded offset.
- It natively supports a fan-out model, meaning that consumers are independent and scale independently This fact is a big operational advantage: you can experimentally consume a production log for development, testing, or debugging purposes, without having to worry much about disrupting production services. When a consumer is shut down or crashes, it stops consuming resources—the only thing that remains is its consumer offset (compare this with the last bullet point above regarding the messaging)
- The events in the stream are analysable: You can process one or more input streams to produce one or more output streams. Streams may go through a pipeline consisting of several such processing stages before they eventually end up at an output that maybe a user or an application. In the next instalment we will take a closer look at this important "event processing" technologies.
- Replay: the offset is under the consumer’s control, so it can easily be manipulated if necessary: for example, you can start a copy of a consumer with yesterday’s offsets and write the output to a different location, in order to reprocess the last day’s worth of messages. You can repeat this any number of times. Or you can reconstruct the state by replaying events to make it suitable for applications e.g. for analytics purposes it is useful to know how the shopping cart came to have the current state.
The last point has an organizational impact that I will explore in future posts.
Apache Kafka, Amazon Kinesis Streams, Azure Event Hubs are log-based broker that implement this logic, achieving throughput of millions of messages per seconds by partitioning across multiple machines, and fault tolerance by replicating messages.
Conclusions
More should be said but the intention was to simply provide an exact definition of what messages are and what events are. This proves to be a good compass to orient the decision making process.
An interesting complement is to have a tool to navigate through the various solutions from a technical perspective. For example, it would be handy to know that in situations where messages may be expensive to process and you want to parallelize processing on a message-by-message basis, and where message ordering is not so important, the JMS/AMQP style of message broker is preferable. Good news is that such a tool exists, it is much broader than streaming, and is evolving very rapidly. Here is the link , while a short video was recorded not long ago.
Next up in the series, I will touch on discrete events and stream processing, along with related technologies. Finally, we will pull all pieces together to wrap this up.
Thanks for bearing with me.
. . .
* My colleague Clemens Vasters, has a vast and deep knowledge about messaging. These definitions are borrowed by his excellent work. Here an extremely useful overview.
** Martin Kleppmann, "Designing data-intensive applications" is a great source and I thank him a lot for having written such a useful book.
Retired from big tech. Not retired from riding the Architect Elevator to make IT and architecture a better place. Have opinions on EA, platforms, integration, cloud, serverless.
2 年I like the notion of discrete and continuous events. However, I consider them both kind of messages as messaging defines the interaction style