Microservices: Consistent State Propagation with Debezium Engine
This article outlines the main challenges of state propagation across different (micro)services together with architectural patterns for facing those challenges, providing an actual implementation using Debezium Engine.
But first of all, lets define some of the basic concepts that will be used during the article.
Setting the Stage
As playground for this article, we will use two bounded context defined as part of an e-learning platform:
Between both contexts there is an asynchronous, event-driven, relationship, so that whenever a course is created in the Course Management bounded context, an event is published and evententually received by the Rating System, which adapts the inbound event to its internal domain model using an anti-corruption layer (ACL) and creates an empty rating for that Course. Next image outlines this (simplified) context mapping:
The Challenge
The scenario is apparently pretty simple, we can just publish a message (CourseCreated) to a message broker (e.g. RabbitMQ) whenever a Course is created, having the Rating System subscribed to that event, it will be eventually received and processed by the downstream service. However, it is not so simple and we have several "what-if" situations, like:
In a nutshell, the main challenge here is how to ensure, in a distributed solution, that both, the domain state mutation and the corresponding event publication happens in a consistent and atomic operation, so that both or none should happen.
There are certainly solutions that can be implemented and message producer or message consumer sides to try to solve these situations, like retries, publishing compensation events, manually revert the write operation in the database...
However, most of them requires the software engineers having too many scenarios in mind, which is error prone and reduces codebase maintainability.
Another alternative is implementing a 2-phase-commit solution at infrastructure level, making more complex the deployment and operations of the underlying infrastructure, and most likely forcing the adoption of expensive commercial solutions.
During the rest of the article we will focus on a solution based on the combination of two important patterns in distributes systems: Transactional Outbox and Change Data Capture, providing a reference implementation that will allow software engineers to focus on what really matters: Providing domain value.
Patterns Can Help
As described above, we need to ensure that state mutation and the publication of a domain event are atomic operations. This can be achieved by the combination of two patterns which are nicely explained by Chris Richardson in his must-read Microservices Patterns book, therefore I will not explain them in detail here.
Transactional Outbox Pattern
Transactional Outbox focuses on persisting both state mutation and corresponding event(s) in an atomic database operation. In our case, we will leverage ACID capabilities of a relational database, with one transaction that includes two write operations, one in the domain specific table, and another that persists the events in an outbox (supporting) table.
This ensures than we will achieve a consistent state of both domain the corresponding events. This is shown in the next figure:
Change Data Capture with Trasaction Log Tailing
Once the events are available in the Oubox table, we need a mechanism to detect new events stored in the outbox (Change Data Capture - CDC) and publish them to external consumers (Transaction Log Tailing Message Relay).
The Message Relay is responsible for detecting (i.e. CDC) new events available in the outbox table and publish those events for external consumers via message broker (e.g. RabbitMQ, SNS + SQS) or event stream (e.g. Kafka, Kinesis).
There are different Change Data Capture (CDC) techniques for the Message Relay to detect new events available in the outbox. In this article we will use the Log Scanners approach, named Transaction Log Tailing by Chris Richardson, where the Message Relay with be tailing the database transation log to detect the new messages that have been appended to the Outbox table. I personally prefer this approach since it reduces the amoung of manual work, but might not be available for all databases.
Next image illustrates how the Message Relay integrates with the Transactional Outbox:
Once of the main goals of this solution, is ensuring that the software engineers of the two bounded context only need to focus on the elements with orange color in the diagram above, the grey components are just infrastructure elements that shall be transparent for the developers.
So, how do we implement the Transaction Log Tailing Message Relay? Debezium Engine is the answer.
Debezium
Debezium is a Log Scanner type change data capture solution that provides connectors for several databases, creating a stream of messages out of the changes detected in the database's transaction log. Debezium comes with two flavors:
In this example we will use Debezium Embedded, due to its simplicity (i.e. no Kafka instance is needed) but at the same time it is robuts enough to provided a suitable solution.
The first time a Debezium instance starts to track a database, it takes an snapshot of the current data to be used as a basis, once completed, only the delta of changes from the latest stored offset will be processed.
Debezium is highly configurable, being possible to shape its behaviour to meet different expectations, allowing for instance to define:
Some of this properties will be analyzed later in the article.
All Pieces Together - Deployment Architecture
All the components will be deployed as Docker containers in Docker Compose. Those components requiring durable storage will use a Docker volume.
The overall (simplified) deployment architecture is shown below:
Show Me The Code
All the code described in this article can be found in my personal GitHub repository.
Overall Project Structure
The code provided to implement this example is structured as a multi-module java maven project, leveraing Spring Boot and following hexagonal-architecture like structure.
There are three main package groups:
Toolkit Supporting Context
Modules providing shared components and infrastructure related elements used by the functional bounded contexts (in this example Course Management and Rating System). For instance, the transactional outbox and the Debezium-based change data capture are shared concerns and therefore their code belongs to these modules.
Where:
Course Management Bounded Context
These modules conform the Course Management bounded context. The module adhere to hexagonal architecture principles, similar to the structure already used in my previous article about repository testing.
Where:
Rating System Bounded Context
For the sake of simplicity, this bounded context is partially implemented with only an inbound AMPQ-based adapter for receiving the messages created by the Course Management service when a new course is created and published by the CDC service (toolkit-state-propagation-debezium-runtime)
领英推荐
Where:
Request Flow
This section outlines the flow of a request for creating a course, starting when the user requests the creation of a course to the backend API and finalizing with the (consistent) publication of the corresponding event in the message broker.
This flow has been splitted in three phases.
Phase 1: State Mutation and Domain Events Creation
Starts with the request for creating a new Course Definition. The HTTP post request is mapped to a domain command and processed by its corresponding command handler defined in course-management-application. Command handlers are automatically injected in the provided CommandBus implementation, in this example the CommandBusInProcess defined in the toolkit-core module:
The command handler creates an instance of the CourseDefinition entity. The business logic and invariants (if any) of creating a Course Definition are encapsulated within the domain entity. The creating of a new instance of the domain entity also comes with the corresponding CourseDefinitionCreated domain event:
The event is "recorded" into the created course definition instance. This method is defined in the abstract class Entity of the toolkit-core module:
Once the course definition instance is in place, the command handler will persists the instance in the course definition repository, starting the second phase of the processing flow.
Phase 2: Persisting the State: Course Definition and Events in the Outbox
Whenever a domain entity is saved in the repository, the domain events associated to the domain state mutation (in this example, the creating of a CourseDefinition entity) are temporarly appended to an in-memory, ThreadLocal, buffer. This buffer resides in the DomainEventAppender of the toolkit-core.
Events are placed in this buffer from an aspect executed around methods annotated with AppendEvents. The pointcut and aspect (both in toolkit-core) look like:
The command handlers are automatically decorated before "injected" in the command bus. One of the decorator is ensuring the command handlers is transactional, and another ensures when the transaction completes the events in the in-memory-thread-local buffer are published to the outbox table consistently with the on going transaction. Next sequence diagram shows the decorators applied to the domain-specific command handler.
The outbox is an append only buffer (as a postgres table in this example) where a new entry for each event is added. The outbox entry has the following structure:
Where the actual event payload is serialized as a json string in the payload property. The outbox interface is straightforward:
The postgres implementation of the Outbox interface is placed the toolkit-outbox-jpa-postgres module:
The phase 2 is completed, now both the domain state and the corresponding event are consistently persisted, under the same transaction, in our postgres database.
Phase 3: Events Publication
In order to make the domain events generated in the previous phase available to external consumers, the message relay implementation based on Debezium Embedded is monitoring the outbox table, so that whenever a new record is added to the outbox, the message relay creates a Cloud Event and publishes it to the RabbitMQ instance following the Cloud Event AMQP binding specification.
The following code snippet shows the Message Relay specification as defined in the toolkit-core module:
And the Debezium Embedded based implementation:
As can be seen in the code snippet above, the DebeziumEngine is configured to notify a private method handleChangeEvent when a change in the database is detected. In this method a Consumer of CdcRecord is used as a wrapper of the internal Debezium model represented by the Struct class. Initial configuration must be provided to the Debezium Engine, in the example this is done witht he DebeziumConfigurationProvider class:
The most relevant properties are outlied below:
The first thing Debezium will do after starting is taking a snapshot of the current data and generating the corresponding events. After that the offset is updated to the latest record and the deltas (newly added, updated or deleted records) are processed, updating the offset accordingly. Since in the provided example we are using an in-memory based offset store, the snapshot is performed always after starting the service. Therefore, this is not a production ready implementation yet, there are two options:
The Message Relay is configured and initialized in the toolkit-state-propagation-debezium-runtime module. The values of the configuration properties needed by Debezium Embedded are defined in the Spring Boot properties.yaml file:
The engine is started using the Spring Boot lifecycle events:
The change data capture records (CdcRecord) are processed by the CloudEventRecordChangeConsumer which creates the Cloud Event representation of the CDC record and publishes it through the MessagePublisher.
The provided MessagePublisher is a simple RabbitMQ outbound adapter converting the Cloud Event to the corresponding AMQP message as per the Cloud Event AMQP protocol binding.
After Phase 3, the events are published to the producer's (Course Management Service) message relay RabbitMQ exchange defined under the convention __MR_<APP_NAME>, in our example: __MR_course-management, messages are routed to the right exchange based on a custom Cloud Event extensions as shown in the previous code snippet.
Alternative Solutions
This example makes use of Debezium Embedded for providing a change data capture and message relay solution. This works fine for technologies supported by Debezium through its connectors.
For non-supported providers, alternative approaches can be applied:
Adding any of those alternatives in this example, would imply just providing specific implementations of the MessageRelay interface, without additional changes in any of the other services.
Conclusion
Ensuring consistency in the state propagation and data exchange between services is key for providing a reliable distributed solution. Usually this is not carefully considered when designing a distributed, event-driven software, leading to undersired states and situations especially when those systems are already in production.
By the combination of the transactional outbox pattern and message relay, we have seen how this consistency can be enforced, and by using hexagonal architecture style, the additional complexity of implementing those pattern can be easily hidden and reused across bounded context implementations.
The code of this article is not yet production ready, concerns like observability, retries, scalability (e.g. with partitioning), proper container orchestration, etc. are still pending. Subsequent articles will go through those concerns using the provided code as a basis.
Remote Database performance and HA expert for Postgres & MySQL | I help your company scale to thousands of users ?? keep existing users ?? & protect their data ??? #Postgres #PostgreSQL #MariaDB #MySQL #DBA #Freelance
9 个月sounds like a fun read. leveraging debezium engine is clutch for consistency. David Cano
Creating affordable websites for business owners to enhance their online presence ?? Specializing in architects ??? | Co-Founder of @FlowPhoenix | Webflow developer and designer ??
9 个月Great insights on state propagation in microservices! Debezium Engine and DDD concepts are invaluable.