The Challenges of Event-Driven Architecture: Dealing with the Dual Write Anti-Pattern


Contemporary applications employ Event-Driven Microservices to harness the benefits of autonomous deployment and scalability offered by Domain services while maintaining loose coupling between these services.

If your application follows a Microservices Architecture, where each Domain service manages its own data in a dedicated Datastore and communicates with other services asynchronously, typically by emitting Domain events to participate in a Saga (a long-running business transaction) or to replicate data across services, there is a significant likelihood that you have implemented this communication using the Dual Write Anti-Pattern.

Is this pattern something to be concerned about? Here is the short answer for when you do not need to worry.


If it is acceptable for your application to occasionally lose Business domain events, and to live with the resulting data inconsistencies across services, then you can absolutely ignore this. If that is not the case, you need to understand this anti-pattern well and fix it.

The Dual Write Anti-Pattern refers to a scenario in which a domain service needs to perform write operations on two distinct systems, such as data storage and event brokers, within a single logical business transaction. The goal is to achieve eventual data consistency across various services. However, there is no assurance that both data systems will always be updated successfully, or conversely, that neither will be updated during this process.

Yes, you are thinking along the right lines: we want something like a Database ACID transaction, but spanning two different kinds of systems. And we cannot lean on a Distributed Transaction implementation, because it is either not feasible or is ruled out by the inherent scalability issues of Distributed Transaction frameworks.

Let’s understand this better with a simple use case.

In the provided scenario, the business objective is quite straightforward: whenever a user publishes a Feed post, it's essential to have the Content Moderation services examine the post. If any concerns are detected, the user should receive a notification, prompting them to either delete or edit the post. The Feed Microservice is responsible for managing Feed post requests from the User Interface. It not only stores the feed post data in the Database but also triggers the publication of a FeedPosted Domain event on the Event Broker. This event serves as a signal for the Content Moderation Services to take appropriate actions.

Moreover, the developer has taken meticulous steps to ensure that this entire process appears as a unified and cohesive business transaction. The pseudocode snippet below illustrates this approach:


try
{
  Start DB Transaction
  Write into DB
  Push Event to Event Broker
  Commit DB Transaction
}
catch()
{
  Rollback DB Transaction
}        

In the provided pseudocode, the following scenarios behave as expected:

  1. When both the Database and Event Broker are functioning correctly, data is successfully written to both systems.
  2. If an error occurs while writing to the Event Broker, the catch() block is executed and data is written to neither system.

Only in the edge case where the Database transaction commit fails (and it very well can fail) is the requirement not met: the event is written to the Event Broker, but the data is not saved in the Database.

This can easily lead to a User Experience or Reliability issue: the user sees an error on the User Interface saying the Feed Post could not be saved, yet later receives an email asking them to delete or edit the post because the Content Moderation service found it inappropriate.
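To make this failure mode concrete, here is a minimal sketch of the same dual write in Python, assuming PostgreSQL accessed through psycopg2 and Kafka as the Event Broker via kafka-python; the feed_posts table, the feed-posted topic, and the connection details are illustrative assumptions rather than part of the original example.

import json

import psycopg2
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
conn = psycopg2.connect("dbname=feeds user=app")

def publish_feed_post(post_id, user_id, content):
    # One logical business transaction spanning two systems (the anti-pattern)
    try:
        with conn.cursor() as cur:
            # 1. Write the business data inside the open DB transaction
            cur.execute(
                "INSERT INTO feed_posts (id, user_id, content) VALUES (%s, %s, %s)",
                (post_id, user_id, content),
            )
            # 2. Publish the FeedPosted event while the DB transaction is still open
            producer.send("feed-posted", {"type": "FeedPosted", "postId": post_id})
            producer.flush()
        # 3. Commit the DB transaction; if this commit fails, the event is already
        #    on the broker but the row never reaches the database
        conn.commit()
    except Exception:
        conn.rollback()
        raise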

So, what do we do now to handle this situation? One solution we already ruled out is a Distributed Transaction. So, what next? Here are some of the possible options.


Approach 1 — Publish the event after data is saved into the Database

In this scenario, after the data has been successfully written to the database, the service then tries to publish the event to the Event Broker. Ideally this works smoothly, but if it fails for any reason, you can store the event in persistent storage, which might even be the same database, and set up a scheduled task (like a Cron Job) to periodically retry publishing the event to the Event Broker. While this approach seems logical, it does have some drawbacks; a rough sketch of the idea follows the list below.

  1. This approach can lead to problems with the ordering of published domain events. For instance, if publishing a FeedCreated event fails, but the user then successfully deletes the same feed post and that deletion is propagated to downstream systems, you end up with a FeedDeleted event dispatched first and the FeedCreated event sent later by the Cron Job. Such a scenario can create data consistency problems, so if maintaining a specific order of events is a crucial requirement for your system, this approach may not be suitable.
  2. If an event that is supposed to be published at a later time cannot be stored in durable storage due to certain issues, there is a risk of losing those events. Another approach is to keep a marker in the business record within the database table to indicate whether the event has been synchronized. However, this approach essentially ties your event publishing requirements to the primary business entity, which may not be ideal.
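As a rough sketch of this approach, continuing with the psycopg2 connection, Kafka producer, and feed_posts table assumed earlier, the pending_events table and the retry job below are hypothetical names introduced purely for illustration.

def save_post_then_publish(conn, producer, post_id, user_id, content):
    # Step 1: commit the business data on its own first
    with conn.cursor() as cur:
        cur.execute(
            "INSERT INTO feed_posts (id, user_id, content) VALUES (%s, %s, %s)",
            (post_id, user_id, content),
        )
    conn.commit()

    event = {"type": "FeedPosted", "postId": post_id, "userId": user_id}
    try:
        # Step 2: best-effort publish to the Event Broker
        producer.send("feed-posted", event)
        producer.flush()
    except Exception:
        # Fallback: park the event so a scheduled job can retry it later;
        # if this insert also fails, the event is lost (drawback 2 above)
        with conn.cursor() as cur:
            cur.execute(
                "INSERT INTO pending_events (payload) VALUES (%s)",
                (json.dumps(event),),
            )
        conn.commit()

def retry_pending_events(conn, producer):
    # Run periodically (for example from a Cron Job) to drain failed publishes;
    # note that retried events arrive later than newer ones (drawback 1 above)
    with conn.cursor() as cur:
        cur.execute("SELECT id, payload FROM pending_events ORDER BY id")
        for event_id, payload in cur.fetchall():
            producer.send("feed-posted", json.loads(payload))
            producer.flush()
            cur.execute("DELETE FROM pending_events WHERE id = %s", (event_id,))
    conn.commit()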

Approach 2 — Use Outbox Pattern

One of the recommended strategies for managing the Dual Write Anti-Pattern works in two steps. First, the service stores the business data in the database and, within the same database transaction, records the event that needs to be published in a separate table known as the Outbox Table. This capitalizes on the ACID properties of the database, ensuring that the business data and the pending event are saved as part of a single transaction.

However, the event intended for publication to the Event Store is not immediately published at this point. Instead, an external process is responsible for reading the records from the Outbox Table. Subsequently, it publishes the event to the Event Store. This process ultimately leads to the achievement of eventual data consistency and effectively addresses issues associated with the Dual Write problem.
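Here is a minimal sketch of that first step, the transactional write, again assuming the psycopg2 setup from the earlier examples; the outbox table and its columns are an illustrative layout, not a prescribed schema.

import uuid

def publish_feed_post_with_outbox(conn, post_id, user_id, content):
    # The business row and the outbox row are written in ONE database transaction
    event = {
        "eventId": str(uuid.uuid4()),
        "type": "FeedPosted",
        "postId": post_id,
        "userId": user_id,
    }
    try:
        with conn.cursor() as cur:
            cur.execute(
                "INSERT INTO feed_posts (id, user_id, content) VALUES (%s, %s, %s)",
                (post_id, user_id, content),
            )
            cur.execute(
                "INSERT INTO outbox (event_id, topic, payload, published) "
                "VALUES (%s, %s, %s, FALSE)",
                (event["eventId"], "feed-posted", json.dumps(event)),
            )
        conn.commit()  # both rows are committed together, or neither is
    except Exception:
        conn.rollback()
        raise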

With this approach -

  1. There is a guarantee that events will eventually be published to the Event Store
  2. Events will never be lost, even if the Event Store is unavailable at the time of publishing
  3. Ordering of the events can be ensured
  4. The write path only requires one additional insert into the Outbox Table within the existing business transaction

But these benefits do not come for free.

  1. You need to put in additional effort to write the external processor that reads records from the Outbox Table and publishes them to the Event Store (a rough sketch follows this list)
  2. This external component also becomes a Single Point of Failure, so it needs solid monitoring and automated corrective measures should something go wrong
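To make that external processor less abstract, here is a minimal polling-based relay sketch over the hypothetical outbox table from the previous example; a real deployment would also need locking or leader election so that only one relay instance runs at a time, and idempotent consumers, because a crash between publishing and marking a row can cause duplicates.

import time

def relay_outbox(conn, producer, batch_size=100, poll_interval=1.0):
    # Poll the Outbox Table and publish unpublished events in insertion order
    while True:
        with conn.cursor() as cur:
            cur.execute(
                "SELECT id, topic, payload FROM outbox "
                "WHERE published = FALSE ORDER BY id LIMIT %s",
                (batch_size,),
            )
            for row_id, topic, payload in cur.fetchall():
                producer.send(topic, json.loads(payload))
                producer.flush()
                cur.execute(
                    "UPDATE outbox SET published = TRUE WHERE id = %s",
                    (row_id,),
                )
        conn.commit()
        time.sleep(poll_interval)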

Here is a pictorial representation of this approach

There are different ways to implement the Outbox pattern, and there are some design-level issues one needs to think through, such as:

  1. If a service publishes events for multiple Domain-level entities, do I need one Outbox table per Domain entity or one Outbox table per service?
  2. How do I clean up the Outbox table so that it does not grow indefinitely?
  3. And many more…

In my next article, I will cover the Outbox pattern implementation in detail, including the above design decisions plus something called Change Data Capture (CDC).

Hope you enjoyed reading this article. Do share it with your friends if it has helped you in any way.

Happy reading…Cheers!!!

#Microservices #MicroservicesCommunication #MicroservicesAntiPattern #DualWriteAntiPattern #MicroservicesArchitecture #EventDrivenArchitecture
