Transactional Outbox Pattern - Distributed Design Patterns

As we work with more complex distributed systems, we often come across use cases where we need to perform operations on multiple data stores atomically and consistently.

Let’s assume that we are persisting order data into an RDBMS, and the ML team wants to run analytics on this data. We have the following options:

  1. Grant the ML team access to our DB. This tightly couples the Order Service to the ML team: any change to the Order schema has to be coordinated across both teams, so it isn’t the preferred approach.
  2. The Order Service writes to another DB owned by the ML team using the two-phase commit (2PC) protocol. 2PC is a blocking protocol and performs poorly because every transaction has to be coordinated across multiple nodes, so it isn’t the preferred approach either.
  3. Push the Order data onto a message broker like Kafka. The ML team can then run a consumer that reads the data off Kafka, persists it in their own DB, and performs analytics against that DB.

We’ve decoupled the Order Service from the Analytics Service and everyone is happy! (Or so you think!)

There are multiple failure scenarios here:

  1. The Order Service successfully persisted the order in the database but crashed before it could send the message to the broker. The message is lost, so the ML team will not have all the orders to run their analytics on (making the analytics wrong or skewed).
  2. The Order Service successfully sent the message to the broker, but the transaction on the database failed. This leaves orphaned/false records with the ML team, again skewing their analytics.
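The first failure scenario can be reproduced with a small in-memory sketch. Everything here is hypothetical and exists only for illustration: `fakeDB` and `fakeBroker` stand in for the RDBMS and Kafka, and the `down` flag mimics a crash or outage between the two writes.

```go
package main

import "errors"

// Order is a hypothetical order record used only for this illustration.
type Order struct {
	ID    string
	Value int
}

// fakeDB is an in-memory stand-in for the RDBMS.
type fakeDB struct{ orders []Order }

// fakeBroker is an in-memory stand-in for the message broker.
type fakeBroker struct {
	messages []Order
	down     bool // when true, publish fails, mimicking a crash or outage
}

func (d *fakeDB) insert(o Order) { d.orders = append(d.orders, o) }

func (b *fakeBroker) publish(o Order) error {
	if b.down {
		return errors.New("broker unavailable")
	}
	b.messages = append(b.messages, o)
	return nil
}

// naiveDualWrite stores the order first and publishes second.
// Nothing ties the two steps together, so a failure between them
// leaves the database and the broker disagreeing.
func naiveDualWrite(db *fakeDB, broker *fakeBroker, o Order) error {
	db.insert(o)             // step 1: the order is now stored
	return broker.publish(o) // step 2: may fail after step 1 succeeded
}
```

Calling `naiveDualWrite` with `down` set to true leaves one order in `fakeDB` and no message in `fakeBroker`. Swapping the two steps does not help; it only trades this scenario for the second one.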

Outbox Pattern

The Outbox Pattern comes to the rescue here. We use an Outbox table to store a record of the operations we’re performing on the database. The Order Service writes to both the Order table and the Outbox table as part of the same transaction, which ensures that the two writes are atomic (1).

Once the record is inserted into the Outbox table, an asynchronous process (2) reads it and publishes it to the message broker (3).
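One way the asynchronous process could look is a relay that repeatedly publishes a batch of pending rows. This is a minimal sketch under stated assumptions, not a definitive implementation: `OutboxStore`, `Publisher`, `relayOnce` and the in-memory types are hypothetical names, and the sketch assumes the store can tell which rows are still pending (for example through an extra status column, or by deleting rows once they are published).

```go
package main

import "errors"

// OutboxRow mirrors one row of the Outbox table.
type OutboxRow struct {
	ID        string
	EventType string
	Payload   []byte
}

// OutboxStore is a hypothetical abstraction over the Outbox table.
type OutboxStore interface {
	// FetchPending returns up to limit rows that are not yet published.
	FetchPending(limit int) ([]OutboxRow, error)
	// MarkPublished records that the broker accepted the row.
	MarkPublished(id string) error
}

// Publisher is a hypothetical abstraction over the message broker.
type Publisher interface {
	Publish(row OutboxRow) error
}

// relayOnce publishes one batch and returns how many rows were completed.
// A row is marked only after the broker accepted it, so a crash between
// the two calls causes a re-publish on the next run (at-least-once).
func relayOnce(store OutboxStore, pub Publisher, batchSize int) (int, error) {
	rows, err := store.FetchPending(batchSize)
	if err != nil {
		return 0, err
	}
	published := 0
	for _, row := range rows {
		if err := pub.Publish(row); err != nil {
			// Stop at the first failure to keep ordering; retry next run.
			return published, err
		}
		if err := store.MarkPublished(row.ID); err != nil {
			return published, err
		}
		published++
	}
	return published, nil
}

// memStore is an in-memory OutboxStore used to exercise relayOnce.
type memStore struct {
	rows      []OutboxRow
	published map[string]bool
}

func (s *memStore) FetchPending(limit int) ([]OutboxRow, error) {
	var pending []OutboxRow
	for _, r := range s.rows {
		if !s.published[r.ID] && len(pending) < limit {
			pending = append(pending, r)
		}
	}
	return pending, nil
}

func (s *memStore) MarkPublished(id string) error {
	s.published[id] = true
	return nil
}

// memPublisher records published IDs and fails for IDs listed in failOn.
type memPublisher struct {
	sent   []string
	failOn map[string]bool
}

func (p *memPublisher) Publish(row OutboxRow) error {
	if p.failOn[row.ID] {
		return errors.New("broker rejected " + row.ID)
	}
	p.sent = append(p.sent, row.ID)
	return nil
}
```

In a real service, `relayOnce` would typically run in a loop on a timer, with `OutboxStore` backed by the Outbox table and `Publisher` backed by the broker client. Marking the row after publishing, rather than before, is the choice that trades possible duplicates for no lost messages.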

QQ: What does the Outbox pattern remind you of? Hint: WAL
Figure: Outbox Pattern


Advantages of Outbox Pattern

The Outbox Pattern provides several benefits over other messaging patterns. Some of the major advantages of the Outbox Pattern are as follows:

  1. Reliability: With the Outbox Pattern, messages are persisted in the database in the same transaction as the business change. A committed change therefore always has a matching message, and the publishing process can keep retrying until the broker accepts it, even after system failures or network issues. Delivery is at-least-once, so a message may occasionally be published more than once.
  2. Scalability: The Outbox Pattern can absorb bursts of messages without overwhelming the message broker. Since messages are persisted in the database first, the publishing process can hand them to the broker at a controlled rate.
  3. Performance: The Outbox Pattern can be faster than synchronous messaging because the producing microservice does not wait on the broker or on other microservices. It completes the business transaction and returns a response, while the message is sent asynchronously in the background.
  4. Decoupling: The Outbox Pattern allows microservices to be loosely coupled. Each microservice can focus on its specific business logic and ignore the details of how messages are sent and received.
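Because the publishing process may send the same message twice (for example, if it crashes after publishing but before recording that it did), consumers usually need to tolerate duplicates. The following is a hedged sketch of one way to do that by remembering message IDs. `Deduplicator` and `HandleOnce` are hypothetical names, and the in-memory map is an assumption made to keep the example short.

```go
package main

import "sync"

// Deduplicator remembers which message IDs have already been handled.
// IDs are kept in memory for illustration; a real consumer would more
// likely store them in its own database alongside the side effect.
type Deduplicator struct {
	mu   sync.Mutex
	seen map[string]struct{}
}

// NewDeduplicator returns an empty Deduplicator.
func NewDeduplicator() *Deduplicator {
	return &Deduplicator{seen: make(map[string]struct{})}
}

// HandleOnce runs handle only the first time a given message ID is seen
// and reports whether handle was invoked. If handle fails, the ID is not
// recorded, so a redelivery of the same message is processed again.
// The lock is held while handle runs, which serialises all handlers.
func (d *Deduplicator) HandleOnce(id string, handle func() error) (bool, error) {
	d.mu.Lock()
	defer d.mu.Unlock()
	if _, dup := d.seen[id]; dup {
		return false, nil
	}
	if err := handle(); err != nil {
		return true, err
	}
	d.seen[id] = struct{}{}
	return true, nil
}
```

The Analytics Service consumer could wrap its "persist this order" step in `HandleOnce`, keyed on the message ID, so that a redelivered order is counted only once.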

Alternatives to Outbox Pattern

If the Outbox Pattern is not suitable for your use case, there are a few alternative messaging patterns you can consider:

  1. Direct Messaging: This pattern involves a direct synchronous request between microservices. It can be a good option for low-latency, low-volume communication.
  2. Database Trigger: Another option is a database trigger that detects changes in the database and writes the corresponding messages to the messaging infrastructure.
  3. CDC (Change Data Capture): As with database triggers, the database itself drives the publishing, but here CDC reads the changes from the transaction log. Only committed transactions show up in the CDC stream, so you can rely on it as your source of truth. The caveat is that you might not have direct access to the transaction log (e.g. the binlog) and might need a third-party system like Debezium to read from it.
  4. Publish-Subscribe Pattern: This pattern involves a message broker that allows multiple microservices to subscribe to specific message types. It can be a good option for high-volume, low-latency communication. In the above case, a common message broker could be used by both the Order Service and the Analytics Service.

Sample Implementation

Here is a simple example of how you can implement the Outbox Pattern in Golang using a PostgreSQL database:

  1. Create a message struct that contains the message data:

type Message struct {
    ID        string `json:"id"`
    EventType string `json:"event_type"`
    Payload   []byte `json:"payload"`
}

  2. Create an Outbox table in the database:

CREATE TABLE outbox (
    id uuid PRIMARY KEY,
    event_type text NOT NULL,
    payload bytea NOT NULL,
    created_at timestamp NOT NULL DEFAULT NOW()
);

  3. Insert a message into the outbox table in a database transaction:

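import "database/sql"

// sendMessage writes the order and its outbox record in one transaction,
// so either both rows are committed or neither is.
func sendMessage(db *sql.DB, message *Message) error {
    tx, err := db.Begin()
    if err != nil {
        return err
    }
    // Rollback after a successful Commit does nothing (it returns sql.ErrTxDone),
    // so deferring it covers every early return and any panic.
    defer tx.Rollback()

    // Arguments elided; bind the order's values here.
    _, err = tx.Exec("INSERT INTO orders(id, order_value, order_qty) VALUES ($1, $2, $3)", ...)
    if err != nil {
        return err
    }

    // Arguments elided; presumably the fields of message.
    _, err = tx.Exec("INSERT INTO outbox(id, event_type, payload) VALUES ($1, $2, $3)", ...)
    if err != nil {
        return err
    }

    // Return the commit error to the caller instead of panicking.
    return tx.Commit()
}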

This brings us to the end of this article. We talked about the problem that the outbox pattern solves, its advantages, and what the alternatives to it could be. We also saw a sample snippet showing how you could implement a transactional outbox pattern in Golang & Postgres. Please post any doubts you might have in the comments and I will be happy to discuss them!


Thank you for reading! I’ll be posting weekly content on distributed systems & patterns, so please like, share and subscribe to this newsletter for notifications of new posts.

Please comment on the post with your feedback, it will help me improve! :)

Until next time, Keep asking questions & Keep learning!

Pratik Pandey

Senior Software Engineer at Booking.com | AWS Serverless Community Builder | pratikpandey.substack.com

1 year ago

Subscribe to my LinkedIn newsletter to get updates on any new System design posts - https://www.dhirubhai.net/newsletters/system-design-patterns-6937319059256397824/ You can also follow me on Medium - https://distributedsystemsmadeeasy.medium.com/subscribe

Kaivalya Apte

The GeekNarrator Podcast | Staff Engineer | Follow me for #distributedsystems #databases #interviewing #softwareengineering

1 year ago

I like CDC (Change Data Capture) over outbox pattern because it loosely couples the application from data publishing. It is quite flexible and easy to configure. Also doesn’t need an additional table. Transaction logs already have the change data we need. Any specific use case where you think outbox pattern works better?
