Two-Phase Commit (2PC) - Distributed Design Patterns
Pratik Pandey
Senior Software Engineer at Booking.com | AWS Serverless Community Builder | pratikpandey.substack.com
The Two-Phase Commit protocol is a distributed algorithm that ensures that a transaction is either committed or rolled back consistently across all nodes in a distributed system. The protocol involves two phases, as the name suggests.
Phase 1 - Prepare Phase: The coordinator node sends a prepare message to all participating nodes, asking whether they are ready to commit the transaction. Each participant acquires a “lock” on the resources involved and replies with either a Yes or a No message, indicating whether it can commit.
Phase 2 - Commit Phase: The coordinator decides whether to commit or abort the transaction based on the responses received in the Prepare phase. If all participants have responded with a Yes message, the coordinator sends a commit message to all the participants. If any participant has responded with a No message (or has not responded in time), the coordinator sends an abort message to all the participants, and the transaction is rolled back.
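The two phases can be sketched in a few lines of Python. This is a minimal in-memory illustration, not a production implementation: the `Participant` class, its `can_commit` flag and the `two_phase_commit` function are hypothetical names made up for this example, and real participants would be remote services reached over the network.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    """Hypothetical in-memory participant (a real one would be a remote service)."""
    name: str
    can_commit: bool = True   # assumption: simulates whether local checks pass
    state: str = "INIT"       # INIT -> PREPARED -> COMMITTED or ABORTED
    locked: bool = False

    def prepare(self) -> bool:
        # Phase 1: lock the resource and vote Yes, or vote No without locking.
        if not self.can_commit:
            self.state = "ABORTED"
            return False
        self.locked = True
        self.state = "PREPARED"
        return True

    def commit(self) -> None:
        # Phase 2 (all voted Yes): apply the change and release the lock.
        self.state = "COMMITTED"
        self.locked = False

    def abort(self) -> None:
        # Phase 2 (someone voted No): roll back and release the lock.
        self.state = "ABORTED"
        self.locked = False


def two_phase_commit(participants: list[Participant]) -> str:
    # Phase 1: the coordinator collects a vote from every participant.
    votes = [p.prepare() for p in participants]
    # Phase 2: commit only on a unanimous Yes, otherwise abort everywhere.
    if all(votes):
        for p in participants:
            p.commit()
        return "COMMITTED"
    for p in participants:
        p.abort()
    return "ABORTED"
```

Calling `two_phase_commit([Participant("orders"), Participant("payments", can_commit=False)])` aborts both participants, because a single No vote is enough to roll back the whole transaction.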
Advantages of Two-Phase Commit Protocol
- Consistency: 2PC guarantees that either all participating nodes commit the transaction or all of them roll it back, ensuring consistency across the distributed system.
- Atomicity: The 2PC protocol ensures that the transaction is an atomic operation, meaning it either completes successfully or not at all.
- Simplicity: The 2PC protocol is relatively simple and easy to understand, making it an attractive option for coordinating transactions in distributed systems.
Pitfalls of Two-Phase Commit Protocol
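- Scalability: The 2PC protocol can be challenging to scale for large distributed systems. Every participating node has to exchange messages with the coordinator in both phases, so communication overhead grows with the number of nodes, and a single slow node delays the whole transaction.
- Single Point of Failure: The coordinator node is a single point of failure in the 2PC protocol. If the coordinator fails after participants have voted Yes, those participants are blocked: they must keep holding their locks and cannot commit or abort on their own until the coordinator recovers and tells them the outcome.
- Performance: The 2PC protocol can have a significant impact on the performance of the system, particularly in high write throughput scenarios. Every transaction needs two rounds of messages, and participants hold their locks from the prepare phase until the final decision arrives, so other writes to the same resources have to wait.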
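To get a rough feel for the scalability and performance costs, here is a back-of-the-envelope model in Python. It rests on simplifying assumptions that are not from any particular system: the coordinator contacts all participants in parallel, each participant exchanges four messages per transaction (prepare, vote, decision, acknowledgement), and each phase takes as long as the slowest participant's round trip. The function name `transaction_cost` and the sample round-trip times are purely illustrative.

```python
def transaction_cost(round_trip_ms: list[float]) -> tuple[int, float]:
    """Return (message_count, latency_ms) for one transaction under a simple model.

    round_trip_ms holds one coordinator<->participant round-trip time per participant.
    """
    if not round_trip_ms:
        raise ValueError("at least one participant is required")
    n = len(round_trip_ms)
    # Assumption: prepare, vote, decision and acknowledgement per participant.
    messages = 4 * n
    # Assumption: each of the two phases waits for the slowest participant.
    latency_ms = 2 * max(round_trip_ms)
    return messages, latency_ms


# Illustrative numbers only: three participants, one of them slow.
messages, latency_ms = transaction_cost([5.0, 8.0, 40.0])
```

In this model the message count grows linearly with the number of participants, while the latency is dictated by the slowest one, which is also how long the faster participants keep their locks.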
Implementation Caveats
It’s important to keep the following items in mind, which will help you implement a stable 2PC protocol:
- The coordinator acts as an orchestrator in the 2PC protocol, and since it manages the state of a transaction, it must be able to recover from failures. To achieve this, the coordinator should persist its state to disk, so that it can read that state back after recovering from a crash.
Eg: When a coordinator starts a transaction, it persists the fact that it is sending prepare messages to the different services. Once it gets the responses from the services, it persists them to disk before sending out the commit phase messages. Now, even if the coordinator crashes, it can recover by reading the recorded responses and re-sending the commit (or abort) messages to the different services. Something like a write-ahead log (WAL) really helps here.
- Once a service says Yes to a transaction in the prepare phase, it needs to honour that vote whenever the coordinator sends a commit message for that transaction. This means that we do not want to use timers or leases to time-bound a service’s prepared state (or, if we do, the timer should have a long timeout, like 10 minutes). With this decision, we’re expecting the coordinator to rarely be down: it’s a critical service, so we need to ensure it is highly available.
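Both caveats can be illustrated together with a small Python sketch. It is only a sketch under stated assumptions: `CoordinatorLog`, `Participant`, `run` and `recover` are hypothetical names, the log is a plain JSON-lines file standing in for a real WAL, the coordinator logs the decision derived from the votes rather than each individual response, and the crash is simulated by returning early.

```python
import json
import os


class CoordinatorLog:
    """Hypothetical append-only log: one JSON record per line, synced to disk."""

    def __init__(self, path: str) -> None:
        self.path = path

    def append(self, txn: str, state: str) -> None:
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps({"txn": txn, "state": state}) + "\n")
            f.flush()
            os.fsync(f.fileno())  # make the record durable before moving on

    def replay(self) -> dict[str, str]:
        """Return the last logged state for every transaction id."""
        last: dict[str, str] = {}
        if os.path.exists(self.path):
            with open(self.path, encoding="utf-8") as f:
                for line in f:
                    record = json.loads(line)
                    last[record["txn"]] = record["state"]
        return last


class Participant:
    """Hypothetical participant that never times out a prepared transaction."""

    def __init__(self) -> None:
        self.state: dict[str, str] = {}

    def prepare(self, txn: str) -> bool:
        self.state[txn] = "PREPARED"  # the Yes vote is honoured until a decision arrives
        return True

    def finish(self, txn: str, decision: str) -> None:
        # Idempotent: a recovering coordinator may resend the same decision.
        self.state[txn] = decision


def run(txn: str, participants: list[Participant], log: CoordinatorLog,
        crash_after_decision: bool = False) -> None:
    log.append(txn, "PREPARING")
    votes = [p.prepare(txn) for p in participants]
    decision = "COMMITTED" if all(votes) else "ABORTED"
    log.append(txn, decision)  # persist the decision before sending phase 2
    if crash_after_decision:
        return  # simulated crash: no phase-2 message was sent
    for p in participants:
        p.finish(txn, decision)
    log.append(txn, "DONE")


def recover(participants: list[Participant], log: CoordinatorLog) -> None:
    """After a restart, finish every transaction that is not marked DONE."""
    for txn, state in log.replay().items():
        if state == "DONE":
            continue
        # No logged decision means no commit was ever sent, so aborting is safe.
        decision = "ABORTED" if state == "PREPARING" else state
        for p in participants:
            p.finish(txn, decision)
        log.append(txn, "DONE")
```

If `run("t1", participants, log, crash_after_decision=True)` is followed by `recover(participants, log)`, the participants move from PREPARED to COMMITTED, because the decision was already on disk when the simulated crash happened. The design choice to log before sending is what makes the resend safe, and the idempotent `finish` is what makes a duplicate resend harmless.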
This brings us to the end of this article. We talked about what the two-phase commit protocol does, its advantages and disadvantages, and some caveats to take care of while implementing it. You should also take a look at some alternatives, like the Transactional Outbox pattern. Please post any doubts you might have in the comments and I’ll be happy to discuss them!
Until next time, Keep asking questions & Keep learning!