Database Replication: A Deeper Dive

In the previous episode, we explored replication as a horizontal scaling method, focusing on its application at the web application layer. We discussed how replication can be categorized as stateful or stateless and examined strategies like caching, sticky sessions, and session clustering to optimize performance and reduce latency.

Building on that foundation, today's episode delves into database replication, another essential facet of horizontal scaling. By replicating the database layer, we can achieve higher availability, better read scalability, and improved performance for distributed systems. Let's explore the two main approaches to database replication: Master-Slave (Primary-Secondary) and Master-Master (Peer-to-Peer) replication.


Database Replication

Database replication is the process of creating multiple copies of a database to improve performance, scalability, and availability.

When implementing database replication, two primary architectures are commonly used:


1. Master-Slave (Primary-Secondary) Replication

This model involves multiple replicas of the database:

  • A master (primary) handles both read and write operations.
  • Secondary replicas are provisioned for read operations only.
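The read/write split described above can be sketched with a toy in-memory model. This is only an illustration, not any particular database's API: the class name `PrimarySecondaryRouter`, its methods, and the round-robin choice of secondary are all assumptions made for the example.

```python
import itertools
from typing import Optional


class PrimarySecondaryRouter:
    """Toy in-memory model: one primary plus read-only secondaries."""

    def __init__(self, secondary_count: int) -> None:
        self.primary = {}
        self.secondaries = [{} for _ in range(secondary_count)]
        # Rotate through the secondaries to spread read load evenly.
        self._turn = itertools.cycle(range(secondary_count))

    def write(self, key: str, value: str) -> None:
        # Writes are only ever accepted by the primary.
        self.primary[key] = value

    def replicate(self) -> None:
        # Stand-in for the replication step: copy the primary's state out.
        for replica in self.secondaries:
            replica.clear()
            replica.update(self.primary)

    def read(self, key: str, from_primary: bool = False) -> Optional[str]:
        # The primary can serve reads too; by default reads go to a secondary.
        if from_primary:
            return self.primary.get(key)
        return self.secondaries[next(self._turn)].get(key)


db = PrimarySecondaryRouter(secondary_count=2)
db.write("user:1", "Ada")
before = db.read("user:1")  # secondaries have not been updated yet
db.replicate()
after = db.read("user:1")   # now visible on the secondaries
```

Here `before` is `None` and `after` is `"Ada"`: a read sent to a secondary only sees a write once replication has run, which is why the update method matters.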

To keep these replicas synchronized, two update methods are used:


a. Asynchronous Replication

Updates to the secondary replicas occur after the master completes its write operation.

When a write operation is sent to the database, the primary replica completes the update first. It then propagates the change to the secondary replicas in the background. MongoDB replica sets, including Atlas clusters, replicate from the primary to the secondaries asynchronously.

  • Pros:

Low latency for write operations on the master.

  • Cons:

Data loss risk if the master fails before updates propagate to secondaries.

Eventual consistency: secondaries may serve stale reads until the change reaches them.
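Both drawbacks can be seen in a small simulation. This is a minimal sketch under stated assumptions: `AsyncPrimary`, `ship_next`, and the pending queue are hypothetical names for the example, and the queue stands in for whatever replication log a real database uses.

```python
from collections import deque


class AsyncPrimary:
    """Toy primary that acknowledges writes before replicating them."""

    def __init__(self) -> None:
        self.data = {}
        self.pending = deque()  # changes not yet shipped to the secondary

    def write(self, key, value) -> str:
        self.data[key] = value
        self.pending.append((key, value))
        return "ack"  # the client is answered before replication happens


def ship_next(primary: AsyncPrimary, secondary_data: dict) -> bool:
    """Apply the oldest pending change to the secondary, if there is one."""
    if not primary.pending:
        return False
    key, value = primary.pending.popleft()
    secondary_data[key] = value
    return True


primary = AsyncPrimary()
secondary = {}
primary.write("cart:1", "book")
primary.write("cart:1", "book, pen")
ship_next(primary, secondary)  # only the first change has arrived

stale_read = secondary["cart:1"]                 # still the older value
lost_if_primary_fails_now = list(primary.pending)  # acknowledged, not replicated
```

At this point the secondary still returns `"book"`, and the second write exists only on the primary. If the primary failed now and the secondary were promoted, that acknowledged write would be gone.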


Figure: Master-Slave replication, showing the operation on the master and the delayed update to secondary replicas.

b. Synchronous Replication

The write is applied to the master and the secondary replicas as part of the same operation, and it is acknowledged only after the replicas confirm it.

  • Pros:

Always consistent data between the master and secondaries.

  • Cons:

High latency for write operations due to synchronization overhead.

Risk of blocked or failed writes if a replica becomes unavailable during a write operation, because the master has to wait for it.

Replicas must sit close to one another, since every write waits on the slowest replica.
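The trade-offs above can be sketched in a few lines. This is a simplified model, not a real protocol: the names `SyncPrimary`, `SyncReplica`, and `sync_write_latency_ms` are assumptions for the example, the write is rejected outright when a replica is down (a real system might block or time out instead), and the latency figures used with it are made up.

```python
class ReplicaUnavailable(Exception):
    """Raised when a replica cannot confirm a write."""


class SyncReplica:
    def __init__(self, name: str, available: bool = True) -> None:
        self.name = name
        self.available = available
        self.data = {}


class SyncPrimary:
    def __init__(self, replicas) -> None:
        self.data = {}
        self.replicas = list(replicas)

    def write(self, key, value) -> str:
        # Refuse the write up front if any replica is down, so no copy
        # ends up ahead of the others in this toy model.
        for replica in self.replicas:
            if not replica.available:
                raise ReplicaUnavailable(replica.name)
        for replica in self.replicas:
            replica.data[key] = value
        self.data[key] = value
        return "ack"  # returned only after every copy holds the value


def sync_write_latency_ms(local_ms: float, replica_round_trips_ms) -> float:
    # Assuming the replicas are contacted in parallel, the slowest one
    # sets the pace for the whole write.
    return local_ms + max(replica_round_trips_ms, default=0.0)


near = SyncReplica("same-region")
far = SyncReplica("other-continent")
primary = SyncPrimary([near, far])
primary.write("order:7", "paid")

far.available = False
try:
    primary.write("order:8", "paid")
    outcome = "ack"
except ReplicaUnavailable as error:
    outcome = f"rejected: {error} is unavailable"
```

With a hypothetical 2 ms local write and round trips of 1 ms and 150 ms, `sync_write_latency_ms(2.0, [1.0, 150.0])` comes to 152 ms, which is why a single distant replica dominates write latency.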


Figure: Asynchronous replication vs. synchronous replication.

Advantages of Master-Slave Replication:

  • High read scalability.
  • High read availability.
  • No write conflicts.


2. Master-Master (Peer-to-Peer) Replication

Master-Master replication reduces write latency by allowing all replicas to handle both read and write operations. Each replica synchronizes with others, ensuring bi-directional data replication.

Use Case:

  • Ideal for geographically distributed systems where proximity to users is essential for low write latency.

Pros:

  • High read scalability.
  • High read and write availability.

Cons:

  • Risk of write conflicts if simultaneous updates occur on different replicas.
  • Transaction ordering issues because the clocks on different nodes are never perfectly in sync (clock skew).
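To see how these two problems interact, here is a minimal sketch that resolves a conflict with last-write-wins, one common strategy that compares timestamps. The article does not prescribe a resolution method, so treat this choice, along with the names `VersionedWrite`, `last_write_wins`, and `local_clock` and the skew values, as assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VersionedWrite:
    value: str
    timestamp: float  # wall-clock reading on the node that accepted the write
    node: str


def last_write_wins(a: VersionedWrite, b: VersionedWrite) -> VersionedWrite:
    # Highest timestamp wins; the node name breaks exact ties deterministically.
    return max(a, b, key=lambda w: (w.timestamp, w.node))


def local_clock(true_time: float, skew_seconds: float) -> float:
    # A node's clock reading: the true time plus that node's skew.
    return true_time + skew_seconds


# Two masters accept writes to the same key. The write on node-b really
# happens 2 seconds later, but node-b's clock runs 5 seconds slow.
first = VersionedWrite("address: Lagos", local_clock(100.0, 0.0), "node-a")
second = VersionedWrite("address: London", local_clock(102.0, -5.0), "node-b")

winner = last_write_wins(first, second)
```

Because node-b stamps its write at 97.0 instead of 102.0, `winner` is the older write from node-a and the newer update is silently discarded. Approaches such as logical clocks or version vectors exist to avoid relying on wall-clock time for ordering.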


Figure: Master-Master replication, shown as a world map with multiple database nodes replicating bi-directionally.

Choosing the Right Replication Strategy

Your choice between Master-Slave and Master-Master replication depends on:

  • Workload: Read-heavy workloads may benefit more from Master-Slave setups.
  • Latency requirements: Geographically distributed systems favor Master-Master replication.
  • Consistency needs: Synchronous replication keeps replicas consistent at the cost of slower writes, while asynchronous replication prioritizes speed and accepts temporarily stale replicas.


Figure: Master-Slave vs. Master-Master replication.

This episode delves into the core concepts of database replication and its practical applications. In upcoming episodes, we'll explore asynchronous processing in depth, including its use cases in e-commerce applications.

Stay tuned!

Promise Uchegbunam

Software Engineer | Cyber Security | Building & Breaking Systems | Go & TypeScript

2 months ago

Great Article Joel Ndoh. Might wanna consider using “Primary/Secondary” as opposed to “Master/Slave” going forward. The “Master/Slave” connotation is continuously being phased out in modern DB architecture.
