Exploring Data Replication in Depth

Exploring Data Replication in Depth

Replication means making and keeping copies of data on multiple computers connected by a network. There are several reasons to do this:

  • To keep data close to users in different locations (reducing delay)
  • To ensure the system works even if one computer fails
  • To handle a lot of read requests by adding more computers

Replication becomes complex when data changes frequently. There are different methods to update data across computers: single-leader, multi-leader, and leaderless replication.

We will discuss each method in detail, including their pros and cons.

Single Leader Replication

Each node with a copy of the database is called a replica. Every change to the database must be applied to all replicas; otherwise, some replicas will have stale data.

The most common solution is leader-based replication (also known as active/passive or master-slave replication).

Leader-Based (Master-Slave) Replication

Here's how it works:

  1. One replica is chosen as the leader (also called master or primary). When a client wants to write to the database, it sends the request to the leader, which writes the new data to its storage first.
  2. After the leader writes the new data, it sends the change to all its followers (also known as read replicas, slaves, secondaries, or hot standbys) using a replication log or change stream. Each follower reads the log from the leader and updates its database copy by applying the changes in the same order.
  3. When a user wants to read data, they can query either the leader or any follower. However, only the leader can accept and execute write requests.

Data Replication Leader and Followers

Synchronous vs Asynchronous Replication

Synchronous Replication:

  • The leader waits for all followers to confirm they've received and processed the changes before telling the user the write was successful.
  • Users only see changes after the leader gets confirmation from all followers.
  • Good: Follower copies are always up-to-date with the leader (Strong Consistency).
  • Bad: If a follower is down, the leader can't confirm changes and all writes are blocked.


Asynchronous Replication:

  • The leader tells the user the write was successful as soon as it saves the data locally.
  • It doesn't wait for followers to confirm they've received and processed the changes.
  • Good: The leader can keep accepting writes even if followers are behind or down.
  • Bad: If the leader fails, any uncopied changes are lost.

Hybrid Approach:

Some systems use a mix, called semi-synchronous replication:

  • One follower is updated synchronously, others asynchronously.
  • The leader tells the user the write was successful after the synchronous follower confirms, without waiting for asynchronous followers. (Eventual Consistency)

This approach balances the benefits and drawbacks of both methods.



Handling New Node Addition

When adding a new replica:

  1. The new node needs to sync with the leader before serving traffic.
  2. Simply copying data from the leader doesn't work if the leader is writing new data continuously.
  3. Locking the leader for copying prevents writes, which goes against high availability goals.

A better approach:

  1. Take a consistent snapshot of the leader's database at a specific time.
  2. Copy this snapshot to the new follower.
  3. The new follower then asks the leader for all changes since the snapshot time.
  4. Once the follower processes this backlog, it's caught up and ready to serve traffic.

Dealing with Node Outages

Nodes can fail due to hardware issues, network problems, or planned maintenance. The system should remain available despite node failures.

Follower Failure:

  • A follower can easily recover using its log.
  • It knows its last executed transaction and asks the leader for all changes during its downtime.

Leader Failure:

This requires "failover":

  1. Promote a follower to become the new leader.
  2. Re-Configure clients to send new writes to the new leader.
  3. Make other followers accept changes from the new leader.

Replication Log Implementation

  1. Statement-based replication:

- The leader records the exact SQL statements (INSERT, UPDATE, DELETE) that modify data.

- These statements are sent to followers, who execute them in the same order.

- Problems arise with functions like NOW(), RAND(), or AUTO_INCREMENT, as they may produce different results on each replica.

- Some statements might have side effects or depend on the database's current state, making them unreliable for replication.

2. Write Ahead Log (WAL):

- This is a low-level log of changes to database pages.

- It records exact byte changes in specific disk blocks.

- Highly efficient for the database engine but very version-specific.

- Makes it difficult to replicate between different database versions or systems.

- Replication process is closely tied to the internal workings of the storage engine.

3. Logical (row-based) log replication:

- Records changes at a higher level of abstraction.

- Logs might include: "Insert row in table X with these column values" or "Update row in table Y, set column A to value V where primary key is K".

- More flexible than WAL, allowing replication between different database versions or even different database systems.

- Easier for external applications to parse, useful for data warehousing or custom replication solutions.

Multi-Leader Replication

  • Allows writes to be accepted by multiple nodes, not just one leader.
  • Useful in scenarios like multi-datacenter operations, where each datacenter has its own leader.
  • Improves write performance and tolerance of datacenter outages.
  • Complicates data consistency as conflicts can occur.

Handling Write Conflicts:

Conflict example: User A changes a customer's email in datacenter 1, while User B changes the same customer's phone number in datacenter 2.

Conflict resolution strategies:

  1. Avoid conflicts: Route all writes for a particular record to a specific datacenter.
  2. Last Write Wins (LWW): Choose the write with the latest timestamp.
  3. Merge conflicting writes: Somehow combine the conflicting changes.
  4. Custom conflict resolution logic: Prompt user or use application-specific rules.

Single Leader vs Multi-Leader:

  • Single leader is simpler but has a single point of failure for writes.
  • Multi-leader can handle datacenter outages better but is more complex to manage.
  • Multi-leader can offer lower latency for writes in geographically distributed systems.

Leaderless Replication

  • Clients can send writes to multiple replicas, or to a coordinator that does this on their behalf.
  • No single node determines the order of writes.
  • Requires a quorum approach: a write must be acknowledged by a minimum number of nodes to be considered successful.
  • Read operations also query multiple nodes to ensure up-to-date data.
  • Systems must handle temporary inconsistencies and have mechanisms to bring out-of-date nodes back in sync.

Each replication strategy balances consistency, availability, and partition tolerance differently, as described by the CAP theorem. The best choice depends on your system's specific needs.

Handling Node Outages in Leaderless Systems

In leaderless systems, there's no failover process. For example:

  • Client 1 wants to update a last name
  • It sends the request to all three replicas
  • Replica 3 is down and fails to receive the write
  • If two acknowledgements are enough for success, the client ignores the failed write to Replica 3

To ensure data consistency and availability:

Quorum Writes and Reads:

  • 'w' replicas must confirm a write before telling the client it succeeded
  • 'r' replicas must respond to a read request
  • If w + r > n (total replicas), we expect up-to-date data on reads
  • Example: With n=3, w=2, r=2, the system can handle one node being down

This approach allows:

  • Processing writes if w < n and a node is down
  • Processing reads if r < n and a node is down

Version Vectors:

  • Used to track concurrent writes
  • Each replica has its own version number for each key
  • Replicas track version numbers from other replicas
  • Helps determine which values to overwrite
  • The collection of version numbers is called a version vector
  • Version vectors help distinguish between overwrites and concurrent writes

we've explored data replication, including different configurations, how they work, potential issues, and solutions. The choice of replication strategy should be based on your specific system requirements and trade-offs between consistency, availability, and partition tolerance.


要查看或添加评论,请登录

Bala Subramanian的更多文章

社区洞察

其他会员也浏览了