Distributed Systems - Replication

Distributed Systems - Replication

Replication means keeping a copy of the same data on multiple machines that are connected via a network. 

Reasons for doing replication 

  1. To keep data geographically close to your users for reducing latency 
  2. To allow the system to continue working even if some of its parts have failed for high availability 
  3. To scale out the number of machines that can serve read queries for increasing read throughput 

 How Leader and followers systems work ?  

  1. One of the node will be designated as leader and others as followers 
  2. For writing clients must send write requests to leader only 
  3. For reading clients can send read requests to either leader or followers 

Types of replication 

  1. Asynchronization - the leader does not wait until it gets confirmation from all the followers that it have received the write   
  2. Synchronization - the leader waits for confirmation from all the followers that it have received the write before reporting to users and other clients 

Replication high availability tip 

  The advantage of synchronous replication is that the follower is guaranteed to have data up to date with leader. If suddenly leader goes down, we can be sure that the data is still available on the follower. For that reason, it is impractical for all followers to be synchronous because any one node outage would cause the whole system to grind to a halt so you must enable synchronous replication on one of the followers and the others are asynchronous. This type of configuration is called semi-synchronous. 

Steps Setting of new followers 

  From time to time, you need to set up new followers—perhaps to increase the number of replicas, or to replace failed nodes. 

  1. Take a consistent snapshot of the leader’s database at some point in time if possible, without taking a lock on the entire database. 
  2. Copy the snapshot to the new follower node. 
  3. The follower connects to the leader and requests all the data changes that have happened since the snapshot was taken. 
  4. When the follower has processed the backlog of data changes since the snapshot, we say it has caught up. It can now continue to process data changes from the leader as they happen. 

Note : In some system like elastic search this process is completely automated 

Node outages 

   It can occur because of the following reasons and it can be both leader or follower. 

  1. Fault 
  2. Planned maintenance  

Catch up recovery for follower 

  1. On its local disk, each follower should keep a log of the data changes it has received from the leader. 
  2. If follower goes down, on restart it should read log and sync up with leader easily.   

Failover for leader  

   Leader failover is the real challenge and problem in production because of the following 

  1. System needs to elect next leader, which is quite expensive process 
  2. System can't able to serve for write requests until it elects next leader 

 Failover can be manual or automatic and has the following process 

  1. Determine that the leader has failed. 
  2.  Choose a next leader 
  3. Reconfiguring the system to use the new leader. 

Problems with leader failover 

  1. If asynchronous replication is used, the new leader may not have received all the writes from the old leader before it failed. 
  2. Discarding writes is especially dangerous if other storage systems outside of the database need to be coordinated with the database contents. 
  3. In certain fault scenarios, it could happen that two nodes both believe that they are the leader. This situation is called split brain, and it is dangerous because will result in data corrupt. 
  4. What is the right timeout before the leader is declared dead? A longer timeout means a longer time to recovery in the case where the leader fails 

Problems in replication 

  When insert or update action happens in master it will take some time for replicating the case across all node, typically it will be less than seconds but some time lags can occur for more than minutes because of downtime and network problems in networks and If any user tries to read data from any node that is not in sync with master, they won't be able to see that data is called replication lag.  

  1. Reading your own writes - A user makes a write, followed by a read from a stale replica. To prevent this anomaly, we need read-after-write consistency. 
No alt text provided for this image

Solution – Always read latest updated values from leader with centralized meta data 


2. Monotonic reads - A user first reads from a fresh replica, then from a stale replica. Time appears to go backward. To prevent this anomaly, we need monotonic reads. 

No alt text provided for this image


Solution - One way of achieving monotonic reads is to make sure that each user always makes their reads from the same replica (different users can read from different replicas). For example, the replica can be chosen based on a hash of the user ID, rather than randomly. However, if that replica fails, the user’s queries will need to be rerouted to another replica. 

3. Consistent Prefix reads - If some partitions are replicated slower than others, an observer may see the answer before they see the question. 

No alt text provided for this image

Solution - One solution is to make sure that any writes that are causally related to each other are written to the same partition—but in some applications that cannot be done efficiently. 

要查看或添加评论,请登录

Divagar Carlmarx的更多文章

  • Processing large amount of CSV data using JAVA

    Processing large amount of CSV data using JAVA

    Have you worked with large amount of csv DATA in GBs ?? And you have memory constraints ?? This might help for you…

    1 条评论
  • Fell in love with Scala

    Fell in love with Scala

    I was a hard core JAVA developer in both my professional and learning journey, but recently for a reason i have started…

  • Scala - Sealed Class Hierarchies

    Scala - Sealed Class Hierarchies

    In my previous article i had shared you regarding Option feature in Scala, in this article come lets discuss about…

  • Scala - NULL handling with MAP

    Scala - NULL handling with MAP

    Sharing three useful types that express a very useful concept i learned today, for NULL handling. Most languages have a…

  • WHY and HOW I started using IntelliJ IDE and SCALA

    WHY and HOW I started using IntelliJ IDE and SCALA

    I was using Eclipse IDE for java enterprise development from beginning of my career and learning journey. In my life…

  • Product based company team management strategies for productivity

    Product based company team management strategies for productivity

    I am sharing my knowledge i got in my professional and personal life as software developer for team management. Lets…

  • Big Data Volume

    Big Data Volume

    Big Data Volume Data volume is characterized by the amount of data that is generated continuously. Different data types…

    2 条评论
  • Distributed Systems - Multi Leader Replication

    Distributed Systems - Multi Leader Replication

    We know in Leader follower model, client can able to write only by leader this if leader is down for any reason, you…

  • Transaction Processing or Analytics ?

    Transaction Processing or Analytics ?

    Transaction processing systems In the early days of business data processing, a write to the database typically…

  • Designing key value database with btree

    Designing key value database with btree

    Introduced in 1970 and called “ubiquitous” less than 10 years later , B-trees have stood the test of time very well…

社区洞察

其他会员也浏览了