Distributed Systems - Multi Leader Replication

In the leader-follower model, clients can write only through the leader, so if the leader is down or unreachable for any reason, you cannot write to the database.

To overcome this issue, the multi-leader model allows clients to send write requests to more than one leader; this is called multi-leader replication. In this model, each leader simultaneously acts as a follower to the other leaders.
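
As a rough illustration of a leader that also acts as a follower, here is a minimal in-memory sketch. The `Leader` class, its method names, and the keys are hypothetical and exist only for this example; a real database replicates over a network and has to handle failures and ordering.

```python
class Leader:
    """Hypothetical node: accepts client writes and applies peers' writes."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.peers = []
        self.outbox = []  # local writes not yet sent to the other leaders

    def write(self, key, value):
        # Acting as a leader: accept the client's write locally.
        self.data[key] = value
        self.outbox.append((key, value))

    def apply_replicated(self, key, value):
        # Acting as a follower: apply a change another leader accepted.
        self.data[key] = value

    def replicate(self):
        # Ship pending changes to every peer (asynchronous in a real system).
        for key, value in self.outbox:
            for peer in self.peers:
                peer.apply_replicated(key, value)
        self.outbox.clear()


a, b = Leader("dc-a"), Leader("dc-b")
a.peers.append(b)
b.peers.append(a)

a.write("user:1", "Alice")  # handled by the first leader
b.write("user:2", "Bob")    # handled by the second leader, independently
a.replicate()
b.replicate()
```

Because the two writes touch different keys, both leaders end up with the same two records after replication. The harder case, where both leaders modify the same key, is the write conflict discussed later in this article.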

Use cases for multi leader replication 

  1. Multi-datacenter operation - With multi-leader replication, you can have a leader in each datacenter. Within each datacenter, regular leader-follower replication is used; between datacenters, each datacenter's leader replicates its changes to the leaders in the other datacenters.

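Advantages

a. Performance - every write is processed in the local datacenter and replicated asynchronously to the others, so clients do not wait on inter-datacenter network latency.

b. Tolerance of datacenter outages - each datacenter can continue operating independently if another one fails, and replication catches up when the failed datacenter comes back online.

c. Tolerance of network problems - because replication between datacenters is asynchronous, a temporary network interruption does not prevent writes from being processed.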

Disadvantages - Despite these advantages, multi-leader replication has significant downsides:

a. The same data may be concurrently modified in two different datacenters, and those write conflicts must be resolved. 

b. Autoincrementing keys, triggers, and integrity constraints can be problematic. For this reason, multi-leader replication is often considered dangerous territory that should be avoided if possible.

2. Clients with offline access - Multi-leader replication is also appropriate when an application needs to keep working while it is disconnected from the internet. For example, in a calendar app each device's local database acts as a leader that accepts writes while offline and syncs with the other devices once it is back online.

3. Collaborative editing - Real-time collaborative editing applications allow several people to edit a document simultaneously. For example, Etherpad and Google Docs allow multiple people to concurrently edit a text document or spreadsheet.
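
All three use cases share the downside mentioned above: the same data can be modified concurrently on two leaders. The sketch below shows what that can look like. It assumes a deliberately naive replication scheme in which the receiving leader blindly overwrites its own value; the `Leader` class and the `title` key are hypothetical.

```python
class Leader:
    """Hypothetical leader with naive, overwrite-only replication."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.pending = []  # local writes not yet replicated

    def write(self, key, value):
        self.data[key] = value
        self.pending.append((key, value))

    def send_to(self, other):
        # No conflict detection: the receiver simply takes the sender's value.
        for key, value in self.pending:
            other.data[key] = value
        self.pending = []


dc1, dc2 = Leader("dc1"), Leader("dc2")

# Two users change the same field at about the same time, each talking
# to the leader in their own datacenter. Both writes succeed locally.
dc1.write("title", "B")
dc2.write("title", "C")

# Replication happens afterwards, in both directions.
dc1.send_to(dc2)
dc2.send_to(dc1)

print(dc1.data["title"], dc2.data["title"])  # prints "C B"
```

The two leaders have swapped values and now disagree permanently, which is why a multi-leader system needs an explicit rule for resolving conflicts.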

Handling conflicts 

All replicas must end up with the same final value, so conflicts have to be resolved in a convergent way. Common approaches:

  1. Give each write a unique ID (e.g., a timestamp, a long random number, a UUID, or a hash of the key and value), pick the write with the highest ID as the winner, and discard the other writes. When a timestamp is used, this is known as last write wins (LWW). Although this approach is popular, it is dangerously prone to data loss.
  2. Give each replica a unique ID, and let writes that originated at a higher-numbered replica always take precedence over writes that originated at a lower-numbered replica. This approach also implies data loss.
  3. Somehow merge the values together, e.g., order them alphabetically and then concatenate them.
  4. Record the conflict in an explicit data structure that preserves all information, and write application code that resolves the conflict at some later time (perhaps by prompting the user).
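
The four approaches above can be sketched as small functions over a list of conflicting writes. This is only an illustration: the `Write` record, its fields, and the function names are assumptions made for this example and do not come from any particular database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    value: str
    timestamp: int   # assumed unique ID for the write
    replica_id: int  # assumed unique ID of the originating replica


def last_write_wins(writes):
    # Approach 1: highest write ID wins, every other write is discarded.
    return max(writes, key=lambda w: w.timestamp).value


def highest_replica_wins(writes):
    # Approach 2: the write from the highest-numbered replica wins.
    return max(writes, key=lambda w: w.replica_id).value


def merge_values(writes):
    # Approach 3: order the values alphabetically and concatenate them.
    return "/".join(sorted(w.value for w in writes))


def keep_siblings(writes):
    # Approach 4: keep every conflicting write for the application to resolve.
    return list(writes)


conflict = [
    Write("B", timestamp=105, replica_id=1),
    Write("C", timestamp=103, replica_id=2),
]

print(last_write_wins(conflict))       # prints "B" (value "C" is lost)
print(highest_replica_wins(conflict))  # prints "C" (value "B" is lost)
print(merge_values(conflict))          # prints "B/C"
print(len(keep_siblings(conflict)))    # prints 2
```

Note that the first two functions silently drop one of the values, which is the data loss mentioned in the list, while the last two keep all the information at the cost of extra work for the application.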

Custom conflict resolution logic 

  1. On write - As soon as the database system detects a conflict in the log of replicated changes, it calls the conflict handler. This handler typically cannot prompt a user: it runs in a background process and must execute quickly.
  2. On read - When a conflict is detected, all the conflicting writes are stored. The next time the data is read, these multiple versions of the data are returned to the application. The application may prompt the user or automatically resolve the conflict, and write the result back to the database.
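
The difference between the two styles can be sketched with two toy stores. Everything here is hypothetical: the class names, the handler signature, and the simplification that any differing value for the same key counts as a conflict (real systems rely on version metadata to decide whether two writes were concurrent).

```python
class OnWriteStore:
    """Calls a conflict handler as soon as a conflicting write arrives."""

    def __init__(self, handler):
        self.handler = handler  # must be fast and cannot prompt a user
        self.data = {}

    def apply_replicated(self, key, value):
        if key in self.data and self.data[key] != value:
            self.data[key] = self.handler(key, self.data[key], value)
        else:
            self.data[key] = value

    def read(self, key):
        return self.data[key]


class OnReadStore:
    """Stores all conflicting versions and returns them on read."""

    def __init__(self):
        self.versions = {}

    def apply_replicated(self, key, value):
        versions = self.versions.setdefault(key, [])
        if value not in versions:
            versions.append(value)

    def read(self, key):
        return list(self.versions[key])

    def resolve(self, key, value):
        # The application writes the resolved value back.
        self.versions[key] = [value]


def merge_handler(key, local, remote):
    # Example handler: deterministic merge of the two values.
    return "/".join(sorted([local, remote]))


on_write = OnWriteStore(merge_handler)
on_write.apply_replicated("title", "B")
on_write.apply_replicated("title", "C")
print(on_write.read("title"))  # prints "B/C"

on_read = OnReadStore()
on_read.apply_replicated("title", "B")
on_read.apply_replicated("title", "C")
siblings = on_read.read("title")         # both versions: ["B", "C"]
on_read.resolve("title", max(siblings))  # the application picks a winner
print(on_read.read("title"))             # prints "['C']"
```

In the on-read variant, the call to `max` stands in for whatever the application decides to do, such as showing both versions to the user and asking which one to keep.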

I hope you learned something new from this article. Please like, share, and comment; it may help others too.

Happy learning!
