Distributed Systems - Multi Leader Replication
In the leader-follower model, clients can write only through the leader, so if the leader is down for any reason, you can't write to the database.
To overcome this issue, the multi-leader model was created. It allows clients to send write requests to more than one leader, and is also called multi-leader replication. In this model, each leader simultaneously acts as a follower to the other leaders.
Use cases for multi leader replication
1. Multi-datacenter operation - In multi-leader replication, you can have a leader in each datacenter. Within each datacenter, regular leader-follower replication is used, and between datacenters, each datacenter’s leader replicates its changes to the leaders in the other datacenters.
Advantages
a. Performance
b. Tolerance of datacenter outages
c. Tolerance of network problems
Disadvantages - Even though it has these advantages, multi-leader replication also has big downsides:
a. The same data may be concurrently modified in two different datacenters, and those write conflicts must be resolved.
b. Autoincrementing keys, triggers, and integrity constraints can be problematic. For this reason, multi-leader replication is often considered dangerous territory that should be avoided if possible.
2. Clients with offline access - Another situation in which multi-leader replication is appropriate is if you have an application that needs to continue to work while it is disconnected from the internet. For example, a calendar app.
3. Collaborative editing - Real-time collaborative editing applications allow several people to edit a document simultaneously. For example, Etherpad and Google Docs allow multiple people to concurrently edit a text document or spreadsheet.
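To make the write conflict from the multi-datacenter case concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the `Leader` class, the `replicate` function, and the datacenter names are all hypothetical, and a conflict is detected simply by checking whether the value a change was based on still matches what the receiving leader holds.

```python
from dataclasses import dataclass, field


@dataclass
class Leader:
    """Hypothetical in-memory leader; not a real database API."""
    name: str
    data: dict = field(default_factory=dict)    # key -> current value
    outbox: list = field(default_factory=list)  # changes not yet replicated

    def write(self, key, old_value, new_value):
        # Accept the write locally right away, without contacting other leaders.
        self.data[key] = new_value
        self.outbox.append((key, old_value, new_value))

    def apply_remote(self, key, old_value, new_value):
        # A replicated change conflicts if the value it was based on
        # no longer matches what this leader currently holds.
        if self.data.get(key) != old_value:
            return False  # conflict detected, change not applied
        self.data[key] = new_value
        return True


def replicate(source, target):
    """Ship queued changes from source to target; return the conflicting ones."""
    conflicts = []
    for change in source.outbox:
        if not target.apply_remote(*change):
            conflicts.append(change)
    source.outbox.clear()
    return conflicts


us = Leader("us-east", {"title": "A"})
eu = Leader("eu-west", {"title": "A"})

# Both datacenters change the same key before replication happens.
us.write("title", "A", "B")
eu.write("title", "A", "C")

conflicts_at_eu = replicate(us, eu)
conflicts_at_us = replicate(eu, us)
```

Both writes succeed locally, and the conflict only shows up later, when the changes are exchanged. Until it is resolved, the two leaders hold different values for the same key.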
Handling conflicts
- Give each write a unique ID (e.g., a timestamp, a long random number, a UUID, or a hash of the key and value), pick the write with the highest ID as the winner, and throw away the other writes. If a timestamp is used, this is known as last write wins. Although this approach is popular, it is dangerously prone to data loss.
- Give each replica a unique ID, and let writes that originated at a higher-numbered replica always take precedence over writes that originated at a lower-numbered replica. This approach also implies data loss.
- Somehow merge the values together, e.g., order them alphabetically and then concatenate them.
- Record the conflict in an explicit data structure that preserves all information, and write application code that resolves the conflict at some later time (perhaps by prompting the user).
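The four approaches above can be sketched as small Python functions. This is a rough illustration, not a real database API: the `Write` record and the function names are hypothetical, the write ID is assumed to be a timestamp-like integer where the highest value wins, and "/" is an arbitrary separator chosen for the merge.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    """Hypothetical record of one conflicting write."""
    value: str
    write_id: int    # e.g., a timestamp; assumed unique per write
    replica_id: int  # assumed unique per replica


def highest_write_id_wins(writes):
    # Keep the write with the highest ID; all others are silently discarded.
    return max(writes, key=lambda w: w.write_id).value


def highest_replica_wins(writes):
    # Keep the write from the highest-numbered replica; others are discarded.
    return max(writes, key=lambda w: w.replica_id).value


def merge_values(writes):
    # Keep every value: order alphabetically, then concatenate.
    return "/".join(sorted(w.value for w in writes))


def record_conflict(writes):
    # Preserve all versions so application code can resolve them later.
    return {"conflict": True, "versions": sorted(w.value for w in writes)}


conflicting = [
    Write(value="C", write_id=1001, replica_id=2),
    Write(value="B", write_id=1002, replica_id=1),
]

print(highest_write_id_wins(conflicting))  # B (the write "C" is lost)
print(highest_replica_wins(conflicting))   # C (the write "B" is lost)
print(merge_values(conflicting))           # B/C
print(record_conflict(conflicting))
```

Note that the first two functions each return a single value and drop the other write, which is exactly the data loss mentioned above. The last two keep all the information, at the cost of extra work for the application.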
Custom conflict resolution logic
- On write - As soon as the database system detects a conflict in the log of replicated changes, it calls the conflict handler. This handler typically cannot prompt a user; it runs in a background process and must execute quickly.
- On read - When a conflict is detected, all the conflicting writes are stored. The next time the data is read, these multiple versions of the data are returned to the application. The application may prompt the user or automatically resolve the conflict, and write the result back to the database.
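The difference between the two hooks can be sketched with a toy key-value store in Python. Everything here is hypothetical (the `Store` class, its method names, and the handler signature), and for simplicity any replicated write whose value differs from what is already stored is assumed to be a conflict.

```python
class Store:
    """Hypothetical key-value store illustrating both resolution hooks."""

    def __init__(self, on_write_handler=None):
        self.on_write_handler = on_write_handler
        self.versions = {}  # key -> list of values; more than one means conflict

    def apply_replicated(self, key, value):
        existing = self.versions.get(key, [])
        candidates = existing + [value] if value not in existing else existing
        if len(candidates) > 1 and self.on_write_handler is not None:
            # On write: resolve immediately; no user interaction is possible here.
            candidates = [self.on_write_handler(key, candidates)]
        # Without a handler, all conflicting versions are kept for the reader.
        self.versions[key] = candidates

    def read(self, key):
        # On read: return every stored version to the application.
        return list(self.versions.get(key, []))

    def resolve(self, key, value):
        # The application writes the resolved value back.
        self.versions[key] = [value]


# On write: the handler runs as soon as the conflict is detected.
on_write = Store(on_write_handler=lambda key, values: max(values))
on_write.apply_replicated("title", "B")
on_write.apply_replicated("title", "C")
on_write_result = on_write.read("title")

# On read: both versions are stored and handed to the application later.
on_read = Store()
on_read.apply_replicated("title", "B")
on_read.apply_replicated("title", "C")
siblings = on_read.read("title")
on_read.resolve("title", " & ".join(siblings))
on_read_result = on_read.read("title")
```

In the on-write store the reader only ever sees one value, because the handler has already picked a winner. In the on-read store the application receives both versions and decides for itself, here by joining them, though it could just as well prompt the user.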
I hope you learnt something new from this article. Please like, share, and comment so that it can help others too.
Happy learning!