Distributed Systems - Replication
Replication means keeping a copy of the same data on multiple machines that are connected via a network.
Reasons for doing replication
- To keep data geographically close to your users, reducing latency
- To allow the system to continue working even if some of its parts have failed, providing high availability
- To scale out the number of machines that can serve read queries, increasing read throughput
How leader-follower systems work
- One of the nodes is designated as the leader and the others as followers
- Clients must send write requests to the leader only
- Clients can send read requests to either the leader or the followers
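As a rough illustration, here is a minimal Python sketch of this routing rule; the node addresses and the random read policy are made up for the example:

```python
# Minimal sketch of leader-based routing: writes go to the leader,
# reads can be served by any replica. Addresses are hypothetical.
import random

LEADER = "db-leader:5432"
FOLLOWERS = ["db-follower-1:5432", "db-follower-2:5432"]

def route_write(statement: str) -> str:
    """All writes go to the leader."""
    return LEADER

def route_read(query: str) -> str:
    """Reads may be served by the leader or any follower."""
    return random.choice([LEADER] + FOLLOWERS)

print(route_write("INSERT INTO users ..."))   # always the leader
print(route_read("SELECT * FROM users ..."))  # any replica
```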
Types of replication
- Asynchronous - the leader sends the write to its followers but does not wait for confirmation that they have received it before reporting success to the client
- Synchronous - the leader waits for confirmation from the followers that they have received the write before reporting success to users and other clients
Replication high availability tip
The advantage of synchronous replication is that the follower is guaranteed to have an up-to-date copy of the data that is consistent with the leader. If the leader suddenly fails, we can be sure that the data is still available on the follower. However, it is impractical for all followers to be synchronous: any one node outage would cause the whole system to grind to a halt. In practice, you make one of the followers synchronous and the others asynchronous. This configuration is called semi-synchronous.
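The sketch below contrasts the three modes with a toy in-process model; the Follower class, the delays, and the thread pool are illustrative stand-ins for real replicas and network calls, not any database's actual API:

```python
# Toy model of synchronous, asynchronous, and semi-synchronous writes.
import time
from concurrent.futures import ThreadPoolExecutor, FIRST_COMPLETED, wait

class Follower:
    def __init__(self, name, delay):
        self.name, self.delay = name, delay
    def apply(self, record):
        time.sleep(self.delay)              # simulate network + apply time
        return f"{self.name} acked {record}"

followers = [Follower("f1", 0.05), Follower("f2", 0.2)]
pool = ThreadPoolExecutor(max_workers=4)

def write_sync(record):
    # Fully synchronous: wait for every follower before confirming.
    futures = [pool.submit(f.apply, record) for f in followers]
    for fut in futures:
        fut.result()
    return "committed (all followers have the write)"

def write_async(record):
    # Asynchronous: ship the write and confirm immediately.
    for f in followers:
        pool.submit(f.apply, record)
    return "committed (followers may still be catching up)"

def write_semi_sync(record):
    # Semi-synchronous: wait for at least one follower, let the rest lag.
    futures = [pool.submit(f.apply, record) for f in followers]
    wait(futures, return_when=FIRST_COMPLETED)
    return "committed (at least one follower has the write)"

print(write_sync("x=1"))
print(write_async("x=2"))
print(write_semi_sync("x=3"))
```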
Steps for setting up new followers
From time to time, you need to set up new followers—perhaps to increase the number of replicas, or to replace failed nodes.
- Take a consistent snapshot of the leader's database at some point in time, if possible without taking a lock on the entire database.
- Copy the snapshot to the new follower node.
- The follower connects to the leader and requests all the data changes that have happened since the snapshot was taken.
- When the follower has processed the backlog of data changes since the snapshot, we say it has caught up. It can now continue to process data changes from the leader as they happen.
Note: In some systems, such as Elasticsearch, this process is fully automated.
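Here is a rough Python sketch of the snapshot-and-catch-up idea using an in-memory log; real systems track a position in the leader's replication log (for example, a log sequence number), and all names here are hypothetical:

```python
# In-memory sketch: the leader keeps an ordered change log, a snapshot
# records the log offset it covers, and the new follower replays every
# change made after that offset.
leader_log = []                        # list of (offset, change) pairs

def leader_write(change):
    leader_log.append((len(leader_log), change))

def take_snapshot():
    """Consistent snapshot: a copy of the data plus the offset it covers."""
    return {"data": [c for _, c in leader_log], "offset": len(leader_log)}

def bootstrap_follower(snapshot):
    follower = {"data": list(snapshot["data"]), "offset": snapshot["offset"]}
    # Catch-up: request every change made since the snapshot was taken.
    for offset, change in leader_log[follower["offset"]:]:
        follower["data"].append(change)
        follower["offset"] = offset + 1
    return follower

leader_write("a"); leader_write("b")
snap = take_snapshot()
leader_write("c")                      # write that happens during the copy
print(bootstrap_follower(snap))        # the follower ends up with a, b and c
```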
Node outages
A node outage can occur for the following reasons, and it can affect either the leader or a follower.
- Fault
- Planned maintenance
Catch-up recovery for a follower
- On its local disk, each follower should keep a log of the data changes it has received from the leader.
- If a follower goes down, on restart it can read its log, request from the leader all the changes that occurred while it was disconnected, and catch up.
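A minimal sketch of that idea, assuming a hypothetical state file and a placeholder fetch_changes_since() helper standing in for the request to the leader:

```python
# Sketch of follower catch-up recovery: persist the last applied log
# position, and on restart ask the leader for everything after it.
import json, os

STATE_FILE = "follower_state.json"     # hypothetical local state file

def load_position() -> int:
    if not os.path.exists(STATE_FILE):
        return 0
    with open(STATE_FILE) as f:
        return json.load(f)["last_applied"]

def save_position(pos: int) -> None:
    with open(STATE_FILE, "w") as f:
        json.dump({"last_applied": pos}, f)

def fetch_changes_since(pos: int):
    # Placeholder for "ask the leader for every change after position pos".
    return [(pos + i, f"change-{pos + i}") for i in range(3)]

def recover() -> int:
    pos = load_position()
    for new_pos, change in fetch_changes_since(pos):
        # applying the change to local storage would happen here
        pos = new_pos + 1
    save_position(pos)
    return pos

print("caught up to position", recover())
```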
Failover for the leader
Leader failover is the harder problem in production, for the following reasons:
- The system needs to elect a new leader, which is an expensive process
- The system cannot serve write requests until a new leader has been elected
Failover can be manual or automatic and follows these steps:
- Determine that the leader has failed.
- Choose a new leader.
- Reconfigure the system to use the new leader.
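Below is a simplified sketch of those steps under assumed data structures; the heartbeat timestamps, log positions, and the 30-second timeout are illustrative, and real systems use a consensus protocol rather than this naive election:

```python
# Naive failover sketch: detect the dead leader via a heartbeat timeout,
# promote the most up-to-date live replica, and repoint the cluster at it.
import time

TIMEOUT_S = 30
nodes = {
    "node-a": {"last_heartbeat": time.time() - 60, "log_position": 120, "role": "leader"},
    "node-b": {"last_heartbeat": time.time(),      "log_position": 118, "role": "follower"},
    "node-c": {"last_heartbeat": time.time(),      "log_position": 115, "role": "follower"},
}

def leader_is_dead() -> bool:
    leader = next(n for n in nodes.values() if n["role"] == "leader")
    return time.time() - leader["last_heartbeat"] > TIMEOUT_S

def failover():
    if not leader_is_dead():
        return None
    # Choose the live follower with the most complete log as the new leader,
    # which minimises the number of writes that have to be discarded.
    alive = {name: n for name, n in nodes.items()
             if time.time() - n["last_heartbeat"] <= TIMEOUT_S}
    new_leader = max(alive, key=lambda name: alive[name]["log_position"])
    for name, node in nodes.items():
        node["role"] = "leader" if name == new_leader else "follower"
    return new_leader

print("new leader:", failover())   # node-b: highest log position among live nodes
```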
Problems with leader failover
- If asynchronous replication is used, the new leader may not have received all the writes from the old leader before it failed; those unreplicated writes are usually discarded, which violates clients' durability expectations.
- Discarding writes is especially dangerous if other storage systems outside of the database need to be coordinated with the database contents.
- In certain fault scenarios, two nodes may both believe that they are the leader. This situation is called split brain, and it is dangerous: if both accept writes and there is no process for resolving conflicts, data is likely to be lost or corrupted.
- What is the right timeout before the leader is declared dead? A longer timeout means a longer time to recovery when the leader fails, but too short a timeout risks unnecessary failovers (for example, during a temporary load spike or network glitch).
Problems in replication
When an insert or update happens on the leader, it takes some time for the change to replicate to all nodes. Typically this lag is well under a second, but it can grow to minutes when there is node downtime or network trouble. If a user reads from a replica that has not yet caught up with the leader, they will not see the latest data. This delay is called replication lag, and it leads to the following anomalies.
1. Reading your own writes - A user makes a write, followed by a read from a stale replica. To prevent this anomaly, we need read-after-write consistency.
Solution - Read data that the user may have modified from the leader, for example by recording the time of each user's last write in centralized metadata and routing that user's reads to the leader until the replicas have caught up.
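A small sketch of that routing rule, assuming a hypothetical centralized map of last-write times and an illustrative one-minute replication window:

```python
# Read-after-write routing: send a user's reads to the leader while their
# recent write may still be missing on the followers.
import time

last_write_at = {}            # user_id -> timestamp of the user's last write
REPLICATION_WINDOW_S = 60     # assume replicas catch up within a minute

def record_write(user_id: str) -> None:
    last_write_at[user_id] = time.time()

def choose_replica(user_id: str) -> str:
    wrote_recently = time.time() - last_write_at.get(user_id, 0) < REPLICATION_WINDOW_S
    return "leader" if wrote_recently else "follower"

record_write("alice")
print(choose_replica("alice"))   # leader: her write may not be on followers yet
print(choose_replica("bob"))     # a follower is fine
```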
2. Monotonic reads - A user first reads from a fresh replica, then from a stale replica. Time appears to go backward. To prevent this anomaly, we need monotonic reads.
Solution - One way of achieving monotonic reads is to make sure that each user always makes their reads from the same replica (different users can read from different replicas). For example, the replica can be chosen based on a hash of the user ID, rather than randomly. However, if that replica fails, the user’s queries will need to be rerouted to another replica.
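A minimal sketch of pinning each user to a replica by hashing the user ID; the replica names are illustrative:

```python
# Monotonic reads: the same user always reads from the same replica,
# so their reads never appear to move backward in time.
import hashlib

REPLICAS = ["replica-1", "replica-2", "replica-3"]

def replica_for(user_id: str) -> str:
    digest = int(hashlib.sha256(user_id.encode()).hexdigest(), 16)
    return REPLICAS[digest % len(REPLICAS)]

print(replica_for("alice"), replica_for("alice"))  # same replica twice
print(replica_for("bob"))                          # possibly a different one
```

If the chosen replica fails, the user's queries would need to be rerouted to another replica, as noted above.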
3. Consistent Prefix reads - If some partitions are replicated slower than others, an observer may see the answer before they see the question.
Solution - One solution is to make sure that any writes that are causally related to each other are written to the same partition—but in some applications that cannot be done efficiently.
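A small sketch of that idea, partitioning messages by a conversation ID so that causally related writes stay together; the partition count and field names are illustrative:

```python
# Consistent prefix reads: route a question and its answer to the same
# partition by hashing a shared conversation ID, so they replicate in order.
import hashlib

NUM_PARTITIONS = 4

def partition_for(conversation_id: str) -> int:
    digest = int(hashlib.sha256(conversation_id.encode()).hexdigest(), 16)
    return digest % NUM_PARTITIONS

question = {"conversation_id": "chat-42", "text": "How far into the future can you see?"}
answer   = {"conversation_id": "chat-42", "text": "About ten seconds."}

# Both land in the same partition, so no reader can see the answer
# before the question.
print(partition_for(question["conversation_id"]) ==
      partition_for(answer["conversation_id"]))   # True
```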