Consistency & Consistency Levels(II) — Distributed Data Stores

Before going into this article, I’d advise you to go through the following articles to get the complete context.

  1. Consistency & Consistency Levels
  2. Consistency Models — Strong Consistency Models
  3. Consistency Models — Session & Eventual Consistency Models

Now that we’re on the same page: we’ve seen that an application’s consistency behaviour can be defined in terms of different consistency models.

A consistency model is a contract between a (distributed) system and the applications that run on it. The model is a set of guarantees made by the distributed system so that clients can expect predictable consistency behaviour across read & write operations.


From a distributed systems perspective, we can also define consistency levels, which in turn impact the consistency guarantees that our application can make.

The Consistency Level (CL) is the number of replica nodes that must acknowledge a read/write request for the entire request to be considered successful.


The Replication Factor (RF) is the number of nodes to which any write is replicated. So, in a 5-node cluster with RF set to 3, every write is always replicated to 3 of the 5 nodes.

Let’s take different consistency levels and see the consistency guarantees that our application can offer:

  1. Read CL = ONE & Write CL = ALL

With this setting, we read the data from any one replica, but a write request is considered successful only if it’s acknowledged by all replicas. Since every write is synced to all replicas, reading from any replica will always return the latest write.

So, this configuration allows us to have a Strongly Consistent (Linearizable) system.

Pros: Read throughput is very high, as a read hits only one replica. The system is strongly consistent.

Cons: Setting write CL to ALL increases write latency, impacting write performance & throughput, since every write must be synced to all replicas. Availability also suffers, as all replicas must be up for any write to succeed. The system cannot tolerate partitions.
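The behaviour above can be sketched with a toy in-memory cluster (illustrative only; the class and method names are made up, not a real client API):

```python
# Toy model of Write CL = ALL & Read CL = ONE: a write succeeds only if
# every replica accepts it, so any single replica can serve the latest value.

class Cluster:
    def __init__(self, replication_factor):
        # One dict per replica node, acting as that node's local store.
        self.replicas = [{} for _ in range(replication_factor)]

    def write_all(self, key, value):
        # Write CL = ALL: the write must land on every replica to succeed.
        for replica in self.replicas:
            replica[key] = value
        return True

    def read_one(self, key, node):
        # Read CL = ONE: read from any single replica.
        return self.replicas[node].get(key)

cluster = Cluster(replication_factor=3)
cluster.write_all("x", 1)
# Because the write reached all replicas, every node returns the latest value.
assert all(cluster.read_one("x", n) == 1 for n in range(3))
```

If any replica were unreachable, `write_all` would have to fail the whole request, which is exactly the availability cost described above.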

2. Read CL = ALL & Write CL = ONE

With this setting, the write is replicated to the number of nodes defined by RF, but it is considered successful as soon as one replica acknowledges it. A read request is considered successful only after fetching the data from all replicas and returning the latest record. Since reads always fetch from all replicas, and we know at least one replica has the latest write, we’ll always see the latest write.

So, this configuration allows us to have a Strongly Consistent (Linearizable) system.

Pros: Write throughput is very high, since a write succeeds as soon as one node acknowledges it. The system is strongly consistent.

Cons: Setting read CL to ALL increases read latency, impacting read performance & throughput. Availability also suffers, as all replicas must be up for any read to succeed. The system cannot tolerate partitions.

3. Read CL = ONE & Write CL = ONE

With this setting, the write is replicated to the number of nodes defined by RF, but it is considered successful as soon as one replica acknowledges it. A read request hits any one node and returns whatever data that node has. Since that node may or may not have received the write yet (replication is asynchronous, and the acknowledging node may even have gone down before replicating), we cannot guarantee that the latest write will be read immediately.

However, once the write has been replicated to all the replicas defined by RF, every read will see the latest write, so such a system is Eventually Consistent.

Pros: Write throughput is very high, since a write succeeds as soon as one node acknowledges it. Read throughput is also very high, as a read only needs to fetch data from one node.

Cons: Strong consistency is not possible with this configuration.
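A minimal sketch of the stale-read window this configuration creates, modelling each replica as a plain dict (illustrative assumption: replication to the other nodes happens asynchronously after the first ack):

```python
# Read CL = ONE & Write CL = ONE with replication lag (RF = 3).
replicas = [{"x": 1}, {"x": 1}, {"x": 1}]  # all nodes in sync initially

# Write x=2: node 0 acknowledges immediately; replication to the
# other nodes is asynchronous and has not happened yet.
replicas[0]["x"] = 2

# A read served by node 1 during the lag window returns the stale value.
stale = replicas[1]["x"]   # old value
fresh = replicas[0]["x"]   # latest value

# Once async replication catches up, all reads converge:
# this is eventual consistency.
for r in replicas[1:]:
    r["x"] = 2
assert all(r["x"] == 2 for r in replicas)
```

Which value a client sees depends entirely on which node its read happens to hit during the lag window.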

4. Read CL = QUORUM & Write CL = QUORUM

With this setting, we write the data to a majority of the nodes and read it from a majority of the nodes as well. Hence, at least 1 node in any read quorum is guaranteed to have the latest data.

So, this configuration looks like it will allow us to have a Strongly Consistent (Linearizable) system. (Not always the case! Read the Partial Writes section for more info.)

E.g., if we have a 9-node cluster, writes go to 5 nodes (so 4 nodes might not have the latest write). Since reads also require 5 replicas, even if the read quorum includes all 4 nodes that missed the latest write, at least 1 node in it will have the latest copy.

Pros: Both write & read throughput are good (not as bad as reading/writing to all nodes, though not as good as a single node). Quorum allows a balanced approach, where you don’t need to sacrifice either read or write performance alone.

Cons: Performance is not as good as reading from and writing to one node. :)

Partial Writes Scenario

We talked about quorum writes and reads & how they lead to a Strongly Consistent system. However, the caveat is that this holds only when your writes succeed. You might be wondering: if a write failed, why would the system no longer be strongly consistent?

Let’s take the happy path first. We write X=1 to a quorum of nodes in the cluster and then issue a read request from a quorum of nodes. The read returns the latest value of X.

Now, let’s think about a scenario where a write of X=2 is attempted on a quorum of nodes. It succeeds on Node 1 but fails on Node 2, so the write is marked as failed. However, the write applied on Node 1 isn’t rolled back (depending on our implementation).

[Illustration: a read from a quorum of Node 2 & Node 3 returns the old value (1); a read from a quorum of Node 1 & Node 3 returns the updated value (2).]

Now, if we issue a read request and it goes to a quorum of Node 2 & Node 3, we get back 1 (the value before the failed write), as shown in the illustration above.

If instead the read request goes to a quorum of Node 1 & Node 3, we get back 2 (the value written by the failed write).

This would also trigger a Read Repair (a topic for another article), and the value 2 would be synced to the other nodes.

As you can see, in the case of partial writes we can get different values for our reads, and in such a scenario our data store isn’t strongly consistent.

P.S.: The assumption is that the client retries any failed/timed-out operations; in such cases, our data store behaves like an eventually consistent data store.
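The partial-write scenario above can be reproduced in a few lines (illustrative assumptions: one dict per node, and "latest" resolved by taking the numeric maximum, whereas a real store would compare write timestamps):

```python
# Partial write with RF = 3, quorum = 2: a write of x=2 succeeds on
# Node 1 but fails on Node 2, so the write is reported as failed,
# yet Node 1's copy is never rolled back.

nodes = {1: {"x": 1}, 2: {"x": 1}, 3: {"x": 1}}

nodes[1]["x"] = 2   # applied on Node 1 before the quorum write failed

def read_quorum(members):
    # Return the most recent value seen across the quorum members.
    # (Here "most recent" is just max(); real stores use timestamps.)
    return max(nodes[n]["x"] for n in members)

# Different quorums now disagree, depending on whether Node 1 is included.
assert read_quorum([2, 3]) == 1   # old value: neither node saw the write
assert read_quorum([1, 3]) == 2   # updated value: Node 1 is in the quorum
```

The two assertions are exactly the two illustrated reads: the "failed" write is visible or invisible depending on which quorum the read lands on, which is why the guarantee breaks.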


This brings us to the end of the Consistency Series. I hope you learned something new from it. The data store referenced is Cassandra, and these are the concepts Cassandra uses; however, they can be applied to any distributed data store.

Thank you for reading! I’ll be posting weekly content on distributed systems & patterns, so please like, share and subscribe to this newsletter for notifications of new posts.

Please comment on the post with your feedback; it will help me improve! :)

Until next time, Keep asking questions & Keep learning!

Pratik Pandey

Senior Software Engineer at Booking.com | AWS Serverless Community Builder | pratikpandey.substack.com