Version Vector (III)

In my last article, we looked at one of the techniques for identifying concurrent updates/conflicts by using the Server (Replica) as an Actor, along with the advantages and disadvantages of doing so. If we include siblings in our Version Vectors to resolve conflicts, we still don’t have a way to track causality in the merged state. If we don’t include siblings in our Version Vectors, we risk losing updates. In this article, we’ll look at another approach for identifying concurrent updates/conflicts.

Dotted Version Vectors

Dotted Version Vectors solve the problems we saw with using the ServerId as an Actor. Let’s first recap that problem -

[Diagram: Clients C2 and C3 issuing concurrent PUTs for key K to Replica A, which uses plain Version Vectors]

Let’s try to understand what’s happening in the above diagram -

  1. Let’s assume we have a key K with value U, and an empty version vector to begin with. Clients C2 and C3 sync the same state from the Replica (assuming all clients are interacting with the same replica), which implements Version Vectors.
  2. C2 updates the value to W & sends a PUT command with the local state of the Version Vector it has (empty VV).
  3. C3 updates the value to V & sends a PUT command with the local state of the Version Vector it has (empty VV).
  4. Replica A receives the request from C3 first (C2’s request might be delayed because of network latency). Replica A compares the Version Vector it received with its local state & sees that they match. So it increments the counter to 1 & updates the value to V. It also sends the new state back to C3.
  5. The request from C2 finally arrives at Replica A. Replica A compares the Version Vector it received with its local state & sees that they do not match. So it increments its counter to 2 and adds the value W from C2 as a sibling to V. It also sends the new state back to C2.
  6. C3 updates the value to Z & sends a PUT command with the local state of the Version Vector it has, which is (A, 1).
  7. Replica A receives the request from C3 and compares the Version Vector it received with its local state & sees that they do not match. So it increments its counter to 3 and adds the value Z from C3 as a sibling to V, W.

Problem with Step 7 -

If you notice, the update from V to Z came after C3 saw the state of V and decided to update it to Z. However, the server/replica treats the update as concurrent rather than sequential, and stores the value Z as a sibling. Hence we end up losing causality information, and that’s because at K : {(A, 2)} : [W, V] the server does not have the context of which version vector was associated with the write of V, i.e. (A, 1).
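The steps above can be sketched in a few lines of Python. This is a hypothetical model (not any specific database’s API) of a replica that keeps a single version vector per key; the names `PlainVVReplica` and `put` are my own. It shows why step 7 goes wrong: since siblings carry no per-value causal context, the causally newer write of Z is misclassified as concurrent.

```python
# Minimal sketch of a replica using plain Version Vectors, as in the
# diagram above. All names here are illustrative assumptions.

class PlainVVReplica:
    def __init__(self, replica_id):
        self.id = replica_id
        self.vv = {}        # key -> {replica_id: counter}
        self.values = {}    # key -> list of sibling values

    def put(self, key, value, client_vv):
        local = self.vv.get(key, {})
        if client_vv == local:
            # Client has seen everything we have: safe to overwrite.
            self.values[key] = [value]
        else:
            # VVs differ: treated as concurrent, value kept as sibling.
            self.values[key] = self.values.get(key, []) + [value]
        local = dict(local)
        local[self.id] = local.get(self.id, 0) + 1
        self.vv[key] = local
        return dict(local), list(self.values[key])

a = PlainVVReplica("A")
a.put("K", "V", {})                   # step 4: counter -> 1, value V
a.put("K", "W", {})                   # step 5: VVs differ, W is a sibling
vv, sibs = a.put("K", "Z", {"A": 1})  # step 7: causal update from C3...
print(vv, sibs)                       # -> {'A': 3} ['V', 'W', 'Z']
```

Note that the final PUT carries (A, 1), which causally follows the write of V, yet Z still lands as a third sibling: the replica has no way to tell that Z supersedes V.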

Solution

We understand that the problem is that we don’t have context to determine which siblings are causally related and which are concurrent. To solve the problem, we change the structure of what’s stored inside the Version Vector to provide more context.

Current State — {(A, 2)} : [V, W]
New State — {(A, 2)} : [(A, 1) -> V, (A, 2) -> W]
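Modeled in Python (the dict shapes and the `covers` helper are my own illustrative assumptions), the difference between the two states is that the dotted form tags each sibling with the “dot” — the (replica, counter) pair — at which it was written:

```python
# The plain state loses per-sibling write context...
current_state = {"vv": {"A": 2}, "siblings": ["V", "W"]}

# ...while the "dotted" state keeps it.
dotted_state = {
    "vv": {"A": 2},
    "siblings": {("A", 1): "V",   # V was written at dot (A, 1)
                 ("A", 2): "W"},  # W was written at dot (A, 2)
}

# With dots we can ask: does an incoming client VV cover a sibling's dot?
def covers(client_vv, dot):
    replica, counter = dot
    return client_vv.get(replica, 0) >= counter

print(covers({"A": 1}, ("A", 1)))  # True: this client has seen V's write
print(covers({"A": 1}, ("A", 2)))  # False: W is concurrent with it
```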

Now, let’s apply our new state and see how it helps -

[Diagram: the same sequence of PUTs, this time with each sibling tagged with the dot at which it was written]

Let’s try to understand what’s happening in the above diagram, starting from step 5, where Replica A has already received the first message from C3 -

  1. The request from C2 arrives at Replica A. Replica A compares the Version Vector it received with its local state & compares each of the “dots” in its local state to see if any could be causally related to the incoming Version Vector. In this case, the local dot (A, 1) is not covered by the incoming empty VV {} & hence the event is treated as concurrent and the value is stored as a sibling: K : {(A, 2)} : [{(A, 2), W}, {(A, 1), V}].
  2. C3 updates the value to Z & sends a PUT command with the local state of the Version Vector it has, which is (A, 1).
  3. Replica A receives the request from C3, compares the Version Vector it received with its local state & checks each of the “dots” in its local state. It notices that it has a “dot” (A, 1) that is covered by the incoming VV, and hence it knows this is not a concurrent update. Replica A then overwrites the value V with Z and stores Z along with the latest VV state.

Advantages -

  1. We avoid the actor-explosion problem of using the Client as an Actor, and are still able to identify causality across events using “Dotted” Version Vectors.
  2. It’s easy to identify the most recent data based on the counter.

Disadvantages -

  1. Storing additional metadata to provide more context introduces extra overhead when syncing the data across multiple replicas.


Conflict Resolutions

Irrespective of the type of Version Vector we use, we saw that we still had conflicting cases, which we handled by leveraging siblings. Siblings allow us to defer conflict resolution so the system can move forward, and they preserve the state of the conflicting versions until the desired Conflict Resolution Strategy kicks in. We’ll talk about multiple conflict resolution strategies here -

  1. Last Write Wins -

We referred to Last-Write-Wins briefly in an earlier article. We generally leverage some parameter (usually a timestamp) to decide the latest write and delete any siblings that occurred before it. LWW is a great strategy for server-side resolution (i.e. the responsibility of conflict resolution doesn’t fall on the client), but it suffers from inconsistent behaviour leading to data inaccuracy.

If you take multiple replicas into account, and multiple clients update the same data row concurrently, then because of network latency or partitions the last write will silently overwrite the other clients’ values, and which write “wins” will not be consistent or predictable.

Choose LWW for your systems preferably if you have few concurrent writes, or where you’re okay with the occasional inconsistent behaviour.
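An LWW resolver can be sketched in a few lines. This is an illustrative shape of my own (not a specific database’s API), assuming each sibling carries a write timestamp; the sibling with the highest timestamp wins and all others are discarded.

```python
# Illustrative Last-Write-Wins resolution over timestamped siblings.

def last_write_wins(siblings):
    """siblings: list of (timestamp, value) pairs; highest timestamp wins."""
    return max(siblings, key=lambda s: s[0])[1]

# Two concurrent writes; the later timestamp silently wins, which is
# where the data-loss risk mentioned above comes from.
print(last_write_wins([(1700000000.1, "W"), (1700000000.2, "V")]))  # -> V
```

Note that the "losing" write W is dropped entirely, with no record that a conflict ever happened; clock skew between replicas makes the outcome even less predictable.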

  2. Read Repair -

Read repair is a strategy where VV conflict resolution is done at read time. The node coordinating the read request fetches the VV of the data from all replica nodes that have it, and then performs a merge on the Version Vectors.

Notice that read repair only performs conflict resolution for data that's actively being read. For data that's not actively touched, we can run a similar process in the background; this is called proactive repair.
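The merge step at the heart of read repair can be sketched as a pointwise maximum over the version vectors the coordinator gathers (the function name and dict shapes here are my own illustrative assumptions):

```python
# Hypothetical coordinator-side merge for read repair: take the
# pointwise maximum of each replica's version vector for the key.

def merge_vvs(vvs):
    merged = {}
    for vv in vvs:
        for replica, counter in vv.items():
            merged[replica] = max(merged.get(replica, 0), counter)
    return merged

# Three replicas report different (possibly stale) VVs for the same key.
replica_states = [{"A": 2, "B": 1}, {"A": 3}, {"B": 2, "C": 1}]
repaired = merge_vvs(replica_states)
print(repaired)  # -> {'A': 3, 'B': 2, 'C': 1}
```

In a real system the coordinator would then write this merged VV (and the resolved value or surviving siblings) back to any replica holding a stale copy.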

There are other conflict resolution mechanisms as well, but the idea here was to give you the high-level details. The internals of conflict resolution will change based on the system’s choice of conflict resolution strategy.

-----------------------------

This brings us to the end of this article and the series on Version Vectors. Hopefully it helped you understand how concurrent updates in a distributed system are handled with the help of Version Vectors. We also talked about how Version Vectors evolved and the advantages and drawbacks of each evolved version. Please post comments on any doubts you might have and will be happy to discuss them! Also, if you want me to cover Conflict Resolution in detail, please comment and I’ll do the same!

-----------------------------

Thank you for reading! I’ll be posting weekly content on distributed systems & patterns, so please like, share and subscribe to this newsletter for notifications of new posts.

Please comment on the post with your feedback, it will help me improve! :)

Until next time, Keep asking questions & Keep learning!
