登录查看更多内容

Multi-Version Concurrency Control

Vivek Bansal

Senior Software Engineer at Uber | Ex-Grab | Ex-Directi

发布日期: 2024年4月14日

If you like the free content I put out, consider subscribing to my newsletter on substack to get a well-researched article every week delivered straight to your inbox.

Why do we need Concurrency control?

What is the problem?

Let’s imagine that you are trying to transfer money(let’s say 100Rs) from your account to your friend’s account. At the database level, this transfer of money will be done via a database write transaction that debits money from your account and at the same time credits money to your friend’s account.

Imagine that while this transaction is in progress, to be more specific, consider the exact point in time ( let’s say time instance: T ) when the money has been debited from your account but not credited to your friend’s account. If someone tries to read your account balance and your friend’s account balance at the time instance T, it will appear to the reader as if money has disappeared from the bank, since it has left your account but not reached your friend’s account.

This problem arises due to inconsistent reads, which means that although the write transaction was not fully done i.e. crediting the 100Rs money to your friend’s account, other read transactions were allowed to read the inconsistent and uncommitted data from the database.

Thanks for reading Curious Engineer! Subscribe for free to receive new posts and support my work.

Subscribed

How do we solve concurrency issues?

To prevent these scenarios, we need concurrency control in our transactions. Concurrency control enables which data should or should not be visible to other transactions while reading and writing data into the database.

To solve our original problem of inconsistent reads, the simplest way here is to block all the readers until the writer is done which is usually referred to as the read-write lock. In simple terms, a read-write lock states:

“Multiple threads can read the data in parallel but an exclusive lock is needed for writing or modifying data.” In other words, when a writer thread is writing something, all other write threads and read threads must wait until the original writer is done writing.

However, locks are known to provide resource contention. Imagine if multiple writes come at the same time, then multiple write threads will be blocked until one of the write threads is writing the data.

So, the main question becomes: How can we provide maximum database concurrency while getting minimum lock contention?

Enters the solution: Multi-Version Concurrency Control

Multi-Version Concurrency Control (MVCC)

The premise of multi-version concurrency control is:

“When an MVCC database needs to update a piece of data, it will not overwrite the original data item with new data, but instead creates a newer version of the data item, thus storing multiple versions of the same data.“

MVCC tries to maximize database throughput by keeping multiple copies of each data item. Every client connected to the database sees a state (or snapshot) of the database as it was some time ago and then performs an update on top of it. Any changes made by a writer thread are not visible to other users until the changes are completed (as in, the transaction has been committed). This prevents transactions from viewing inconsistent data produced by concurrent transactions performing updates on the same data rows, providing transaction isolation for each database session.

Does MVCC need locks at all?

A good question at this juncture would be since we are storing multiple versions of the same data, then do we need locks at all if we choose multi-version concurrency control over traditional locks?

The answer is that multi-version concurrency control helps: reads not blocking writes and writes not blocking reads but MVCC still needs locks to function.

领英推荐

Blockchain for Enhanced Customer Trust in Financial…

Adria Business & Technology 2 个月前

Evaluating Ethereum Security Token Standards for…

ioBuilders 10 个月前

The Growing Importance Of Cybersecurity In Financial…

SSTech System 1 年前

Consider a case where two write transactions concurrently read data, try to modify the data, and then try to commit their changes. If both transactions are working on the same set of data, then only the latest changes made by a transaction would be visible and the other transaction changes would be lost forever. These are called Lost Updates.

So, in a nutshell, MVCC still needs locks, for example to prevent write conflicts.

Note: Multi-version concurrency control is a popular technique and is used in many databases. For example: PostgreSQL, CouchDB, MySQL with InnoDB storage, etc.

Snapshot Isolation Level

Unlike other traditional isolation levels present in databases, there is an isolation level called Snapshot isolation. A transaction that is operating under the snapshot isolation level will operate on a personal snapshot of the database which is taken at the time of starting the transaction.

When the transaction is concluded, it will successfully commit the changes only if the values updated by the transaction have not been changed by other transactions running concurrently since the time snapshot was taken. And, if there is a write-write conflict among transactions for any values, the transaction will be aborted.

Snapshot isolation level is implemented with multi-version concurrency control and is adopted by multiple databases including InterBase, Firebird, Oracle, MySQL, PostgreSQL, SQL Anywhere, MongoDB, and Microsoft SQL Server.

What is the Row-versioning technique?

Snapshot isolation uses a concept called row-versioning which is a database technique used to track changes made to individual rows in a table. It essentially creates a version history for each row, allowing the database to identify the current state and past states of the data. Here's a deeper look at how it works:

Each row in a table gets an additional column that stores a version number or timestamp. This column is often named rowversion (SQL Server) or uses system columns like xmin and xmax (PostgreSQL).
Whenever a row is inserted, updated, or deleted, the version information associated with that row is incremented or updated accordingly.

Row versioning is the secret sauce of how MVCC works, which allows multiple transactions to access and modify data concurrently with minimal locking. Transactions can read older versions of the data based on their version numbers.

The row-versioning technique is a very basic concept of the database and it helps in implementing Optimistic Concurrency Control as well.

Conclusion

The main advantage of using multi-version concurrency control rather than traditional locking( read-write lock ) is that MVCC locks acquired for reading do not block writing and MVCC locks acquired for writing do not block reading and thus provide high throughput out of your database.

That’s it, folks for this edition of the newsletter. Please consider liking and sharing with your friends as it motivates me to bring you good content for free. If you think I am doing a decent job, share this article in a nice summary with your network. Connect with me on Linkedin or Twitter for more technical posts in the future!

Book exclusive 1:1 with me here.

Thanks for reading Curious Engineer! Subscribe for free to receive new posts and support my work.

Resources

Multi-Version Concurrency Control by Wikipedia

Getting started with MVCC

Snapshot Isolation by Wikipedia

Curious Engineer

19,881 位关注者

Susanta Ghosh

10 个月

While this is excellent, I think whatever you have mentioned as snapshot isolation that's actually Serializable Snapshot Isolation [SSI] which provides prevention of Lost Updates while Snapshot Isolation provides prevention of Non Repeatable Reads.

1 次回应

查看更多评论

要查看或添加评论，请登录

Vivek Bansal的更多文章

How to implement a Circuit Breaker

2025年1月12日

How to implement a Circuit Breaker

If you like the free content I put out, consider subscribing to my newsletter on substack to get a well-researched…

7 条评论
How to implement Consistent Hashing

2024年12月29日

How to implement Consistent Hashing

If you like the free content I put out, consider subscribing to my newsletter on substack to get a well-researched…

3 条评论
Optimistic Locking Implementation

2024年12月1日

Optimistic Locking Implementation

If you like the free content I put out, consider subscribing to my newsletter on substack to get a well-researched…
1 year to Curious Engineer ??

2024年11月18日

1 year to Curious Engineer ??

If you like the free content I put out, consider subscribing to my newsletter on substack to get a well-researched…
Message Queues vs Message Brokers

2024年11月9日

Message Queues vs Message Brokers

If you like the free content I put out, consider subscribing to my newsletter on substack to get a well-researched…

4 条评论
Introduction to gRPC

2024年11月2日

Introduction to gRPC

If you like the free content I put out, consider subscribing to my newsletter on substack to get a well-researched…

7 条评论
Non-Functional Requirements

2024年10月26日

Non-Functional Requirements

Brief Introduction Let’s say you are building a website that allows users to book flight tickets. The requirements for…

4 条评论
QuadTrees

2024年10月19日

QuadTrees

If you like the free content I put out, consider subscribing to my newsletter on substack to get a well-researched…

2 条评论
Text Based Search: ElasticSearch

2024年10月12日

Text Based Search: ElasticSearch

If you like the free content I put out, consider subscribing to my newsletter on substack to get a well-researched…

3 条评论
Sharding vs Partitioning

2024年10月5日

Sharding vs Partitioning

If you like the free content I put out, consider subscribing to my newsletter on substack to get a well-researched…

5 条评论

See all articles

Multi-Version Concurrency Control

Vivek Bansal

Senior Software Engineer at Uber | Ex-Grab | Ex-Directi

Why do we need Concurrency control?

Multi-Version Concurrency Control (MVCC)

领英推荐

Snapshot Isolation Level

Conclusion

Resources

Curious Engineer

19,881 位关注者

Vivek Bansal的更多文章

社区洞察

其他会员也浏览了

Amplified Use of Blockchain in Financial Services

How DeFi improves the financial ecosystem

The Benefits of DeFi: Changing the Finance Game

The Rise of Decentralized Finance (DeFi): Blockchain’s Impact on the Financial Sector

Cryptio Newsletter | May 2023

Empowering Financial Services with Blockchain Technology: Enhancing Security and Transparency

Welcome to Taksh IT Solutions' Web3 Wallet Development Newsletter!

Financial Software Development Services

Why Blockchain is the Future of Financial Markets?

Unizen x LTO Network: Pioneering DeFi Integration for Traditional Finance

Why do we need Concurrency control?

Multi-Version Concurrency Control (MVCC)

领英推荐

Snapshot Isolation Level

Conclusion

Resources

Curious Engineer

19,881 位关注者

Vivek Bansal的更多文章

How to implement a Circuit Breaker

How to implement Consistent Hashing

Optimistic Locking Implementation

1 year to Curious Engineer ??

Message Queues vs Message Brokers

Introduction to gRPC

Non-Functional Requirements

QuadTrees

Text Based Search: ElasticSearch

Sharding vs Partitioning

社区洞察

其他会员也浏览了

Amplified Use of Blockchain in Financial Services

How DeFi improves the financial ecosystem

The Benefits of DeFi: Changing the Finance Game

The Rise of Decentralized Finance (DeFi): Blockchain’s Impact on the Financial Sector

Cryptio Newsletter | May 2023

Empowering Financial Services with Blockchain Technology: Enhancing Security and Transparency

Welcome to Taksh IT Solutions' Web3 Wallet Development Newsletter!

Financial Software Development Services

Why Blockchain is the Future of Financial Markets?

Unizen x LTO Network: Pioneering DeFi Integration for Traditional Finance