Deep dive into Cloud databases: Transactions & ACID
This post continues our journey into the greatest abstraction a data system offers app developers: the transaction. In the previous post we started looking at some of the guarantees that a transaction provides to an app developer interacting with a data system. In this post we formalize those guarantees and do a deep dive into atomicity and consistency. This post is also published on Substack.
The guarantees of ACID
As we discussed in the previous post, a transaction is a potentially long-running program that runs on a data system with some guarantees. Transactions are usually meant for reading and writing data in tables, but some data systems provide a more powerful transactional abstraction that also covers modifying database objects such as tables and indices.
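To make the "more powerful" abstraction concrete, here is a minimal sketch using Python's built-in sqlite3 module, chosen only because SQLite happens to allow schema changes inside a transaction. The table and index names (events, events_payload_idx) and the table_exists helper are made up for illustration, and other data systems may behave differently.

```python
import sqlite3


def table_exists(conn, name):
    """Hypothetical helper: check SQLite's catalog for a table called `name`."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?", (name,)
    ).fetchone()
    return row is not None


# isolation_level=None puts the connection in autocommit mode, so the
# transaction boundaries below are exactly the BEGIN/ROLLBACK we issue.
conn = sqlite3.connect(":memory:", isolation_level=None)

conn.execute("BEGIN")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
conn.execute("CREATE INDEX events_payload_idx ON events (payload)")
exists_inside_txn = table_exists(conn, "events")   # visible to the transaction itself
conn.execute("ROLLBACK")                            # abort: schema changes are undone
exists_after_rollback = table_exists(conn, "events")

print(exists_inside_txn, exists_after_rollback)
```

The table is visible while the transaction is running and gone once it is rolled back, which is the same guarantee we expect for plain row updates.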
Leaving aside the “feature” differences between the transactions of various data systems, the core guarantees remain more or less the same. Formally, we can categorize the guarantees provided by a transaction as follows: atomicity, consistency, isolation, and durability (ACID).
Let’s analyze the 4 guarantees so we understand them in a bit more detail. In the following posts we will do a deep dive into the failure models and touch upon the implementation details of each guarantee.
Atomicity
Atomicity literally means “all or none”. That is, either the transaction program runs to completion as a single block, or it is forced to abort, either by the DB system or by the application itself. In case of an abort, all the intermediate state generated by the program must be cleaned up: all the updates made to the database need to be reverted, and the database needs to be reset to the pristine state it was in before the transaction started running.
This raises the question: “Why would a transaction need to abort? After all, most of the Java/Python/etc. programs I write are executed in one shot, and my computer never aborts them in the middle.”
The answer lies in the inherent characteristics of a DB transaction and how it differs from a normal Python program running on your laptop.
In any case, if the transaction aborts then there needs to be a complete cleanup of the mess created by it so far.
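Here is a small sketch of “all or none” in action, again using Python's built-in sqlite3 module. The accounts table, the transfer function, and the overdraft rule are assumptions invented for this example. The abort here is triggered by the application; a DB-initiated abort would be cleaned up in the same way.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` between two hypothetical accounts as one transaction.

    Returns True if the transaction committed, False if it was aborted.
    """
    try:
        # `with conn` commits if the block finishes, and rolls back
        # if an exception escapes from it.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE id = ?", (src,)
            ).fetchone()
            if balance < 0:
                # Application-initiated abort: the debit above must be undone.
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
        return True
    except ValueError:
        return False


def balances(conn):
    """Return the current balances as a dict keyed by account id."""
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

committed = transfer(conn, "alice", "bob", 30)   # runs to completion
aborted = transfer(conn, "alice", "bob", 500)    # debit happens, then the abort reverts it

print(committed, aborted, balances(conn))
```

The second transfer debits the source account before it notices the overdraft, yet after the abort the balances are exactly what the first transfer left behind. The intermediate state never becomes visible.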
Consistency
A transaction, irrespective of whether it was committed or aborted, should always leave the database in a “consistent state”. Now, “consistent state” is a much-abused term when it comes to database transactions, and different vendors have varying definitions of consistency. I categorize consistency into two buckets: physical consistency and logical consistency.
All DBs provide physical consistency guarantees, but logical consistency guarantees vary from vendor to vendor. Note that providing some of the advanced logical consistency guarantees, such as foreign key constraints, might put an additional burden on the transaction manager, leading to lower throughput. So to achieve a higher TPS (transactions per second), it can be useful to provide weaker guarantees for logical consistency.
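As a rough illustration of a logical consistency guarantee being optional, the sketch below uses SQLite through Python's sqlite3 module, where foreign key enforcement is a per-connection setting (PRAGMA foreign_keys). The customers and orders tables and the try_insert_orphan function are hypothetical names for this example. It shows only the difference in guarantees and makes no claim about the throughput cost.

```python
import sqlite3


def try_insert_orphan(enforce_foreign_keys):
    """Return True if an order for a non-existent customer was accepted."""
    conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode
    # Set the pragma explicitly either way, since the default depends on the build.
    setting = "ON" if enforce_foreign_keys else "OFF"
    conn.execute(f"PRAGMA foreign_keys = {setting}")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE orders ("
        " id INTEGER PRIMARY KEY,"
        " customer_id INTEGER NOT NULL REFERENCES customers(id))"
    )
    try:
        # Customer 42 does not exist, so this row is logically inconsistent.
        conn.execute("INSERT INTO orders (id, customer_id) VALUES (1, 42)")
        return True
    except sqlite3.IntegrityError:
        return False
    finally:
        conn.close()


accepted_without_checks = try_insert_orphan(enforce_foreign_keys=False)
accepted_with_checks = try_insert_orphan(enforce_foreign_keys=True)

print(accepted_without_checks, accepted_with_checks)
```

With enforcement off, the orphan row is stored without complaint and it is up to the application to keep the data meaningful. With enforcement on, the DB does extra checking on every write and rejects the row.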