Deep dive into Cloud databases - Transactions & ACID

This post continues our journey into the greatest abstraction a data system provides to app developers: the transaction. In the previous post we started discussing some of the guarantees a transaction provides to an app developer interacting with a data system. In this post we formalize those guarantees and do a deep dive into atomicity and consistency. This post is also published simultaneously on Substack.

The guarantees of ACID

As we discussed in the previous post, a transaction is a potentially long-running program that runs on a data system with some guarantees. Transactions are usually meant for reading and writing data in tables, but some data systems provide a more powerful transactional abstraction that also covers modifying database objects such as tables and indices.

Leaving aside the “feature” differences between the transactions of various data systems, the core guarantees remain more or less the same. Formally, we can categorize the guarantees provided by a transaction as follows:

  1. Atomicity: Either the entire program is executed (commit) or none of it is (abort). We guarantee there won’t be any partial execution.
  2. Consistency: The program leaves the database in a consistent state regardless of whether it committed or aborted.
  3. Isolation: Multiple programs can run on the database system, reading and writing the same data, without needing to explicitly synchronize among themselves. Each program acts as if it owns the entire database and is completely oblivious to any other transactions.
  4. Durability: Once the transaction commits, its effects are guaranteed to be persisted.

Let’s analyze the four guarantees in a bit more detail. In the following posts we will do a deep dive into the failure models and touch upon the implementation details of each guarantee.

Atomicity

Atomicity literally means “all or none”. That is, either the transaction program runs to completion as a single block, or it is forced to abort, either by the DB system or by the application itself. In case of an abort, all the intermediate state generated by the program must be cleaned up: every update made to the database needs to be reverted, and the database needs to be reset to the pristine state it was in before the transaction started running.


This raises the question: “Why would a transaction need to abort? After all, most of the Java/Python/etc. programs I write are executed in one shot, and my computer never aborts them in the middle.”

The answer lies in the inherent characteristics of a DB transaction and how it differs from a normal Python program running on your laptop:

  1. A transaction interacts with a complex data system running over a distributed cluster of machines. As with any distributed system, infrastructure failures are common, and a transaction might be aborted through no fault of its own.
  2. A DB has multiple transactions acting on its data, each without any knowledge of the others. If two or more transactions running concurrently try to update the same data items, we may need to commit only one of them and abort the rest. This property will be covered in more depth under isolation in future posts.
  3. A transaction may crunch through large amounts of data (GBs, TBs) and is potentially long running. The application program that issued the transaction may be unwilling to wait that long and issue the abort itself.
  4. The end user issuing the transaction might discover a problem in the middle of the transaction and want to abort the program. The same applies if there is a bug in the program.

In any case, if the transaction aborts then there needs to be a complete cleanup of the mess created by it so far.
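To make the “all or none” behaviour concrete, here is a minimal sketch using Python’s built-in sqlite3 module. The accounts table, the transfer helper and the fail_midway flag are hypothetical names made up for this illustration, and SQLite is a single-node embedded database rather than a cloud data system, so treat this as a toy model of the guarantee and not as how any particular vendor implements it.

```python
import sqlite3


def transfer(conn, src, dst, amount, fail_midway=False):
    """Move `amount` from src to dst as one all-or-nothing unit (hypothetical helper)."""
    try:
        conn.execute("BEGIN")
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
        )
        if fail_midway:
            # Stand-in for an infrastructure failure, a timeout or a user-issued abort.
            raise RuntimeError("simulated failure between the two updates")
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
        )
        conn.execute("COMMIT")
        return True
    except Exception:
        # Abort: revert every update made since BEGIN.
        conn.execute("ROLLBACK")
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))


# isolation_level=None puts the connection in autocommit mode,
# so the only transactions are the ones we open with BEGIN ourselves.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])

committed = transfer(conn, "alice", "bob", 30)
after_commit = balances(conn)

aborted = transfer(conn, "alice", "bob", 30, fail_midway=True)
after_abort = balances(conn)

print(after_commit)
print(after_abort)
```

The second transfer fails after debiting the first account, yet the balances read afterwards match the ones from before it started: the partial debit was reverted by the rollback.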


Consistency

A transaction, irrespective of whether it committed or aborted, should always leave the database in a “consistent state”. Now, “consistent state” is a much-abused term when it comes to database transactions, and different vendors have varying definitions of consistency. I categorize consistency into two buckets:

  1. Logical consistency: With logical consistency we mean the database is consistent with respect to the constraints it offers, for example a foreign key constraint that requires every referencing row to point at an existing row.
  2. Physical consistency: Over the lifecycle of a write operation, multiple data structures might be updated, such as the core storage (B-Tree, LSM tree or flat file housing the data) and the various auxiliary structures (indexes, caches, Bloom filters) that power the reads. With physical consistency we mean that the transaction, even if aborted, doesn’t leave the various data structures and metadata of the DB in an inconsistent state. When a transaction aborts, we must safely remove all of its data and metadata across all of these structures.

All DBs provide physical consistency guarantees, but logical consistency guarantees vary from vendor to vendor. Note that providing some of the advanced logical consistency guarantees, such as foreign key constraints, might put an additional burden on the transaction manager, leading to lower throughput. So to achieve a higher TPS (transactions per second), it can be useful to provide weaker guarantees for logical consistency.
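As a small illustration of logical consistency, the sketch below again uses Python’s built-in sqlite3 module. The customers and orders tables and the place_orders helper are hypothetical, chosen only for this example. SQLite is a handy case here because it leaves foreign key enforcement off by default and only checks the constraint once the application opts in through PRAGMA foreign_keys.

```python
import sqlite3

# Autocommit mode; transactions are opened explicitly with BEGIN.
conn = sqlite3.connect(":memory:", isolation_level=None)
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """CREATE TABLE orders (
           id INTEGER PRIMARY KEY,
           customer_id INTEGER NOT NULL REFERENCES customers(id),
           amount INTEGER NOT NULL
       )"""
)
conn.execute("INSERT INTO customers VALUES (1, 'alice')")


def place_orders(conn, orders):
    """Insert all (customer_id, amount) pairs in one transaction (hypothetical helper)."""
    try:
        conn.execute("BEGIN")
        conn.executemany(
            "INSERT INTO orders (customer_id, amount) VALUES (?, ?)", orders
        )
        conn.execute("COMMIT")
        return True
    except sqlite3.IntegrityError:
        # A constraint was violated: abort so no part of the batch survives.
        conn.execute("ROLLBACK")
        return False


accepted = place_orders(conn, [(1, 25)])
# The second row references customer 99, which does not exist.
rejected = place_orders(conn, [(1, 10), (99, 10)])

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
orphans = conn.execute(
    "SELECT COUNT(*) FROM orders "
    "WHERE customer_id NOT IN (SELECT id FROM customers)"
).fetchone()[0]

print(accepted, rejected, order_count, orphans)
```

The second batch contains one valid row and one row that violates the foreign key. Because the whole batch runs as a single transaction that is rolled back on the violation, neither row is stored, and the table never holds an order pointing at a missing customer.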


