DB Isolation Levels in Detail - Part -1

DB Isolation Levels in Detail - Part -1

If two transactions don't touch the same data, they can safely run in parallel because they are not dependent on each other. Concurrency issue only comes when either one transaction is modifying the data while another transaction is reading it or both the transaction updating the same data simultaneously.

But as the number of users grows, applications become heavier (or in-demand) many users trying to access/update the records. To support such demand weak isolation levels come into the picture. Understanding this is very important as it provides ‘I’ in ACID. And in this article, we are going to understand them in detail. There are 4 types of isolations::

  • READ UNCOMMITTED
  • READ COMMITTED
  • REPEATABLE READ / SNAPSHOT ISOLATION
  • SERIALIZABLE

In this post, I am trying to explain each one of them in detail. This post is part of multiple series.

I am using MySQL for demonstration purposes. Let's first set auto-commit to 0, and check default isolation level. Repeatable Read isolation level is the default isolation level in MySQL. Let's try to understand each one of them one by one.

mysql> select @@transaction_isolation

+-------------------------+

| @@transaction_isolation |

+-------------------------+

| REPEATABLE-READ ? ? ? ? |

+-------------------------+

1 row in set (0.00 sec)


mysql> select @@autocommit;

+--------------+

| @@autocommit |

+--------------+

|? ? ? ? ? ? 1 |

+--------------+

1 row in set (0.00 sec)


mysql> set autocommit=0;

Query OK, 0 rows affected (0.00 sec)


mysql> select @@autocommit;

+--------------+

| @@autocommit |

+--------------+

|? ? ? ? ? ? 0 |

+--------------+

1 row in set (0.00 sec)        

READ UNCOMMITTED

This type of isolation doesn’t provide any isolation when 2 transactions(T1 and T2) run in parallel. It allows dirty read and dirty write.

Dirty read:???

This occurs when one transaction reads the data that is modified by another transaction but not yet committed. So consider an example. Let’s change the isolation level for this test:

mysql> set session transaction isolation level read uncommitted

Query OK, 0 rows affected (0.00 sec)



mysql> select @@transaction_isolation;

+-------------------------+

| @@transaction_isolation |

+-------------------------+

| READ-UNCOMMITTED? ? ? ? |

+-------------------------+

1 row in set (0.00 sec)        

Current values in DB:

mysql> select * from account

+----------+--------+-------+-----------------+------------+--------+---------+

| acnumber | custid | bid ? | opening_balance | aod? ? ? ? | atype? | astatus |

+----------+--------+-------+-----------------+------------+--------+---------+

| A0002? ? | C0002? | B0002 | ? ? ? ? ? ? 500 | 2012-06-12 | Saving | Active? |

| A0004? ? | C0004? | B0004 |? ? ? ? ? ? 2000 | 2020-09-11 | Saving | Active? |

+----------+--------+-------+-----------------+------------+--------+---------+

3 rows in set (0.00 sec)        

Now, let’s consider T1 and T2 are different transactions running on different terminals. Just to confirm if we are using a different connection or not:

T1

mysql> SELECT CONNECTION_ID();

+-----------------+

| CONNECTION_ID() |

+-----------------+

| ? ? ? ? ? ? ? 9 |

+-----------------+

1 row in set (0.00 sec)


mysql> select @@transaction_isolation;

+-------------------------+

| @@transaction_isolation |

+-------------------------+

| READ-UNCOMMITTED? ? ? ? |

+-------------------------+

1 row in set (0.00 sec)        

Start transaction

mysql> begin

Query OK, 0 rows affected (0.00 sec)


mysql> select * from account;

+----------+--------+-------+-----------------+------------+--------+---------+

| acnumber | custid | bid ? | opening_balance | aod? ? ? ? | atype? | astatus |

+----------+--------+-------+-----------------+------------+--------+---------+

| A0002? ? | C0002? | B0002 | ? ? ? ? ? ? 500 | 2012-06-12 | Saving | Active? |

| A0004? ? | C0004? | B0004 |? ? ? ? ? ? 2000 | 2020-09-11 | Saving | Active? |

+----------+--------+-------+-----------------+------------+--------+---------+

2 rows in set (0.00 sec)        

T2:

mysql> SELECT CONNECTION_ID();

+-----------------+

| CONNECTION_ID() |

+-----------------+

| ? ? ? ? ? ? ? 9 |

+-----------------+

1 row in set (0.00 sec)



mysql> set session transaction isolation level read uncommitted;

Query OK, 0 rows affected (0.00 sec)



mysql> select @@transaction_isolation;

+-------------------------+

| @@transaction_isolation |

+-------------------------+

| READ-UNCOMMITTED? ? ? ? |

+-------------------------+

1 row in set (0.00 sec)        

Start Transaction

mysql> begin;

Query OK, 0 rows affected (0.00 sec)



mysql> select * from account;

+----------+--------+-------+-----------------+------------+--------+---------+

| acnumber | custid | bid ? | opening_balance | aod? ? ? ? | atype? | astatus |

+----------+--------+-------+-----------------+------------+--------+---------+

| A0002? ? | C0002? | B0002 | ? ? ? ? ? ? 500 | 2012-06-12 | Saving | Active? |

| A0004? ? | C0004? | B0004 |? ? ? ? ? ? 2000 | 2020-09-11 | Saving | Active? |

+----------+--------+-------+-----------------+------------+--------+---------+

2 rows in set (0.00 sec)        

Till now we have the same value in both the terminals. Let's modify some data in one of the terminals:

T2:

mysql> update? account set opening_balance = opening_balance - 100 where acnumber = 'A0004'

Query OK, 1 row affected (0.00 sec)

Rows matched: 1? Changed: 1? Warnings: 0


mysql> select * from account;

+----------+--------+-------+-----------------+------------+--------+---------+

| acnumber | custid | bid ? | opening_balance | aod? ? ? ? | atype? | astatus |

+----------+--------+-------+-----------------+------------+--------+---------+

| A0002? ? | C0002? | B0002 | ? ? ? ? ? ? 500 | 2012-06-12 | Saving | Active? |

| A0004? ? | C0004? | B0004 |? ? ? ? ? ? 1900 | 2020-09-11 | Saving | Active? |

+----------+--------+-------+-----------------+------------+--------+---------+

2 rows in set (0.00 sec)        

Let's check in the T1 terminal now

mysql> select * from account

+----------+--------+-------+-----------------+------------+--------+---------+

| acnumber | custid | bid ? | opening_balance | aod? ? ? ? | atype? | astatus |

+----------+--------+-------+-----------------+------------+--------+---------+

| A0002? ? | C0002? | B0002 | ? ? ? ? ? ? 500 | 2012-06-12 | Saving | Active? |

| A0004? ? | C0004? | B0004 |? ? ? ? ? ? 1900 | 2020-09-11 | Saving | Active? |

+----------+--------+-------+-----------------+------------+--------+---------+

2 rows in set (0.01 sec)        

So, you see T1 reads the uncommitted data from T2. This is called dirty reads. Now, what happens when for some reason T1 rollback the changes but T2 already reads it and performed some action on it. This will create data issues.

Let’s understand what dirty write is?

When multiple transactions are working on the same row, it is expected is the last transaction should overwrite anything written before it. But what happens when T1 updated data but have not committed yet but another transaction T2 that started after that has updated the row and committed the transaction. Now, when T1 is committing it, it will overwrite T2 data which is wrong. This is called dirty writing.

READ COMMITTED

We have seen what issue we can face if we use READ UNCOMMITTED. To overcome these challenges let’s understand what guarantees Read committed provides:

  1. When reading from the DB, you will only see the data that has been committed (no dirty reads).
  2. When writing to the DB, you will only overwrite the data that has been committed (no dirty writes).

This type of isolation is very popular and is a default setting for Oracle 11g, PostgreSQL etc… Let’s try with the same queries we ran in previous isolation, and see what happens in this case:

T1

mysql> set session transaction isolation level read committed

Query OK, 0 rows affected (0.00 sec)



mysql> select @@transaction_isolation;

+-------------------------+

| @@transaction_isolation |

+-------------------------+

| READ-COMMITTED? ? ? ? ? |

+-------------------------+

1 row in set (0.00 sec)


mysql> begin;

mysql> select * from account;

+----------+--------+-------+-----------------+------------+--------+---------+

| acnumber | custid | bid ? | opening_balance | aod? ? ? ? | atype? | astatus |

+----------+--------+-------+-----------------+------------+--------+---------+

| A0002? ? | C0002? | B0002 | ? ? ? ? ? ? 500 | 2012-06-12 | Saving | Active? |

| A0004? ? | C0004? | B0004 |? ? ? ? ? ? 2000 | 2020-09-11 | Saving | Active? |

+----------+--------+-------+-----------------+------------+--------+---------+

2 rows in set (0.00 sec)        

T2

mysql>? set session transaction isolation level read committed

Query OK, 0 rows affected (0.00 sec)




mysql> begin;

mysql> select @@transaction_isolation;

+-------------------------+

| @@transaction_isolation |

+-------------------------+

| READ-COMMITTED? ? ? ? ? |

+-------------------------+

1 row in set (0.00 sec)




mysql> select * from account;

+----------+--------+-------+-----------------+------------+--------+---------+

| acnumber | custid | bid ? | opening_balance | aod? ? ? ? | atype? | astatus |

+----------+--------+-------+-----------------+------------+--------+---------+

| A0002? ? | C0002? | B0002 | ? ? ? ? ? ? 500 | 2012-06-12 | Saving | Active? |

| A0004? ? | C0004? | B0004 |? ? ? ? ? ? 2000 | 2020-09-11 | Saving | Active? |

+----------+--------+-------+-----------------+------------+--------+---------+

2 rows in set (0.00 sec)        

Now let’s see how Read committed isolation saves us from Dirty Read.

T1

mysql> update account set opening_balance = opening_balance - 100 where acnumber = 'A0004'

Query OK, 1 row affected (0.00 sec)

Rows matched: 1? Changed: 1? Warnings: 0


mysql> select * from account;

+----------+--------+-------+-----------------+------------+--------+---------+

| acnumber | custid | bid ? | opening_balance | aod? ? ? ? | atype? | astatus |

+----------+--------+-------+-----------------+------------+--------+---------+

| A0002? ? | C0002? | B0002 | ? ? ? ? ? ? 500 | 2012-06-12 | Saving | Active? |

| A0004? ? | C0004? | B0004 |? ? ? ? ? ? 1900 | 2020-09-11 | Saving | Active? |

+----------+--------+-------+-----------------+------------+--------+---------+

2 rows in set (0.00 sec)        

T2

mysql> select * from account

+----------+--------+-------+-----------------+------------+--------+---------+

| acnumber | custid | bid ? | opening_balance | aod? ? ? ? | atype? | astatus |

+----------+--------+-------+-----------------+------------+--------+---------+

| A0002? ? | C0002? | B0002 | ? ? ? ? ? ? 500 | 2012-06-12 | Saving | Active? |

| A0004? ? | C0004? | B0004 |? ? ? ? ? ? 2000 | 2020-09-11 | Saving | Active? |

+----------+--------+-------+-----------------+------------+--------+---------+

2 rows in set (0.00 sec)        

Do you see what has happened? T1 made some changes but has not committed a transaction, meantime when T2 is reading it, it only reads the committed transaction. It is not reading the value that T1 updated because that is not committed. So this way we are saved from Dirty Reads.

Let’s understand how this isolation saves us from Dirty Writes. Let’s say T2 is also trying to update the same row that T1 just updated.

T2

mysql>? update account set opening_balance = opening_balance - 100 where acnumber = 'A0004'

ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction        

So, we have just seen, T2 transaction is timeout because MySQL took a lock on that row. Hence prevented it from updating it. Now no other transaction can update the same row without a T1 commit or rollback. Now, just to validate our point, let’s try to update some other rows through T2:

mysql>? update account set opening_balance = opening_balance - 100 where acnumber = 'A0002'

Query OK, 1 row affected (0.01 sec)

Rows matched: 1? Changed: 1? Warnings: 0



mysql> select * from account;

+----------+--------+-------+-----------------+------------+--------+---------+

| acnumber | custid | bid ? | opening_balance | aod? ? ? ? | atype? | astatus |

+----------+--------+-------+-----------------+------------+--------+---------+

| A0002? ? | C0002? | B0002 | ? ? ? ? ? ? 400 | 2012-06-12 | Saving | Active? |

| A0004? ? | C0004? | B0004 |? ? ? ? ? ? 2000 | 2020-09-11 | Saving | Active? |

+----------+--------+-------+-----------------+------------+--------+---------+

2 rows in set (0.01 sec)        

You see another row is updated which confirms our point. We can also check through Engine logs to confirm the transaction lock.

mysql>? SHOW ENGINE INNODB STATUS

------------

TRANSACTIONS

------------

Trx id counter 2356

Purge done for trx's n:o < 2354 undo n:o < 0 state: running but idle

History list length 0

LIST OF TRANSACTIONS FOR EACH SESSION:

---TRANSACTION 281480560380264, not started

0 lock struct(s), heap size 1128, 0 row lock(s)

---TRANSACTION 281480560379472, not started

0 lock struct(s), heap size 1128, 0 row lock(s)

---TRANSACTION 2355, ACTIVE 689 sec

2 lock struct(s), heap size 1128, 4 row lock(s), undo log entries 1

MySQL thread id 15, OS thread handle 6193934336, query id 204 localhost root starting

SHOW ENGINE INNODB STATUS

---TRANSACTION 2354, ACTIVE 706 sec

2 lock struct(s), heap size 1128, 1 row lock(s), undo log entries 1

MySQL thread id 14, OS thread handle 6192820224, query id 199 localhost root        

So above logs confirmed that the lock has been taken.?So just for our curiosity, how does Read Committed isolation implemented this?

For this DB do 2 things:

  1. To support read (no dirty operations), DB maintains 2 copies of the data(new and old data) for every object that is written by the transaction that holds the write lock. Whenever data is read by other transaction(s) while a transaction is ongoing, it simply returns the old copy of the data till a new value is committed.
  2. To support writing, DB takes a row-level lock whenever a transaction wants to modify that row. Only one transaction can hold the lock at any given row.

So far so good but does it have some issues as well? Why not every DB just uses it and why do we have other isolation levels as well. Well, you guess it right, although it is very popular, it also has some issues. That issue is Non Repeatable read. This anomaly occurs when the result of the select statement is not the same. I.e. We run select queries multiple times and each time we get different rows? (or different values). Generally occurs when we use an update query. Consider the following example:

T1

mysql> begin

Query OK, 0 rows affected (0.00 sec)

?

mysql> select * from account;

+----------+--------+-------+-----------------+------------+--------+---------+

| acnumber | custid | bid ? | opening_balance | aod? ? ? ? | atype? | astatus |

+----------+--------+-------+-----------------+------------+--------+---------+

| A0002? ? | C0002? | B0002 | ? ? ? ? ? ? 500 | 2012-06-12 | Saving | Active? |

| A0004? ? | C0004? | B0004 |? ? ? ? ? ? 2000 | 2020-09-11 | Saving | Active? |

+----------+--------+-------+-----------------+------------+--------+---------+

2 rows in set (0.00 sec)        

T2

mysql> begin

Query OK, 0 rows affected (0.00 sec)



mysql> select * from account where opening_balance >= 500;

+----------+--------+-------+-----------------+------------+--------+---------+

| acnumber | custid | bid ? | opening_balance | aod? ? ? ? | atype? | astatus |

+----------+--------+-------+-----------------+------------+--------+---------+

| A0002? ? | C0002? | B0002 | ? ? ? ? ? ? 500 | 2012-06-12 | Saving | Active? |

| A0004? ? | C0004? | B0004 |? ? ? ? ? ? 2000 | 2020-09-11 | Saving | Active? |

+----------+--------+-------+-----------------+------------+--------+---------+

2 rows in set (0.00 sec)



mysql> select * from account where opening_balance >= 500;

+----------+--------+-------+-----------------+------------+--------+---------+

| acnumber | custid | bid ? | opening_balance | aod? ? ? ? | atype? | astatus |

+----------+--------+-------+-----------------+------------+--------+---------+

| A0002? ? | C0002? | B0002 | ? ? ? ? ? ? 500 | 2012-06-12 | Saving | Active? |

| A0004? ? | C0004? | B0004 |? ? ? ? ? ? 2000 | 2020-09-11 | Saving | Active? |

+----------+--------+-------+-----------------+------------+--------+---------+

2 rows in set (0.00 sec)        

Here we are trying to select rows that have an opening balance greater than or equal to 500. So in T2, we see 2 rows. Now, let’s commit the T1 and see what happens:

T1

mysql> commit

Query OK, 0 rows affected (0.00 sec)        

T2

mysql> select * from account where opening_balance >= 500

+----------+--------+-------+-----------------+------------+--------+---------+

| acnumber | custid | bid ? | opening_balance | aod? ? ? ? | atype? | astatus |

+----------+--------+-------+-----------------+------------+--------+---------+

| A0004? ? | C0004? | B0004 |? ? ? ? ? ? 2000 | 2020-09-11 | Saving | Active? |

+----------+--------+-------+-----------------+------------+--------+---------+

1 row in set (0.01 sec)        

So you see we get only 1 row as a result. We were expecting 2 rows but now we are getting only 1 row. So, the previous run selected the query, and this time we get different results. This is called Non-Repeatable Read problem. But what problem it may cause, can’t we just use them? Well, we have some cases where this may cause problems:

  1. Backup: So if we see different values, it may cause problems when we are doing backup. Backup we may store different values and during the recovery, we get corrupted data.
  2. Analytical query: Consider during a time when analytical queries run, they may see different values and that again causes wrong data.

So far we have discussed 2 isolation levels and let's discuss more how to solve issues we face with Read committed isolation and other types of isolation in another post. Let me know your thoughts as well.

Reference:

  1. Designing data-intensive application book

Rajkumar Raigonde

AWS Cloud, Java, nodejs

2 年

Informative .. thanks

要查看或添加评论,请登录

Tushar Goel的更多文章

  • DB Isolation Levels Part -2 (Repeatable Read and Serializable)

    DB Isolation Levels Part -2 (Repeatable Read and Serializable)

    In the last article, we learned about Read un-committed and Read committed isolation levels and the problems linked…

  • How Kafka is so efficient?

    How Kafka is so efficient?

    What is Apache Kafka? Apache Kafka is a distributed streaming platform that allows you to:: Publish and subscribe to…

  • Dependency Inversion

    Dependency Inversion

    Dependency ingestion is ‘D’ in the SOLID design principle. It is a guideline that helps design loosely coupled classes.

  • Law of Demeter (Principle of least knowledge)

    Law of Demeter (Principle of least knowledge)

    In the last article we have discussed Tell, Don’t ask guideline. In this article, we will discuss another guideline…

  • Tell, Don't ask!

    Tell, Don't ask!

    Few days back I was having a discussion with one of my friends and we were discussing how to reduce coupling between…

  • Concurrency Pattern: Active object pattern

    Concurrency Pattern: Active object pattern

    This pattern is a type of concurrency design pattern and widely used in applications. Also, it is used to create…

  • JavaScript: DOM Manipulation

    JavaScript: DOM Manipulation

    In this article, I will explain how to do DOM manipulation using pure JavaScript. We have many other libraries…

  • The JavaScript EventLoop

    The JavaScript EventLoop

    This post will explain one of the most important concepts of JavaScript i.e.

  • How to use an index in MongoDB?

    How to use an index in MongoDB?

    In this post, I will explain how to use an index in MongoDB. In an earlier article, I explained how indexes work.

    3 条评论

社区洞察

其他会员也浏览了