DB Isolation Levels in Detail - Part -1
Tushar Goel
Lead Member Of Technical Staff @ Salesforce | Staff Software Engineer, Technical Lead
If two transactions don't touch the same data, they can safely run in parallel because they are not dependent on each other. Concurrency issue only comes when either one transaction is modifying the data while another transaction is reading it or both the transaction updating the same data simultaneously.
But as the number of users grows, applications become heavier (or in-demand) many users trying to access/update the records. To support such demand weak isolation levels come into the picture. Understanding this is very important as it provides ‘I’ in ACID. And in this article, we are going to understand them in detail. There are 4 types of isolations::
In this post, I am trying to explain each one of them in detail. This post is part of multiple series.
I am using MySQL for demonstration purposes. Let's first set auto-commit to 0, and check default isolation level. Repeatable Read isolation level is the default isolation level in MySQL. Let's try to understand each one of them one by one.
mysql> select @@transaction_isolation
+-------------------------+
| @@transaction_isolation |
+-------------------------+
| REPEATABLE-READ ? ? ? ? |
+-------------------------+
1 row in set (0.00 sec)
mysql> select @@autocommit;
+--------------+
| @@autocommit |
+--------------+
|? ? ? ? ? ? 1 |
+--------------+
1 row in set (0.00 sec)
mysql> set autocommit=0;
Query OK, 0 rows affected (0.00 sec)
mysql> select @@autocommit;
+--------------+
| @@autocommit |
+--------------+
|? ? ? ? ? ? 0 |
+--------------+
1 row in set (0.00 sec)
READ UNCOMMITTED
This type of isolation doesn’t provide any isolation when 2 transactions(T1 and T2) run in parallel. It allows dirty read and dirty write.
Dirty read:???
This occurs when one transaction reads the data that is modified by another transaction but not yet committed. So consider an example. Let’s change the isolation level for this test:
mysql> set session transaction isolation level read uncommitted
Query OK, 0 rows affected (0.00 sec)
mysql> select @@transaction_isolation;
+-------------------------+
| @@transaction_isolation |
+-------------------------+
| READ-UNCOMMITTED? ? ? ? |
+-------------------------+
1 row in set (0.00 sec)
Current values in DB:
mysql> select * from account
+----------+--------+-------+-----------------+------------+--------+---------+
| acnumber | custid | bid ? | opening_balance | aod? ? ? ? | atype? | astatus |
+----------+--------+-------+-----------------+------------+--------+---------+
| A0002? ? | C0002? | B0002 | ? ? ? ? ? ? 500 | 2012-06-12 | Saving | Active? |
| A0004? ? | C0004? | B0004 |? ? ? ? ? ? 2000 | 2020-09-11 | Saving | Active? |
+----------+--------+-------+-----------------+------------+--------+---------+
3 rows in set (0.00 sec)
Now, let’s consider T1 and T2 are different transactions running on different terminals. Just to confirm if we are using a different connection or not:
T1
mysql> SELECT CONNECTION_ID();
+-----------------+
| CONNECTION_ID() |
+-----------------+
| ? ? ? ? ? ? ? 9 |
+-----------------+
1 row in set (0.00 sec)
mysql> select @@transaction_isolation;
+-------------------------+
| @@transaction_isolation |
+-------------------------+
| READ-UNCOMMITTED? ? ? ? |
+-------------------------+
1 row in set (0.00 sec)
Start transaction
mysql> begin
Query OK, 0 rows affected (0.00 sec)
mysql> select * from account;
+----------+--------+-------+-----------------+------------+--------+---------+
| acnumber | custid | bid ? | opening_balance | aod? ? ? ? | atype? | astatus |
+----------+--------+-------+-----------------+------------+--------+---------+
| A0002? ? | C0002? | B0002 | ? ? ? ? ? ? 500 | 2012-06-12 | Saving | Active? |
| A0004? ? | C0004? | B0004 |? ? ? ? ? ? 2000 | 2020-09-11 | Saving | Active? |
+----------+--------+-------+-----------------+------------+--------+---------+
2 rows in set (0.00 sec)
T2:
mysql> SELECT CONNECTION_ID();
+-----------------+
| CONNECTION_ID() |
+-----------------+
| ? ? ? ? ? ? ? 9 |
+-----------------+
1 row in set (0.00 sec)
mysql> set session transaction isolation level read uncommitted;
Query OK, 0 rows affected (0.00 sec)
mysql> select @@transaction_isolation;
+-------------------------+
| @@transaction_isolation |
+-------------------------+
| READ-UNCOMMITTED? ? ? ? |
+-------------------------+
1 row in set (0.00 sec)
Start Transaction
mysql> begin;
Query OK, 0 rows affected (0.00 sec)
mysql> select * from account;
+----------+--------+-------+-----------------+------------+--------+---------+
| acnumber | custid | bid ? | opening_balance | aod? ? ? ? | atype? | astatus |
+----------+--------+-------+-----------------+------------+--------+---------+
| A0002? ? | C0002? | B0002 | ? ? ? ? ? ? 500 | 2012-06-12 | Saving | Active? |
| A0004? ? | C0004? | B0004 |? ? ? ? ? ? 2000 | 2020-09-11 | Saving | Active? |
+----------+--------+-------+-----------------+------------+--------+---------+
2 rows in set (0.00 sec)
Till now we have the same value in both the terminals. Let's modify some data in one of the terminals:
T2:
mysql> update? account set opening_balance = opening_balance - 100 where acnumber = 'A0004'
Query OK, 1 row affected (0.00 sec)
Rows matched: 1? Changed: 1? Warnings: 0
mysql> select * from account;
+----------+--------+-------+-----------------+------------+--------+---------+
| acnumber | custid | bid ? | opening_balance | aod? ? ? ? | atype? | astatus |
+----------+--------+-------+-----------------+------------+--------+---------+
| A0002? ? | C0002? | B0002 | ? ? ? ? ? ? 500 | 2012-06-12 | Saving | Active? |
| A0004? ? | C0004? | B0004 |? ? ? ? ? ? 1900 | 2020-09-11 | Saving | Active? |
+----------+--------+-------+-----------------+------------+--------+---------+
2 rows in set (0.00 sec)
Let's check in the T1 terminal now
mysql> select * from account
+----------+--------+-------+-----------------+------------+--------+---------+
| acnumber | custid | bid ? | opening_balance | aod? ? ? ? | atype? | astatus |
+----------+--------+-------+-----------------+------------+--------+---------+
| A0002? ? | C0002? | B0002 | ? ? ? ? ? ? 500 | 2012-06-12 | Saving | Active? |
| A0004? ? | C0004? | B0004 |? ? ? ? ? ? 1900 | 2020-09-11 | Saving | Active? |
+----------+--------+-------+-----------------+------------+--------+---------+
2 rows in set (0.01 sec)
So, you see T1 reads the uncommitted data from T2. This is called dirty reads. Now, what happens when for some reason T1 rollback the changes but T2 already reads it and performed some action on it. This will create data issues.
Let’s understand what dirty write is?
When multiple transactions are working on the same row, it is expected is the last transaction should overwrite anything written before it. But what happens when T1 updated data but have not committed yet but another transaction T2 that started after that has updated the row and committed the transaction. Now, when T1 is committing it, it will overwrite T2 data which is wrong. This is called dirty writing.
READ COMMITTED
We have seen what issue we can face if we use READ UNCOMMITTED. To overcome these challenges let’s understand what guarantees Read committed provides:
This type of isolation is very popular and is a default setting for Oracle 11g, PostgreSQL etc… Let’s try with the same queries we ran in previous isolation, and see what happens in this case:
领英推荐
T1
mysql> set session transaction isolation level read committed
Query OK, 0 rows affected (0.00 sec)
mysql> select @@transaction_isolation;
+-------------------------+
| @@transaction_isolation |
+-------------------------+
| READ-COMMITTED? ? ? ? ? |
+-------------------------+
1 row in set (0.00 sec)
mysql> begin;
mysql> select * from account;
+----------+--------+-------+-----------------+------------+--------+---------+
| acnumber | custid | bid ? | opening_balance | aod? ? ? ? | atype? | astatus |
+----------+--------+-------+-----------------+------------+--------+---------+
| A0002? ? | C0002? | B0002 | ? ? ? ? ? ? 500 | 2012-06-12 | Saving | Active? |
| A0004? ? | C0004? | B0004 |? ? ? ? ? ? 2000 | 2020-09-11 | Saving | Active? |
+----------+--------+-------+-----------------+------------+--------+---------+
2 rows in set (0.00 sec)
T2
mysql>? set session transaction isolation level read committed
Query OK, 0 rows affected (0.00 sec)
mysql> begin;
mysql> select @@transaction_isolation;
+-------------------------+
| @@transaction_isolation |
+-------------------------+
| READ-COMMITTED? ? ? ? ? |
+-------------------------+
1 row in set (0.00 sec)
mysql> select * from account;
+----------+--------+-------+-----------------+------------+--------+---------+
| acnumber | custid | bid ? | opening_balance | aod? ? ? ? | atype? | astatus |
+----------+--------+-------+-----------------+------------+--------+---------+
| A0002? ? | C0002? | B0002 | ? ? ? ? ? ? 500 | 2012-06-12 | Saving | Active? |
| A0004? ? | C0004? | B0004 |? ? ? ? ? ? 2000 | 2020-09-11 | Saving | Active? |
+----------+--------+-------+-----------------+------------+--------+---------+
2 rows in set (0.00 sec)
Now let’s see how Read committed isolation saves us from Dirty Read.
T1
mysql> update account set opening_balance = opening_balance - 100 where acnumber = 'A0004'
Query OK, 1 row affected (0.00 sec)
Rows matched: 1? Changed: 1? Warnings: 0
mysql> select * from account;
+----------+--------+-------+-----------------+------------+--------+---------+
| acnumber | custid | bid ? | opening_balance | aod? ? ? ? | atype? | astatus |
+----------+--------+-------+-----------------+------------+--------+---------+
| A0002? ? | C0002? | B0002 | ? ? ? ? ? ? 500 | 2012-06-12 | Saving | Active? |
| A0004? ? | C0004? | B0004 |? ? ? ? ? ? 1900 | 2020-09-11 | Saving | Active? |
+----------+--------+-------+-----------------+------------+--------+---------+
2 rows in set (0.00 sec)
T2
mysql> select * from account
+----------+--------+-------+-----------------+------------+--------+---------+
| acnumber | custid | bid ? | opening_balance | aod? ? ? ? | atype? | astatus |
+----------+--------+-------+-----------------+------------+--------+---------+
| A0002? ? | C0002? | B0002 | ? ? ? ? ? ? 500 | 2012-06-12 | Saving | Active? |
| A0004? ? | C0004? | B0004 |? ? ? ? ? ? 2000 | 2020-09-11 | Saving | Active? |
+----------+--------+-------+-----------------+------------+--------+---------+
2 rows in set (0.00 sec)
Do you see what has happened? T1 made some changes but has not committed a transaction, meantime when T2 is reading it, it only reads the committed transaction. It is not reading the value that T1 updated because that is not committed. So this way we are saved from Dirty Reads.
Let’s understand how this isolation saves us from Dirty Writes. Let’s say T2 is also trying to update the same row that T1 just updated.
T2
mysql>? update account set opening_balance = opening_balance - 100 where acnumber = 'A0004'
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
So, we have just seen, T2 transaction is timeout because MySQL took a lock on that row. Hence prevented it from updating it. Now no other transaction can update the same row without a T1 commit or rollback. Now, just to validate our point, let’s try to update some other rows through T2:
mysql>? update account set opening_balance = opening_balance - 100 where acnumber = 'A0002'
Query OK, 1 row affected (0.01 sec)
Rows matched: 1? Changed: 1? Warnings: 0
mysql> select * from account;
+----------+--------+-------+-----------------+------------+--------+---------+
| acnumber | custid | bid ? | opening_balance | aod? ? ? ? | atype? | astatus |
+----------+--------+-------+-----------------+------------+--------+---------+
| A0002? ? | C0002? | B0002 | ? ? ? ? ? ? 400 | 2012-06-12 | Saving | Active? |
| A0004? ? | C0004? | B0004 |? ? ? ? ? ? 2000 | 2020-09-11 | Saving | Active? |
+----------+--------+-------+-----------------+------------+--------+---------+
2 rows in set (0.01 sec)
You see another row is updated which confirms our point. We can also check through Engine logs to confirm the transaction lock.
mysql>? SHOW ENGINE INNODB STATUS
------------
TRANSACTIONS
------------
Trx id counter 2356
Purge done for trx's n:o < 2354 undo n:o < 0 state: running but idle
History list length 0
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 281480560380264, not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION 281480560379472, not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION 2355, ACTIVE 689 sec
2 lock struct(s), heap size 1128, 4 row lock(s), undo log entries 1
MySQL thread id 15, OS thread handle 6193934336, query id 204 localhost root starting
SHOW ENGINE INNODB STATUS
---TRANSACTION 2354, ACTIVE 706 sec
2 lock struct(s), heap size 1128, 1 row lock(s), undo log entries 1
MySQL thread id 14, OS thread handle 6192820224, query id 199 localhost root
So above logs confirmed that the lock has been taken.?So just for our curiosity, how does Read Committed isolation implemented this?
For this DB do 2 things:
So far so good but does it have some issues as well? Why not every DB just uses it and why do we have other isolation levels as well. Well, you guess it right, although it is very popular, it also has some issues. That issue is Non Repeatable read. This anomaly occurs when the result of the select statement is not the same. I.e. We run select queries multiple times and each time we get different rows? (or different values). Generally occurs when we use an update query. Consider the following example:
T1
mysql> begin
Query OK, 0 rows affected (0.00 sec)
?
mysql> select * from account;
+----------+--------+-------+-----------------+------------+--------+---------+
| acnumber | custid | bid ? | opening_balance | aod? ? ? ? | atype? | astatus |
+----------+--------+-------+-----------------+------------+--------+---------+
| A0002? ? | C0002? | B0002 | ? ? ? ? ? ? 500 | 2012-06-12 | Saving | Active? |
| A0004? ? | C0004? | B0004 |? ? ? ? ? ? 2000 | 2020-09-11 | Saving | Active? |
+----------+--------+-------+-----------------+------------+--------+---------+
2 rows in set (0.00 sec)
T2
mysql> begin
Query OK, 0 rows affected (0.00 sec)
mysql> select * from account where opening_balance >= 500;
+----------+--------+-------+-----------------+------------+--------+---------+
| acnumber | custid | bid ? | opening_balance | aod? ? ? ? | atype? | astatus |
+----------+--------+-------+-----------------+------------+--------+---------+
| A0002? ? | C0002? | B0002 | ? ? ? ? ? ? 500 | 2012-06-12 | Saving | Active? |
| A0004? ? | C0004? | B0004 |? ? ? ? ? ? 2000 | 2020-09-11 | Saving | Active? |
+----------+--------+-------+-----------------+------------+--------+---------+
2 rows in set (0.00 sec)
mysql> select * from account where opening_balance >= 500;
+----------+--------+-------+-----------------+------------+--------+---------+
| acnumber | custid | bid ? | opening_balance | aod? ? ? ? | atype? | astatus |
+----------+--------+-------+-----------------+------------+--------+---------+
| A0002? ? | C0002? | B0002 | ? ? ? ? ? ? 500 | 2012-06-12 | Saving | Active? |
| A0004? ? | C0004? | B0004 |? ? ? ? ? ? 2000 | 2020-09-11 | Saving | Active? |
+----------+--------+-------+-----------------+------------+--------+---------+
2 rows in set (0.00 sec)
Here we are trying to select rows that have an opening balance greater than or equal to 500. So in T2, we see 2 rows. Now, let’s commit the T1 and see what happens:
T1
mysql> commit
Query OK, 0 rows affected (0.00 sec)
T2
mysql> select * from account where opening_balance >= 500
+----------+--------+-------+-----------------+------------+--------+---------+
| acnumber | custid | bid ? | opening_balance | aod? ? ? ? | atype? | astatus |
+----------+--------+-------+-----------------+------------+--------+---------+
| A0004? ? | C0004? | B0004 |? ? ? ? ? ? 2000 | 2020-09-11 | Saving | Active? |
+----------+--------+-------+-----------------+------------+--------+---------+
1 row in set (0.01 sec)
So you see we get only 1 row as a result. We were expecting 2 rows but now we are getting only 1 row. So, the previous run selected the query, and this time we get different results. This is called Non-Repeatable Read problem. But what problem it may cause, can’t we just use them? Well, we have some cases where this may cause problems:
So far we have discussed 2 isolation levels and let's discuss more how to solve issues we face with Read committed isolation and other types of isolation in another post. Let me know your thoughts as well.
Reference:
AWS Cloud, Java, nodejs
2 年Informative .. thanks