Som's Tech blog

I plan to use this platform to share my knowledge, experiences, and technical expertise with the aim of motivating other software engineers. Whether you’re new to coding or an experienced developer looking for new ideas, I welcome you to join me in this journey of discovery and creativity.

Diving Deep into Mysql Transaction Isolation Levels

Imagine you have 300 INR in your bank account, and you have set up a pre-debit instruction with your bank for 200 INR. At the same time, you decide to make a 200 INR payment to a merchant using a UPI transaction while shopping at a grocery store. Now, the question arises: Will both transactions be successful, or will one of them fail due to insufficient balance?

The latter outcome is bound to occur, and this is solely due to the application of ACID properties by the software’s transactional database.

Having four key properties – atomicity, consistency, isolation, and durability – ensures the timely completion of a database transaction. When a database possesses these properties, it is considered ACID-compliant.

Below we will examine Database Isolation property and we will be diving deep into Mysql transaction isolation levels.

What are database transaction isolation levels? What are they used for?

When dealing with database transactions, a vital consideration is selecting the suitable isolation level for our application. Despite a well-defined standard, different database engines may opt to implement it differently, resulting in varying behaviours for each isolation level.

Today, our focus will delve into a comprehensive exploration of how each isolation level operates in MySQL, accomplished through the execution of specific SQL queries. However, before delving into practical examples, let’s first discuss the theoretical aspects of transaction isolation levels.

Isolation theory

As we are aware, a database transaction must adhere to the ACID property, encompassing Atomicity, Consistency, Isolation, and Durability.

Isolation represents one of the fundamental properties of a database transaction, and when achieved at its highest level, it ensures that concurrent transactions do not impact each other. However, in reality, various ways exist through which a transaction can be influenced by other transactions running concurrently, leading to what we term as read phenomena.

Let’s delve into various observable occurrences that might arise when a database functions with reduced levels of transaction isolation.

Dirty read phenomenon:

The “Dirty read” phenomenon occurs when a transaction reads data written by another concurrent transaction that has not been committed yet. This situation is highly undesirable because we cannot guarantee whether the other transaction will ultimately be committed or rolled back. Consequently, we risk using incorrect data if the transaction is eventually rolled back.

Non-repeatable read phenomenon:

The second phenomenon that could be experienced is referred to as the “non-repeatable read,” which happens when a transaction reads a particular record on two occasions and notices distinct values during each instance. The reason for this inconsistency is that another transaction modified the row and committed its changes after the first read took place.

Phantom read phenomenon:

Phantom read is a phenomenon similar to “non-repeatable read”, but it affects queries searching for multiple rows instead of just one. In this scenario, the same query is re-executed, but a different set of rows is returned due to changes made by other recently committed transactions. These changes could include inserting new rows or deleting existing rows that happen to satisfy the search condition of the current transaction’s query.

Serialization anomaly:

Another phenomenon that occurs when dealing with a group of transactions is the serialization anomaly. This anomaly occurs when the outcome of a series of concurrent transactions that have been successfully executed cannot be replicated by sequentially running them in any order without any overlap.

Don’t worry if you don’t fully grasp these phenomena right now. We are going to replicate each of them in MySQL to better understand their effects and implications.

Diving Deep into Mysql Transaction Isolation Levels

To address these phenomena, the American National Standard Institute (ANSI) defined four standard transaction isolation levels.

Diving Deep into Mysql Transaction Isolation Levels

The lowest isolation level is read uncommitted, where transactions can see data written by other uncommitted transactions, allowing the dirty read phenomenon to occur.

The next level is read committed, which is a bit more stringent. Here, transactions can only access data that has been committed by other transactions, making dirty read impossible.

Taking a step beyond is the repeatable-read isolation level, which ensures that the identical select query will unfailingly yield the same outcome, no matter how many instances it is executed, even if other concurrent transactions have committed new changes that meet the query’s criteria.

Lastly, the highest isolation level is serializable. In this level, concurrent transactions are ensured to yield the same result as if they were executed sequentially in a specific order without overlapping. Essentially, it means there exists at least one way to order these concurrent transactions so that when executed one by one, the final result remains consistent.

Okay, it’s time to establish the relationship between isolation levels and the observed read phenomena. I am running MySQL in my Docker container and opening a MySQL session in one terminal window. If I execute the query “select @@transaction_isolation;” the default transaction isolation level shows as REPEATABLE READ.

I’ll change it to READ UNCOMMITTED by running the query “set session transaction isolation level read uncommitted;”

Kindly be aware that this modification will exclusively impact all forthcoming transactions within the ongoing session, while transactions in a separate MySQL console session will remain unaffected. So, I’ll open another session and perform the same activity, setting its transaction isolation level to read uncommitted.

Now both of my sessions have a transaction isolation level of read uncommitted. In the first session, I execute “select * from savings_account;” and it displays the following result:

Next, I update Amit’s balance to 4800 by executing the query “update savings_account set balance = balance-100 where id = 1;” So, in the first session, Amit’s balance is 4800, and please note that transaction 1 in the first session is not committed yet.

In the second session, if I run “select * from savings_account where id = 1;” what should it display, 5000 or 4800? Well, in this case, it will display 4800, as we are using the transaction isolation level READ UNCOMMITTED, which allows us to read data that is not yet committed (dirty read).

However, if we set the isolation level of the second session as READ COMMITTED, then the dirty read can be avoided, and the result of the second session will be 5000, since transaction 1 was not committed. Therefore, the read-committed isolation level safeguards against the occurrence of the dirty read phenomenon.

How about non-repeatable and phantom read?

In transaction 2, let’s initiate another transaction and execute a select query on the ‘savings_accounts’ table where the balance is greater than or equal to 4500 INR. The result obtained is as follows:

Moving on to session 1, we will start another transaction and update Amit’s balance by deducting 500 INR from it, followed by committing the transaction. As a result, Amit’s balance in session 1 becomes 4300 INR. Now, if we run the same select query in session 2, the result will differ from before.

This difference in results is due to the effect of the committed transaction, known as the phantom-read phenomenon. Hence, it becomes evident that the read-committed isolation level effectively averts dirty reads while still permitting non-repeatable read and phantom-read phenomena.

Now, I will set this session’s transaction isolation level to repeatable read by rolling back the previous update. Then, I will start new transactions in both session 1 and session 2. Moving to session 2, we will check if it can read the new changes made by transaction 1. We will run another select query on ‘savings_accounts’ where the balance is greater than or equal to 4500 INR. The below result will be produced since I rolled back the previous update from the table.

Now, within the transaction of session 1, let’s update Amit’s balance again by deducting 500 INR from it. Amit’s effective balance now stands at 4300 INR. We will commit the transaction in session 1. In session 2, within the existing transaction 2, we will run the same select query ‘select * from savings_account where balance > 4500.’ The result obtained will be as follows:

The select query returns the old version of Amit’s savings account, with a balance of 4800 INR, even though transaction 1 had changed it to 4300 INR and successfully committed the change. This behavior is due to the repeatable-read isolation level, which ensures that all read queries return the same result, regardless of changes made by other committed transactions. Hence, the phantom-read phenomenon is also prevented in this isolation level.

However, I am curious about what will happen if we execute the update query (balance – 500) to change Amit’s balance in session 2’s transaction 2. Will the balance become 4300, 3800, or will it throw an error? Let’s try it!

There is no error. Now, let’s run a select query to get the updated savings account details.

Now, Amit’s balance is 3800 INR, which is the correct value because transaction 1 had already committed the change that modified the balance to 4300 INR. However, from the perspective of transaction 2, it seems inconsistent. The last select query showed a balance of 4800 INR for Amit, but after subtracting 500 INR from the account, it now shows 3800 INR. This inconsistency arises due to concurrent updates from other transactions interfering with this transaction.

In my opinion, it would make more sense for MySQL to refuse the change by raising an error in such cases to ensure transaction data consistency. Perhaps, Postgres handles this type of concurrent updates differently in this isolation level which we will again examine later in someday.

For now, let’s roll back this transaction and proceed to the highest isolation level (serializable) to see if it can prevent this issue. Let’s start two sessions and two transactions with the serializable isolation level. In transaction 1, we will select all accounts, while in transaction 2, we will select only account 1. Next, we’ll go back to transaction 1 and subtract an additional 500 INR from Amit’s balance. Interestingly, the update query is blocked this time.

The reason for this blockage is that in the serializable isolation level, MySQL implicitly converts all plain SELECT queries to SELECT FOR SHARE. A transaction that holds SELECT FOR SHARE only allows other transactions to READ the rows but not UPDATE or DELETE them. This locking mechanism prevents the inconsistent data scenario we observed before. However, this lock has a timeout duration. If the second transaction does not commit or rollback to release the lock within that duration, a ‘lock wait timeout exceeded’ error will occur. Therefore, in such cases, we must implement a transaction retry strategy when using the serializable isolation level.

Let’s summarise the transaction isolation levels in MySQL. At the lowest level, “read uncommitted,” all four phenomena can occur. Moving up to “read committed,” it only prevents dirty reads, leaving the other three phenomena still possible. The “repeatable read” level in MySQL addresses the first three phenomena: dirty read, non-repeatable read, and phantom read, but it may still encounter serialization anomalies and inconsistent concurrent updates At its pinnacle, the “serializable” level stands as the most stringent, effectively inhibiting all four phenomena due to its robust locking mechanism.

When using a high isolation level, it is crucial to be aware of the potential for errors, timeouts, or even deadlocks. Therefore, it is essential to implement a careful retry mechanism for transactions. Additionally, keep in mind that different database engines might implement transaction isolation levels differently. Always refer to the official documentation for MySQL and test it thoroughly before implementing your code.

I will include the links to the official documentation about transaction isolation levels in MySQL for your reference. I hope this information has been helpful to you. Thank you for reading, and I look forward to sharing more insights in future write-ups!

Official documentation of MySql transaction Isolation level:  https://dev.mysql.com/doc/refman/8.0/en/innodb-transaction-isolation-levels.html

Leave a Comment