How Transaction Id wraparound in PostgreSQL can cause a database outage
Problem Statement - Transaction Id Wraparound
Transaction ID wraparound can cause a significant issue in PostgreSQL database that can lead to data corruption if not managed properly.
In PostgreSQL, every transaction is assigned a unique transaction ID (XID), which is a 32-bit unsigned integer. This means the XID can take on values from 0 to 4,294,967,295 (2^32 - 1), roughly 4 billion. After all the ID(s) are exhausted, the counter for transactionId wraps around back to 0. This is known as transaction ID wraparound.
How Wraparound can cause an Issue in PostgreSQL
When the transaction ID counter wraps around, older transaction IDs can be mistaken for newer ones due to their numerical value being lower after the wraparound. For example, if the XID counter wraps from 4,294,967,295 back to 0, transactions that were assigned XIDs before the wraparound (e.g., 4,294,967,290) can appear newer than transactions assigned XIDs immediately after the wraparound (e.g., 10). This can lead to serious issues with data visibility and consistency.
Data Corruption Risk
If PostgreSQL reuses transaction IDs without proper handling, old data might appear as new. This can corrupt the database by making very old rows look like they were just inserted.
Prevention
To prevent data corruption, PostgreSQL relies on the VACUUM process to "freeze" old transaction IDs. Freezing means marking old tuples with a special transaction ID that indicates it is very old and should not be considered new even if the transaction ID counter wraps around. Let's discuss this shortly.
What's the problem then ?
The process called VACUUM which keeps in check that these Older Id's are available again ahead of time for the new transactions so that when upper limit of 4.2 billion is reached we still have enough runway once the wraparound happens.
Several activities and conditions in PostgreSQL can delay or ignore the VACUUM process and potentially halt database operations for reads and writes. Understanding these scenarios is crucial for maintaining database performance and ensuring the smooth execution of VACUUM operations.
VACUUM
Vacuum processing is a maintenance process of PostgreSQL. Its two main tasks are?removing dead tuples?and the?freezing transaction ids. Dead or unused tuples should be removed from the pages, this process of defragmentation is required to reclaim disk space in order to provide a contiguous space for the new rows/tuples.
Dead Tuples
In PostgreSQL, dead tuples are rows that have been deleted or updated but have not yet been physically removed from the table. These tuples are created because PostgreSQL uses a Multi version Concurrency Control (MVCC) system to handle concurrent transactions without blocking other transaction reading the same record at the same time, allowing for efficient concurrent reads and writes. So PostgreSQL does not support in-place updates of the rows, instead it creates a new version of the row with the updated data and this is how it supports MVCC.
Operations contributing to dead tuples :
How PostgreSQL handles transactionId wraparound issue
Postgres Tuple layout and the role of transaction Ids
Before we dive into the working of VACUUM and how it handles transaction wraparound. Let's discuss the PostgreSQL's heap tuple(row) structure.
Let's focus on the tuple header for a while, the description of the header fields in given in the table below :
The?xmin?and?xmax?values are hidden from daily operations, but we can still fetch them :
SELECT *, xmin, xmax FROM <table>;
The image below shows how xmin and xmax values are assigned to the tuple while creating and updating a tuple respectively.
When you use SELECT ... FOR UPDATE, PostgreSQL acquires an exclusive lock on the rows that are returned by the query. So that no other transaction can update these rows until the lock is released by the current transaction.
Process for Transaction Id wraparound
As Postgres keep on creating new versions of the tuples enforcing MVCC, it keep on adding more and more rows to the heap table, which generates a lot of dead tuples and other records that are intentionally deleted. This concurrency control mechanism requires some maintenance process in place to defragment the disk space and reutilising the transaction Ids.
This maintenance process is called VACUUM. It has the following responsibilities :
1. Removing dead tuples from the index and heap pages.
2. Update statistics for the transactions and query planner.
3. Freezing Old Transaction Ids
The scope of this document is to cover the 3rd point Freezing Old Transaction Ids, which helps in smooth operation of the transaction Id wraparound and avoiding wraparound failures.
Circular Transaction ID (XID) Space
Example with Current XID = 100
Understanding Older and Newer Ranges
1. Current XID: 100
2. Halfway Point: 2^31 = 2,147,483,648
领英推荐
Step-by-Step Calculation
Given current_xid = 100:
Older Range Calculation
Older range starts from current_xid - 2^31:
Older range ends at current_xid - 1:
Newer Range Calculation
Newer range starts from current_xid + 1:
Newer range ends at current_xid + 2^31:
Now since you have got the idea that for the current XID what is considered past/older transaction Ids and what is considered as new range of transaction Ids as we are moving around the circular space of the 32-bit transaction Id. Let's see how the past transaction Ids in the tuples are marked FROZEN so that they are available for future use again.
- In PostgreSQL a special reserved transaction Id (2) is used as a frozen_txid. When we need to mark the older rows as frozen, the x_min of the record is updated with the frozen_txid (2)
- Marking the record's t_xmin with 2 signifies that this XID is always older than all other txids. Simply put, the frozen txid is always inactive and visible to all the transactions.
Freeze processing
The freeze process is a part of VACUUM. Whenever VACUUM is triggered it scans all table files and goes through all the pages where the records need to he frozen and rewrites the t_xmin of tuples to the frozen id (2), if the existing t_xmin is older than the current txid minus the `vacuum_freeze_min_age` (default is 50 million).
In versions 9.4 or later, the XMIN_FROZEN bit is set to the t_infomask field of tuples rather than rewriting the t_xmin of tuples to the frozen txid.
Freeze processing is done in the following ways:
1. At the time of concurrent VACUUM
2. At the time when certain criteria is met
3. While running manual VACUUM
At the time of concurrent VACUUM
With concurrent VACUUM, pages which has only dead tuples are processed. To make the process run faster, VM (visibility map) of the target table is used. By checking the bit against a page in this mapping, it can identify whether the page has dead tuples or not.
An id called freezeLimit_txid is calculated. Every records having t_xmin less than freezeLimit_txid is freezed.
freezeLimit_txid = (OldestXmin - vacuum_freeze_min_age)
OldestXmin is the oldest transaction among the currently running transactions at the time this freeze processing got triggered.
`vacuum_freeze_min_age` is a configuration parameter and the default value is 50,000,000.
At the time when certain criteria is met
This condition relies on config parameter `vacuum_freeze_table_age`. This parameter sets the age at which the entire table is forcibly vacuumed to ensure tuples are frozen. Default value for vacuum_freeze_table_age is 150 million transactions.
When the age of the oldest non-frozen XID in a table exceeds vacuum_freeze_table_age, PostgreSQL will force a vacuum operation on that table to ensure tuples are frozen. This helps prevent wraparound issues by ensuring old tuples are marked as frozen before they become problematic.
Manual Freezing
You can explicitly run VACUUM FREEZE to freeze all tuples in a table, regardless of their age. When running a manual VACUUM in PostgreSQL, the age of each tuple (or row) is determined by the transaction IDs (XIDs) associated with the tuple and the current transaction. This age is then compared to the vacuum_freeze_min_age parameter to decide if the tuple should be frozen.
VACUUM FREEZE my_table;
Why can't the XID be converted to 64-bit INT to have a higher range
There are multiple challenges associated with this, it needs a bigger change across the database, may be in future we might see a bigger datatype for XID :
How to avoid issues with transaction Id wraparound
Case Studies : Outage because of failure of wraparound
References :
I'm passionate about database internals, constantly learning and compiling my insights to deepen my understanding. I enjoy sharing my knowledge with others to help them make aware of these concepts.
Subscribe my newsletter and Join me on this journey of uncovering the fascinating world of databases !