[Postgres] How to redefine a PK without downtime
Nikolay Samokhvalov
?? Let's make your Postgres healthy now. Reach out to me: [email protected] // I stand with Ukraine ????
I post a new PostgreSQL "howto" article every day. Join me in this journey – subscribe here or on X , provide feedback, share!
Redefining a PK (primary key) is not a difficult procedure, yet it requires a few non-trivial steps to be done. This procedure also is a part of the int4->int8 PK conversion when using a "new column" method (to be discussed separately).
Of course, we can just drop a PK and define a new one like this:
alter table your_table
drop constraint your_table_pkey;
alter table your_table
add primary key (new_column);
But this straightforward approach is, in general, a terrible idea, because it will acquire an AccessExclusiveLock on the table and hold it for a long time because:
This is because to build a PK, we need two ingredients: a UNIQUE constraint, and NOT NULL on all columns participating in the PK definition. Fortunately, in modern Postgres (PG12+), it is possible to avoid long-lasting exclusive locks – in other words, to have truly "online" or "zero-downtime" operation.
Below we assume that:
Note, that the last condition is essential – unlike UKs (unique key), a PK requires all columns participating in its definition to have a NOT NULL constraint.
NOT NULL: many good and bad news (eventually, all good)
Let's dive into details here – NOT NULL deserves it. We'll have a bunch of good and bad news. We'll dive into specifics that are not necessarily related to PK, but they are still relevant. And eventually we'll return to the PK redefinition task. Just bear with me.
Bad news: unfortunately, adding a NOT NULL constraint to an existing column means that Postgres will need to perform a long (for large tables) full-table scan, during which it will an AccessExclusiveLock acquired by ALTER TABLE is going to be held. This is not what we want if we need zero-downtime operations.
Good news: since Postgres 11, we can execute a trick, if we need to add a column with NOT NULL – we can benefit from PG11's new feature, non-blocking DEFAULT for new columns, and we combine it with NOT NULL, for example:
alter table t1
add column new_id int8 not null default -1;
This is very fast, thanks to PG11's optimization of DEFAULT for new columns (it's "virtual" – no whole-table rewrite happens):
ability to avoid a table rewrite for ALTER TABLE ... ADD COLUMN with a non-null column default (PG11 release notes )
And since all rows are pre-filled ("virtually", but it doesn't matter), we can have NOT NULL right away, avoiding long wait.
Bad news: this works only for new columns. If we deal with an existing column, and still want to add a NOT NULL to it, this won't work.
Good news: if we just need a "not null", not matter how defined, we can use a CHECK constraint. The good thing about CHECK constraints is that their definition can be two-phase:
Bad news: since our final goal is a PK redefinition, the CHECK constraint won't work for us because PK requires a truly NOT NULL one.
Good news: in PG12+, there is an optimization that allows NOT NULL constraint definition to rely on an existing CHECK (... IS NOT NULL) constraint:
Allow ALTER TABLE ... SET NOT NULL to avoid unnecessary table scans (PG12 release notes )
This means that we just need to do this:
领英推荐
Interestingly, it is okay to skip step 3 here if our final goal is a PK creation – the NOT NULL constraint will be created implicitly, during PK creation; and it will be fast thanks to already existing CHECK (... NOT NULL).
UNIQUE
The second ingredient we need for a new PK creation is a UNIQUE constraint. Fortunately, it can be created in two phases, avoiding long-lasting exclusive locks:
create unique index concurrently new_unique_index
on your_table using btree(your_column);
The whole recipe
Now, let's complete the puzzle and see the whole picture.
Building a new PK in zero-downtime fashion consists of these five steps:
alter table your_table
add constraint your_table_your_column_check
check (your_column is not null) not valid;
alter table your_table
validate constraint your_table_your_column_check;
create unique index concurrently u_your_table_your_column
on your_table using btree(your_column);
alter table your_table
add constraint your_table_pkey primary key
using index u_your_table_your_column;
alter table your_table
drop constraint your_table_your_column_check;
(?? – recommended to use low lock_timeout and retries)
Marathon progress: ▓??????????????????? 9.04%
This series is also available in Markdown format: https://gitlab.com/postgres-ai/postgresql-consulting/postgres-howtos
Building @ Abnormal Security
4 周Came in super handy, thanks a lot!
Senior Data Architect | Data Enthusiast - Curious, proactive, and detail-oriented
10 个月Really amazing article !