登录查看更多内容

How to troubleshoot and speed up Postgres stop and restart attempts

Nikolay Samokhvalov

?? Let's make your Postgres healthy and awesome: [email protected] // I stand with Ukraine ???? // Postgres.AI founder; PostgreSQL contributor; Postgres.FM co-host // Freediver

发布日期: 2023年9月28日

// This is #PostgresMarathon day 2 (originally posted on Twitter on 2023-09-27). Check out the previous post and the announcement as well.

Let's discuss #PostgreSQL shutdown and restarts. It's not uncommon to see quite long and even failed shutdown/restart attempts. If it's happening inside an incident troubleshooting, it often can provoke emotions and additional mistakes.

Some very popular reasons affecting the duration of the shutdown attempt:

There are long-running transactions.
A lot of buffers are dirty (changes are applied in memory but not yet synced to disk, waiting for another checkpoint), causing long shutdown checkpoint.
WAL archiving (archive_command) is lagging.

Below, we discuss each one of these reasons and how to mitigate them.

Reason 1: Long-running transactions

If you suspect the first reason, check if there are long-running transactions:

select
  clock_timestamp(),
  clock_timestamp() - xact_start,
  *
from pg_stat_activity
where clock_timestamp() - xact_start > interval '10 second'
order by clock_timestamp() - xact_start desc
limit 10;

In this case, it is usually recommended to use SIGINT (pg_ctl stop -m fast) – so-called, "fast shutdown" mode (Postgres docs).

Reason 2: Long shutdown checkpoint

The second reason – a lot of dirty buffers in the buffer pool – is less trivial but fortunately, is easy to mitigate. This is especially common when:

the buffer pool size (shared_buffers) is large
checkpoint tuning was performed in favor of fewer overheads of bulk random writes and fewer full-page writes (usually meaning that max_wal_size and checkpoint_timeout are increased)
the latest checkpoint happened quite long ago (can be seen in PG logs in log_checkpoint = on, which is recommended to have in most cases).

The amount of dirty buffers is quite easy to observe, using extension pg_buffercache (standard contrib modele) and this query (may take significant time; see the docs):

select count(*), pg_size_pretty(count(*) * 8 * 1024)
from pg_buffercache
where isdirty;

If the value is large (say, several GiB), at shutdown attempt, Postgres is going to perform so-called "shutdown checkpoint", flushing dirty buffers to disk (source code). During this, it won't be processing queries, which affects downtime. Mitigation is simple – an explicit CHECKPOINT right before shutdown/restart attempt:

checkpoint;

This is going to help us keep shutdown checkpoint very light, decreasing downtime.

In some cases, it may make sense to issue 2 explicit CHECKPOINTs in a row, right before the shutdown attempt: if the first CHECKPOINT is heavy, it also takes time, during which new dirty buffers are accumulated due to ongoing writes – and this we mitigate with the second CHECKPOINT, keeping the shutdown checkpoint very light and fast.

To simulate situation described here:

make sure shared_buffers is large (many GiB; changing it requires a restart)
increase max_wal_size and checkpoint_timeout (change doesn't require restart): say, '10GB' and '60min' (make sure there is enough disk space in the pg_wal subdirectory)
on a large table t1, perform: set statement_timeout = '60s'; begin; delete from t1; (it will be canceled, but lots of dirty buffers will be produced)
install pg_buffercache and check the number of dirty buffers as mentioned above
keep monitoring Postgres logs for checkpoint records (with log_checkpoint=on)

Then measure time of pg_ctl stop -m fast -- it is going to take significant time. Repeat the same steps with explicit CHECKPOINT right before the shutdown attempt.

This recommendation (explicit CHECKPOINT) is very important to follow in any automation involving shutdown or restart, where predictable time is needed. (For example, even if we don't aim to have minimal downtime when restarting Postgres, we still don't want it to take many minutes, potentially reaching some time limits in various places of our automation tooling.)

Reason 3: Failing/lagging archive_command

This is an unfortunate situation: pg_ctl stop -m fast disconnects ongoing session but it gives the WAL archiver a chance to archive pending WALs, which affects the shutdown attempt duration (the logic is complex and can be found in these files: postmaster.c, pgarch.c).

WAL archiving lag has to be considered a serious event because it affects the health of backups and RPO/RTO of DR. It has to be monitored and in case of significant lag (say, dozens of WALs are unarchived) it has to be considered an incident. Such monitoring should involve pg_stat_archiver and, ideally, information about current LSN to calculate the "lag" measured in bytes on WAL files. At least, last_failed_wal and last_failed_time should be covered by monitoring/alerting.

Interestingly, slow/failing archive_command can cause longer downtime during switchover/failover too – for example, this was a problem in Patroni versions older than 2.1.2, where behavior was fixed in favor of faster failover ("Release the leader lock when pg_controldata reports 'shut down'").

Summary:

use "fast mode" (pg_ctl stop -m fast) if you don't want existing sessions to end their work normally
always perform an explicit CHECKPOINT before shutdown or restart attempt
monitor pg_stat_archiver and ensure that pg_stat_archiver works without failures and lags.

That's it. Note that we didn't cover various timeouts (e.g., pg_ctl's option --timeout and wait behavior -w, -W, see Postgres docs) here and just discussed what can cause delays in shutdown/restart attempts.

Hope it was helpful - as usual, subscribe, like, share, and comment! ??

Youssef Tourki

1 年

Nikolay Samokhvalov Thank you for your amazing article ! very helpful content ! regarding pg_ctl stop, the documentation says it's fast by default , not smart

Márcia Laf?o

DBA - Tribunal Regional do Trabalho da 15a Regi?o

1 年

Genial Cleysson Lima! Obrigada por compartilhar!

Pascal CROZET

DBA multi SGBDR

1 年

Thanks Nicolay. It can help me with my job every day.

??? Ayoub A.

Freelance Software Developer with Python | Django | Fastapi | JavaScript | SQL (Think first, code later)

1 年

Insightful ?

查看更多评论

要查看或添加评论，请登录

Nikolay Samokhvalov的更多文章

How many TPS can we get from a single Postgres node?

2024年6月27日

How many TPS can we get from a single Postgres node?

1. Very close to 4M TPS! A few days ago, the Postgres.

9 条评论
How to perform initial / rough Postgres tuning

2023年12月29日

How to perform initial / rough Postgres tuning

I post a new PostgreSQL "howto" article every day. Join me in this journey – subscribe here or on X, provide feedback…

6 条评论
[Postgres] How to redefine a PK without downtime

2023年12月27日

[Postgres] How to redefine a PK without downtime

I post a new PostgreSQL "howto" article every day. Join me in this journey – subscribe here or on X, provide feedback…

2 条评论
[Postgres] How to speed up bulk load

2023年12月23日

[Postgres] How to speed up bulk load

I post a new PostgreSQL "howto" article every day. Join me in this journey – subscribe here or on X, provide feedback…

2 条评论
[Postgres] How to troubleshoot a growing pg_wal directory

2023年12月23日

[Postgres] How to troubleshoot a growing pg_wal directory

I post a new PostgreSQL "howto" article every day. Join me in this journey – subscribe here or on X, provide feedback…

1 条评论
[Postgres] How to deal with long-running transactions

2023年12月21日

[Postgres] How to deal with long-running transactions

I post a new PostgreSQL "howto" article every day. Join me in this journey – subscribe here or on X, provide feedback…

2 条评论
[Postgres] How to work with arrays, part 2

2023年12月3日

[Postgres] How to work with arrays, part 2

I post a new PostgreSQL "howto" article every day. Join me in this journey – subscribe here or on X, provide feedback…
[Postgres] How to work with arrays, part 1

2023年11月30日

[Postgres] How to work with arrays, part 1

I post a new PostgreSQL "howto" article every day. Join me in this journey – subscribe here or on X, provide feedback…

2 条评论
[Postgres] How to work with metadata

2023年11月21日

[Postgres] How to work with metadata

I post a new PostgreSQL "howto" article every day. Join me in this journey – subscribe here or on X, provide feedback…
How to use OpenAI APIs right from Postgres to implement semantic search and GPT chat

2023年11月20日

How to use OpenAI APIs right from Postgres to implement semantic search and GPT chat

I post a new PostgreSQL "howto" article every day. Join me in this journey – subscribe here or on X, provide feedback…

See all articles

Reason 1: Long-running transactions

Reason 2: Long shutdown checkpoint

Reason 3: Failing/lagging archive_command

Summary:

Nikolay Samokhvalov的更多文章

How many TPS can we get from a single Postgres node?

How to perform initial / rough Postgres tuning

[Postgres] How to redefine a PK without downtime

[Postgres] How to speed up bulk load

[Postgres] How to troubleshoot a growing pg_wal directory

[Postgres] How to deal with long-running transactions

[Postgres] How to work with arrays, part 2

[Postgres] How to work with arrays, part 1

[Postgres] How to work with metadata

How to use OpenAI APIs right from Postgres to implement semantic search and GPT chat