POSTGRES Planner not using Index Scan even when it'll be more efficient???
Michael Amadi
Snr Database Reliability Engineer | Platforms Engineering | ClickHouse Expert | IAC & Automation | CI-CD | Distributed Systems | Lean Six Sigma Green Belt | Tech. Finance
Couple of days back, a friend reached out and needed my input on a particularly challenging issue on their Postgres production database. It was a peculiar one though. It happened that a query that is meant to generate a regulatory report was taking longer than usual to run and would still fail with the popular Postgress error: . On Further checks, the query indeed had covering indexes; also analyzing the query showed that an Index Scan would be more effective, however, the planner kept using a sequential scan. limiting the output size also had no effect. runing an Explain Analyze showed an exponential increase in resources and cost during a particular operation: the date(date_created) between 'yyyy-mm-dd' and 'yyyy-mm-dd' where projection in the query. the date_created column had an index, so what was the problem??
It's a well known fact that Postgres boasts of a handful of several interesting and effective types of Indexes: B-Tree, Hash, Generalized Search Tree (GiST), Generalized Inverted Index(GIN), Partial Indexes, Block Range Indexes (BRIN), etc. But choosing the right kind of indexes is an even more critical factor.
the date_created column has a column type of datetime2, and our query was trying to using the date() function, whereas, we should have created an expression index on the date_created column using the statement CREATE INDEX <index_name> ON tbl_name(date(date_created)) . So a possible fix would've been to create the Expression Index on the column, or alternatively rewrite the query to use the original column format, On rewriting the query stating the condition as where date_created between 'yyyy-mm-dd xx.xx.xx.xxx' and 'yyyy-mm-dd xx.xx.xx.xxx' , maintaining it's original structure, solved the issue, as the Postgres Planner could effectively use the previously created index, reducing run time to a whooping 2.44 secs. yep! you saw that right! just wow!!!
So next time, we're designing our Reporting tables or maintaining it, be sure to keep this in mind, relative to the reporting queries we intend to run. Cheers.
Database Administrator
3 年Nice write-up and troubleshooting MICHAEL AMADI , issues like these are common across different database platforms, where a query is written without a review of the underlying table structure.