登录查看更多内容

Making Postgres 30% Faster in Production

Thomas Said

Senior Software Engineer | Python | Backend | DevOps | Generative AI

发布日期: 2023年6月24日

PgCat Prepared Statements

PgCat supports prepared statements in session and transaction mode. Our initial benchmarks show a 30% increase over extended protocol (--protocol extended) and 15% against simple protocol (--simple). Most (all?) web frameworks use at least the extended protocol, so we are looking at a?30% performance increase across the board for everyone?who writes web apps and uses Postgres in production, by just switching to named prepared statements. In Rails apps, it's as simple as setting?prepared_statements: true.

This is not only a performance benefit, but also a usability improvement for client libraries that have to use prepared statements, like the popular Rust crate?SQLx. Until now, the typical recommendation was to just not use a pooler.

Benchmark

The benchmark was conducted using?pgbench?with 1, 10, 100 and 1000 clients sending millions of queries to PgCat, which itself was running on a different EC2 machine alongside the database. This is a simple setup often used in production. Another configuration sees a pooler use its own machine, which of course increases latency but improves on availability. The clients were on another EC2 machine to simulate the latency experienced in typical web apps deployed in Kubernetes, ECS, EC2 and others.

Benchmark ran in transaction mode. Session mode is faster with fewer clients, but does not scale in production with more than a few hundred clients. Only?SELECT?statements (-S?option) were used, since the typical?pgbench?benchmark uses a similar number of writes to reads, which is an atypical production workload. Most apps read 90% of the time, and write 10% of the time. Reads are where prepared statements truly shine.

Implementation

PgCat implements an internal cache & mapping between clients' prepared statements and servers that may or may not have them. If a server has the prepared statement, PgCat just forwards the?Bind (F),?Execute (F)?and?Describe (F)?messages. If the server doesn't have the prepared statement, PgCat fetches it from the client cache & prepares it using the?Parse (F)?message. You can refer to?Postgres docs?for a more detailed explanation of how the extended protocol works.

Learnk8s 1 个月前

Caching is not always in-memory

Shrey Batra 9 个月前

9 Backend Questions Every Big Tech Companies Asks

Eleke Great 6 个月前

An important feature of PgCat's implementation is that all prepared statements are renamed and assigned globally unique names. This means that clients that don't randomize their prepared statement names and expect it to be gone after they disconnect from the "Postgres server", work as expected (I put "Postgres server" in quotes because they are actually talking to a proxy that pretends to be a Postgres database). Typical error when using such clients with PgBouncer is?prepared statement "sqlx_s_2" already exists, which is pretty confusing when you see it for the first time.

Metrics

We've added two new metrics to the admin database:?prepare_cache_hit?and?prepare_cache_miss. Prepare cache hits indicate that the prepared statement requested by the client already exists on the server. That's good because PgCat can just rewrite the messages and send them to the server immediately. Prepare cache misses indicate that PgCat had to issue a prepared statement call to the server, which requires additional time and decreases throughput. In the ideal scenario, the cache hits outnumber the cache misses by an order of magnitude. If they are the same or worse, the prepared statements are not being used correctly by the clients.

Our benchmark had a 99.99% cache hit ratio, which is really good, but in production, this number is likely to be lower. You can monitor your cache hit/miss ratios through the admin database by querying it with?SHOW SERVERS.

Roadmap

Our implementation is pretty simple and we are already seeing massive improvements, but we can still do better. A?Parse (F)?made prepared statement works, but if one prepares their statements using?PREPARE?explicitly, PgCat will ignore it and that query isn't likely to work outside of session mode.

Another issue is explicit?DEALLOCATE?and?DISCARD?calls. PgCat doesn't detect them currently, and a client can potentially bust the server-prepared statement cache without PgCat knowing about it. It's an easy enough fix to intercept and act on that query accordingly, but we haven't built that yet.

Testing with?pgbench?is an artificial benchmark, which is good and bad. It's good because other things being equal, we can demonstrate that one implementation & configuration of the database/pooler cluster is superior to another. It's bad because, in the real world, the results can differ. We are looking for users who would be willing to test our implementation against their production traffic and tell us how we did. This feature is optional and can be enabled & disabled dynamically, without restarting PgCat, with?prepared_statements = true?in?pgcat.toml.

Thoughts

1,218 位关注者

要查看或添加评论，请登录

Thomas Said的更多文章

Don’t marry the framework!

2023年10月15日

Don’t marry the framework!

In effect, the author is asking you to marry the framework—to make a huge, long-term commitment to that framework. And…
Software Architecture: Component Cohesion Principles

2023年8月27日

Software Architecture: Component Cohesion Principles

We will discuss the principles of component cohesion: REP: The Reuse/Release Equivalence Principle CCP: The Common…
Becoming better with continuous learning

2023年8月5日

Becoming better with continuous learning

Continually absorb new knowledge and enhance your skill set to make better decisions in your work. Continuous learning…
Data Stores: Structured data VS Unstructured data

2023年7月13日

Data Stores: Structured data VS Unstructured data

Structured data stores Structured data stores have been around for decades and are the most familiar technology choice…
The Power of Using Generative AI in Automation

2023年7月2日

The Power of Using Generative AI in Automation

Artificial Intelligence (AI) has become a driving force behind automation, revolutionizing the way businesses operate…
The Power of Microservices: Principles and Architecture

2023年7月2日

The Power of Microservices: Principles and Architecture

In today's fast-paced and ever-evolving digital landscape, software development has undergone a transformative shift…

2 条评论
Designing Principles for operational excellence

2023年5月30日

Designing Principles for operational excellence

Operational excellence is running your application with minimal interruption to gain maximum business value. It is…
How to Plan to Migrate to the cloud?!

2023年3月12日

How to Plan to Migrate to the cloud?!

Creating a cloud migration strategy As we mentioned in the previous section, there could be various reasons for cloud…
How to build MVP like an architect?

2023年2月19日

How to build MVP like an architect?

For a successful solution, always put the customer first, while taking care of architectural constraints. Think…
Attributes of the Solution Architecture

2023年2月7日

Attributes of the Solution Architecture

The solution architecture needs to consider multiple attributes and design applications. Solution design may have a…

See all articles

Making Postgres 30% Faster in Production

Thomas Said

Senior Software Engineer | Python | Backend | DevOps | Generative AI

PgCat Prepared Statements

Benchmark

Implementation

领英推荐

Metrics

Roadmap

Thoughts

1,218 位关注者

Thomas Said的更多文章

社区洞察

其他会员也浏览了

Reinventing the Cube

Modern, Cloud Native Application Design

Scaling Up Series: Caching Techniques with Redis

What the fuss is this serverless? I want to get over it in next 5 minutes

Tips for building scalable and performant APIs

AWS Lambda (Points-to-remember)

Azure Weekly Updates - July 10th, 2023

AWS Lambda - Architecture & Components

AWS serverless services

PgCat Prepared Statements

Benchmark

Implementation

领英推荐

Metrics

Roadmap

Thoughts

1,218 位关注者

Thomas Said的更多文章

Don’t marry the framework!

Software Architecture: Component Cohesion Principles

Becoming better with continuous learning

Data Stores: Structured data VS Unstructured data

The Power of Using Generative AI in Automation

The Power of Microservices: Principles and Architecture

Designing Principles for operational excellence

How to Plan to Migrate to the cloud?!

How to build MVP like an architect?

Attributes of the Solution Architecture

社区洞察

其他会员也浏览了

Reinventing the Cube

Modern, Cloud Native Application Design

Scaling Up Series: Caching Techniques with Redis

What the fuss is this serverless? I want to get over it in next 5 minutes

Tips for building scalable and performant APIs

AWS Lambda (Points-to-remember)

Azure Weekly Updates - July 10th, 2023

AWS Lambda - Architecture & Components

AWS serverless services