Making Postgres 30% Faster in Production

Making Postgres 30% Faster in Production

PgCat Prepared Statements

PgCat supports prepared statements in session and transaction mode. Our initial benchmarks show a 30% increase over extended protocol (--protocol extended) and 15% against simple protocol (--simple). Most (all?) web frameworks use at least the extended protocol, so we are looking at a?30% performance increase across the board for everyone?who writes web apps and uses Postgres in production, by just switching to named prepared statements. In Rails apps, it's as simple as setting?prepared_statements: true.

This is not only a performance benefit, but also a usability improvement for client libraries that have to use prepared statements, like the popular Rust crate?SQLx. Until now, the typical recommendation was to just not use a pooler.

Benchmark

The benchmark was conducted using?pgbench?with 1, 10, 100 and 1000 clients sending millions of queries to PgCat, which itself was running on a different EC2 machine alongside the database. This is a simple setup often used in production. Another configuration sees a pooler use its own machine, which of course increases latency but improves on availability. The clients were on another EC2 machine to simulate the latency experienced in typical web apps deployed in Kubernetes, ECS, EC2 and others.

Benchmark ran in transaction mode. Session mode is faster with fewer clients, but does not scale in production with more than a few hundred clients. Only?SELECT?statements (-S?option) were used, since the typical?pgbench?benchmark uses a similar number of writes to reads, which is an atypical production workload. Most apps read 90% of the time, and write 10% of the time. Reads are where prepared statements truly shine.

Implementation

PgCat implements an internal cache & mapping between clients' prepared statements and servers that may or may not have them. If a server has the prepared statement, PgCat just forwards the?Bind (F),?Execute (F)?and?Describe (F)?messages. If the server doesn't have the prepared statement, PgCat fetches it from the client cache & prepares it using the?Parse (F)?message. You can refer to?Postgres docs?for a more detailed explanation of how the extended protocol works.

An important feature of PgCat's implementation is that all prepared statements are renamed and assigned globally unique names. This means that clients that don't randomize their prepared statement names and expect it to be gone after they disconnect from the "Postgres server", work as expected (I put "Postgres server" in quotes because they are actually talking to a proxy that pretends to be a Postgres database). Typical error when using such clients with PgBouncer is?prepared statement "sqlx_s_2" already exists, which is pretty confusing when you see it for the first time.

Metrics

We've added two new metrics to the admin database:?prepare_cache_hit?and?prepare_cache_miss. Prepare cache hits indicate that the prepared statement requested by the client already exists on the server. That's good because PgCat can just rewrite the messages and send them to the server immediately. Prepare cache misses indicate that PgCat had to issue a prepared statement call to the server, which requires additional time and decreases throughput. In the ideal scenario, the cache hits outnumber the cache misses by an order of magnitude. If they are the same or worse, the prepared statements are not being used correctly by the clients.

Our benchmark had a 99.99% cache hit ratio, which is really good, but in production, this number is likely to be lower. You can monitor your cache hit/miss ratios through the admin database by querying it with?SHOW SERVERS.

Roadmap

Our implementation is pretty simple and we are already seeing massive improvements, but we can still do better. A?Parse (F)?made prepared statement works, but if one prepares their statements using?PREPARE?explicitly, PgCat will ignore it and that query isn't likely to work outside of session mode.

Another issue is explicit?DEALLOCATE?and?DISCARD?calls. PgCat doesn't detect them currently, and a client can potentially bust the server-prepared statement cache without PgCat knowing about it. It's an easy enough fix to intercept and act on that query accordingly, but we haven't built that yet.

Testing with?pgbench?is an artificial benchmark, which is good and bad. It's good because other things being equal, we can demonstrate that one implementation & configuration of the database/pooler cluster is superior to another. It's bad because, in the real world, the results can differ. We are looking for users who would be willing to test our implementation against their production traffic and tell us how we did. This feature is optional and can be enabled & disabled dynamically, without restarting PgCat, with?prepared_statements = true?in?pgcat.toml.

要查看或添加评论,请登录

Thomas Said的更多文章

  • Don’t marry the framework!

    Don’t marry the framework!

    In effect, the author is asking you to marry the framework—to make a huge, long-term commitment to that framework. And…

  • Software Architecture: Component Cohesion Principles

    Software Architecture: Component Cohesion Principles

    We will discuss the principles of component cohesion: REP: The Reuse/Release Equivalence Principle CCP: The Common…

  • Becoming better with continuous learning

    Becoming better with continuous learning

    Continually absorb new knowledge and enhance your skill set to make better decisions in your work. Continuous learning…

  • Data Stores: Structured data VS Unstructured data

    Data Stores: Structured data VS Unstructured data

    Structured data stores Structured data stores have been around for decades and are the most familiar technology choice…

  • The Power of Using Generative AI in Automation

    The Power of Using Generative AI in Automation

    Artificial Intelligence (AI) has become a driving force behind automation, revolutionizing the way businesses operate…

  • The Power of Microservices: Principles and Architecture

    The Power of Microservices: Principles and Architecture

    In today's fast-paced and ever-evolving digital landscape, software development has undergone a transformative shift…

    2 条评论
  • Designing Principles for operational excellence

    Designing Principles for operational excellence

    Operational excellence is running your application with minimal interruption to gain maximum business value. It is…

  • How to Plan to Migrate to the cloud?!

    How to Plan to Migrate to the cloud?!

    Creating a cloud migration strategy As we mentioned in the previous section, there could be various reasons for cloud…

  • How to build MVP like an architect?

    How to build MVP like an architect?

    For a successful solution, always put the customer first, while taking care of architectural constraints. Think…

  • Attributes of the Solution Architecture

    Attributes of the Solution Architecture

    The solution architecture needs to consider multiple attributes and design applications. Solution design may have a…

社区洞察

其他会员也浏览了