DEPLOY POSTGRESQL HA ON KUBERNETES
Jyothylakshmi K B
Independent Consultant (IT BusinessDevelopment) Helping companies in marketing their products/services, lead generation, social media campaigns, connecting with potential customers and partners to expand their business.
1. Introduction
This article is based on the latest webinar by?Cloud Control Solutions?on?How to Deploy PostgreSQL HA on Kubernetes, which covers the importance of High Availability (HA) for software systems, the basics of PostgreSQL DBMS, and its deployment to containers and?Kubernetes, AppZ PostgreSQL HA Architecture and its salient features.?
There are multiple tools and technologies available to deploy PostgreSQL in HA mode. But, AppZ PostgreSQL HA Stack is a unique solution developed by?Cloud Control?Solutions, which ensures 99.999 (Five Nines) availability.
2. What is High Availability (HA)?
Availability refers to the percentage of time a system is operational for use without any failures in a given time. A measure of availability is critical as every?enterprise?wants its system and services to be available 24/7 for its customers worldwide.?Downtime?can lead to huge losses for companies engaged in many critical operations. Availability is usually expressed in Nines (9). For example, five-nines availability means that the system is available 99.999% of the time, which is less than 6 minutes of?downtime?in a year.?
Achieving High Availability requires the underlying system to be designed to eliminate any single point of failure. Such HA designs ensure continuous operation or uptime for extended time periods.?HA?Architectures are built using redundant hardware, software, and network combinations, so that even if one component fails, the backup component will resume operations without downtime. In addition,?HA Architectures?ensures that the system is configured to withstand different loads and can recover failures with no or minimal?downtimes.
High availability?is essential for any enterprise that wants to protect its business against the risks caused by system outages and resultant revenue loss.
?Some of the best practices to be followed while designing and implementing?HA?systems are,
3. PostgreSQL – The open source DBMS
PostgreSQL?is one of the most commonly used free and open source object-relational database management systems (DBMS) used in many mission-critical applications by small, medium, and large enterprises. It is one of the most advanced, versatile, and well-documented database systems.?PostgreSQL?is ranked 4th?in the overall DBMS ranking by DB-Engines in 2022.?PostgreSQL?was also ranked as the second most used and loved?DBMS?by developers in a survey conducted by stack overflow.?
PostgreSQL?runs on different OS platforms, and it supports ANSI SQL, stored procedures, triggers, ACID, transactions, extension functions, replication, etc. It is highly reliable, offering robust performance.?PostgreSQL?has been supported by an active development community over the last 30 years.
4. Containers and Kubernetes
Containers offer compelling advantages over virtualization technologies. Containers are lightweight, resource-efficient, faster, portable, highly scalable, and easier to manage. Containers increase developer productivity and time to market.?
While containers are the way to bundle and run applications,?Kubernetes?is for managing these containers that run applications and ensuring that there is no down time. For example, when a container fails,?Kubernetes?detects it and starts another one.?Kubernetes?is one of the most commonly used orchestration platforms for managing containerized workloads and services.?
According to CNCF Annual Report 2021, container adoption and Kubernetes have truly gone mainstream – usage has risen across organizations globally, particularly in large businesses. According to?CNCF’s?respondents, 96% of organizations are either using or evaluating?Kubernetes?– a record high since our surveys began in 2016.
5. PostgreSQL HA Architectures
There are multiple?PostgreSQL?HA architectures available with varying levels of sophistication.?
5.1 Log Shipping Replication?
In this HA Architecture, one primary and one or more standby?PostgreSQL?servers work together in continuous archiving/recovery modes. Log files (WAL – Write Ahead Log files) from the primary are shipped to the secondary and restored there continuously. Log shipping is asynchronous, and there is a possibility of data loss when the primary fails. During the primary node failure, if any log files are left unshipped from primary to secondary, that data may get lost. Log Shipping architecture is simple to implement and less costly, but does not ensure data availability.?
5.2 Streaming Replication
PostgreSQL?has built-in streaming replication support from version 9 onwards. In streaming replication, the secondary node/replica connects to the primary and continuously receives a stream of WAL records. In this case, replicas will be more up-to-date compared with Log Shipping Replication. Replica nodes act as a hot standby and can also be used to serve the users for their read requests. Streaming replication by default is asynchronous, and there are chances for data loss in case of a primary node failure. This is because the primary does not wait for a confirmation from the replica for it to commit a transaction.?
5.3 Synchronous Streaming Replication
PostgreSQL?can be configured for synchronous streaming replication by making some replicas as a synchronous standby. In synchronous streaming architecture, the primary commits a transaction only after the replica acknowledges transaction persistence at its side.?Synchronous streaming replication ensures no data loss at the cost of higher user transaction delay/latency.?
Need More Features to be an HA System!
All the above architectures are still missing some of the basic components and features required to become a real HA environment, like?
领英推荐
Alternate PostgreSQL HA Architectures
There are some other commonly used architectures that implement the above HA features to some extent.
5.4 HA Using Patrioni?
Patroni is an open source template with tools to configure, deploy and operate a Postgres HA architecture. It has a Distributed Configuration System (DCS) and Patroni agents operating on the PostgreSQL nodes in the cluster. Patroni agent syncs up with the DCS system to notify any failures in the primary and then promotes any of the secondaries as primary.
5.5 HA Using pg_auto_failover?
Pg_auto_failover is another open-source PostgreSQL extension that configures the HA architecture by extending PostgreSQL capabilities. Each node runs a PostgreSQL extension called keeper, which manages the PostgreSQL service and a Monitor watching the health status of each of these nodes. The monitor is capable of doing the configurations, promotions, and demotions, in case of failures. It supports failure detection and auto-failover, but no query re-routing capabilities.?
6. AppZ PostgreSQL HA Stack
Based on our discussion so far, you may have found out that setting up of a highly available PostgreSQL DBMS system with Five Nines (99.999) availability is not an easy task. Based on the experience in working with PostgreSQL application deployments, management, and further research,?Cloud Control Solutions?released its latest?AppZ PostgreSQL HA Stack?in Feb 2022, addressing most of the pain points in setting up PostgreSQL DBMS in HA mode.?
AppZ?PostgreSQL?HA can support one primary and one or more secondary nodes as a?Kubernetes?cluster. Primary and secondaries are configured to have synchronous streaming replication. The application can do read/write to the primary whereas read-only from the secondaries, which are load-balanced. Pg_auto_failover (PostgreSQL?extension) is used to set up a keeper on each node and the monitor on a separate node. Monitor and keeper are in sync to detect any failures in the primary, and once a failure is detected, the monitor will promote any of the secondaries to Primary and will allow read/write.?PgConnector?is deployed as HA Proxy, which can forward the traffic to primary/secondary nodes based on the request/traffic.?
6.1 Salient Features of AppZ PostgreSQL HA Stack
?????Primary and secondaries in Synchronous streaming replication to achieve zero data loss.
?????PgMonitor to watch the primary and secondary to detect failures and establish switch over.
?????PgConnector with separate end-points for Read-Only and Read/Write Access.
?????Capable of running on AWS, Azure, GCP, or On-Prem.
?????AppZ?PostgreSQL?HA is deployed to?Kubernetes?Clusters in containers.
Interested to Know More?
Here is the link to a live demo of this game-changing stack on Kubernetes?https://lnkd.in/eD9tEXat.?This video covers a live webinar demonstrating the power of our?PostgreSQL?HA stack, which runs on any?cloud?or on-prem (Nutanix, VMWare, AWS, Azure, and Google).