登录查看更多内容

Multi-master replication solution for PostgreSQL

Huma Firdaus

Manager - Cloud Solutions at E2E Networks

发布日期: 2021年5月25日

With the growing demand for databases for several applications in the modern world, scaling them is equally important to increase the database capability to store and process more data. Achieving the same can be critical. Database scaling can be done horizontally or vertically. While vertical scaling means adding resources such as a vCPU’s, disk, or memory to the existing nodes, horizontal scaling deals with adding nodes with the existing nodes behind the load balancer. Horizontal scalability eliminates the need to add the new hardware in the existing node and cuts the related cost too.

Implementing horizontal scalability is not as easy as it appears and demands development effort. PostgreSQL is a stable database management system, known for its resilience and integrity. The qualities of this open-source relational database management system make it preferable for large-scale web, mobile, and analytical applications. Due to its added stability, PostgreSQL offers both vertical and horizontal scalability. This article explains one of the most important concepts of PostgreSQL, that is, multi-master replication, and how it is implemented practically for scaling resources.

What is Multi-master Replication?

Multi-Master Replication is a concept where data is replicated in a group of connected database servers and anyone can update the database which is replicated to other systems updated by those servers. The multi-master replication adds a bi-directionality element to the process.

In Multi-master replication, multiple nodes accept write queries, and all the nodes house the same data. The primary usage of multi-master replication is ensuring all-time availability, thus when the primary server experiences failure, standby is not required.

In the further section, we will explain the actual Multi-master Replication in PostgreSQL.

PostgreSQL Multi-master Replication

PostgreSQL is by default built to offer single-master replication and multi-master replication is not present in it. At present, some vendors offer Bidirectional replication products supporting multi-master replication in PostgreSQL. The latest PostgreSQL versions are not open-source.

Multi-master Replication is achievable using Bidirectional replication (BDR), which is explained further.

PostgreSQL Multi-master Replication Process

Multimaster replication in PostgreSQL is done in six parts and starts once the system configuration is done.

Part 1: System Configuration

This is the first part of replication where BDR installation is performed with the pglogical plugin (logical replication solution). Here, postgresql. conf and pg_hba.conf files are modified and the service is restarted explained in the following steps.

Step1: Install pglogical plugin and BDR.

Step 2: Configure postgresql.conf file using the commands:

wal_level = logical

shared_preload_libraries= ‘pg_logical, bdr’

track_commit_timestamp= ‘on’ (For resolving conflict).

Step 3: Create a user and grant it superuser privileges for managing BDR connectivity.

CREATE USER user1 WITH SUPERUSER REPLICATION PASSWORD ‘user_password’;

Step 4: Alter pg_hba.conf file using the commands

host all bdr 10.20.30.40/24 md5

host replication bdr 10.20.30.40/24 md5

Step 5: Add .pgpass file user.

hostname:port:database:_user1: user_password

Step 6: Now, restart Postgresql

The next part will explain how a single BDR node is created.

Part 2: Single BDR Node Creation

For creating the BDR node, activate BDR on db1 and the host ‘host1’ as ‘user1’.

Step 1: Create an extension.

CREATE EXTENSION bdr CASCADE; (CASCADE is used to create the extension.)

Step 2: Initialise the current node.

SELECT bdr.create_node( node_name:= ‘initail_node’, local_dsn:= ‘dbname=db1 host= host1 user=user1’);

Step 3: Create BDR cluster definition.

SELECT bdr.create_node_group(node_group_name:=’the_node_group’);

Step 4: Wait for the bdr.wait_for_join_completion function to complete.

SELECT bdr.wait_for_join_completion();

We have created the BDR cluster, now let’s create another BDR Node.

Part 3: Another BDR Node Creation

Step 1: Create a BDR extension using the command:

CREATE EXTENSION bdr CASCADE;

Step 2: Initialise the current node with the command on another host:

SELECT bdr.create_node( node_name:= ‘next_node’, local_dsn:= ‘dbname=db1 host= host2 user=user1’);

Step 3: Create the BDR cluster definition.

SELECT bdr.join_node_group(join_target_dsn:= ‘dbname=db1 host= host1 user=user1’, wait_for_completion:=True);

Part 4: HAProxy Configuration

For getting higher availability, install HAProxy software by following the steps below:

Step 1: Modify the haproxy.cfg file global section.

global

state socket /var/run/haproxy/sock level admin (For sending HAProxy commands and retrieving information)

Step 2: Edit the haproxy.cfg file’s bk_db.

stick-table type ip size 1

stick on dst

server bdr_initial_node host1: 5432 check

server bdr_next_node my_host2: 5432 backup check

Step 3: Reload the configuration file.

sudo systemctl reload haproxy

Part 5: Merge PgBouncer to HAProxy

PgBouncer pools PostgreSQL connection with which PostgreSQL interacts with the clients. HAProxy redirects traffic and PgBouncer completes the transactions. PgBouncer is using the below steps.

Step 1: In ft_postgresql, modify the haproxy.cfg file.

frontend ft_postgresql

bind *: 5433

default_backend bk_db

Step 2: Change the databases section in the pgbouncer.ini file

* = host= proxy_server port= 5433

For modifying the pgbouncer section, use:

listen_port = 5432

Step 3: Restart HAProxy and PgBouncer.

Part 6: Switching over the Node

Lastly, we will be moving away from the connections from the initial node by making some changes and reinitialising the server.

Step 1: Disable HAProxy initial_node.

echo “disable server bk_db/bdr_initial_node” | socat /var/run/haproxy/sock –

Step 2: Reconnect PgBouncer should reconnect and wait for connections to be re-established.

psql –h proxy_server –U pgbouncer pgbouncer –c “For reconnection”

psql –h proxy_server –U pgbouncer pgbouncer –c “For waiting”

Step 3: Manage the initial node before re-enabling.

Step 4: Re-enable the initial node.

echo “enable server bk_db/bdr_initial_node” | socat /var/run/haproxy/sock –

Conclusion:

Here, we conclude the multi-master replication process in PostgreSQL, which goes through several stages: installation, upgradation, and testing. We have tried to explain how the process is accomplished using simple steps. We hope it proves to be useful for you.

Signup here for more details: https://bit.ly/3mFerJn

Sheikh Wasiu Al Hasib

DevOps Engineer | AI/ML, Bash, Python, Golang, Docker , MSSQL, PostgreSQL, MongoDB, MySQL, Zabbix, Promethuse, Grafana, Ansible, CKA,CKS

2 年

does BDR support PostgreSQL14 as opensource? As per my observation opensource BDR only support till PostgreSQL 9.6.

要查看或添加评论，请登录

Huma Firdaus的更多文章

7 Regression Techniques you should know!

2022年11月15日

7 Regression Techniques you should know!

Introduction Regression analysis is a type of predictive modelling technique that looks at how an independent variable…
Using Natural Language Processing to understand and identify risks

2022年10月18日

Using Natural Language Processing to understand and identify risks

Through the analysis of massive volumes of data, machine learning and AI are poised to revolutionize the business and…
New Product Alert!! C3/M3 series on E2E Cloud

2022年9月20日

New Product Alert!! C3/M3 series on E2E Cloud

Today, we announce the General Availability of our latest compute and memory series plans, the C3 and M3 instances. We…
Project Management for AI-ML-DL Projects

2022年7月18日

Project Management for AI-ML-DL Projects

Managing a project properly is one of the factors behind its completion and subsequent success. The same can be said…
Apache Spark 3.0 for Data Scientists : Best Practices

2022年7月18日

Apache Spark 3.0 for Data Scientists : Best Practices

Apache Spark has become a fool-proof standard framework for data processing. It is enabling organisations to process…
Top 10 Tools for data scientists in 2022

2022年6月22日

Top 10 Tools for data scientists in 2022

In 2022, the most enticing professional path is data science. And data is the gold that corporations mine every day to…
What is CUDA GL Container Available in the NVIDIA GPU Cloud?

2022年6月4日

What is CUDA GL Container Available in the NVIDIA GPU Cloud?

NVIDIA-developed CUDA is a parallel computing platform and programming model for the general and basic computing on…
Use of Artificial Intelligence (AI) in Cybersecurity

2022年6月4日

Use of Artificial Intelligence (AI) in Cybersecurity

The cyber attack surface in today’s business systems is enormous, and it’s just becoming bigger. As a result…
Role of Artificial Intelligence in Claims Litigation

2022年6月4日

Role of Artificial Intelligence in Claims Litigation

As artificial intelligence (AI) continues to disrupt many sectors, including the legal services industry, experts…
What is CUDA GL Container Available in the NVIDIA GPU Cloud?

2022年5月18日

What is CUDA GL Container Available in the NVIDIA GPU Cloud?

NVIDIA-developed CUDA is a parallel computing platform and programming model for the general and basic computing on…

See all articles

Multi-master replication solution for PostgreSQL

Huma Firdaus

Manager - Cloud Solutions at E2E Networks

What is Multi-master Replication?

In the further section, we will explain the actual Multi-master Replication in PostgreSQL.

PostgreSQL Multi-master Replication

PostgreSQL Multi-master Replication Process

Part 1: System Configuration

Part 2: Single BDR Node Creation

Part 3: Another BDR Node Creation

Part 4: HAProxy Configuration

Part 5: Merge PgBouncer to HAProxy

Part 6: Switching over the Node

Conclusion:

Huma Firdaus的更多文章

社区洞察

其他会员也浏览了

Using pgAdmin for Effective PostgreSQL Database Management

Monitoring PostgreSQL Performance: Tools and Techniques

Understanding PostgreSQL Configuration Parameters for Performance Tuning

Navigating the Database Dilemma: Postgres vs. MySQL

Achieving High Availability in PostgreSQL

Understanding Incremental Backup in PostgreSQL

Creating a highly available and fault-tolerant Postgresql database cluster

Transitioning from Oracle to PostgreSQL: Tablespaces

No Vacuum, No Bloat, No Downtime on Failover, No Lock Escalation, No Manual Sharding, No Delays in Cloning or Backup, No Outage for Database Upgrades

Monitor KubeDB Managed PostgreSQL With Datadog in Google Kubernetes Engine (GKE)

What is Multi-master Replication?

In the further section, we will explain the actual Multi-master Replication in PostgreSQL.

PostgreSQL Multi-master Replication

PostgreSQL Multi-master Replication Process

Part 1: System Configuration

Part 2: Single BDR Node Creation

Part 3: Another BDR Node Creation

Part 4: HAProxy Configuration

Part 5: Merge PgBouncer to HAProxy

Part 6: Switching over the Node

Conclusion:

Huma Firdaus的更多文章

7 Regression Techniques you should know!

Using Natural Language Processing to understand and identify risks

New Product Alert!! C3/M3 series on E2E Cloud

Project Management for AI-ML-DL Projects

Apache Spark 3.0 for Data Scientists : Best Practices

Top 10 Tools for data scientists in 2022

What is CUDA GL Container Available in the NVIDIA GPU Cloud?

Use of Artificial Intelligence (AI) in Cybersecurity

Role of Artificial Intelligence in Claims Litigation

What is CUDA GL Container Available in the NVIDIA GPU Cloud?

社区洞察

其他会员也浏览了

Using pgAdmin for Effective PostgreSQL Database Management

Monitoring PostgreSQL Performance: Tools and Techniques

Understanding PostgreSQL Configuration Parameters for Performance Tuning

Navigating the Database Dilemma: Postgres vs. MySQL

Achieving High Availability in PostgreSQL

Understanding Incremental Backup in PostgreSQL

Creating a highly available and fault-tolerant Postgresql database cluster

Transitioning from Oracle to PostgreSQL: Tablespaces

No Vacuum, No Bloat, No Downtime on Failover, No Lock Escalation, No Manual Sharding, No Delays in Cloning or Backup, No Outage for Database Upgrades

Monitor KubeDB Managed PostgreSQL With Datadog in Google Kubernetes Engine (GKE)