Sharding and Replication in PostgreSQL with Spring Boot

For high throughput, you need Sharding (Horizontal Scaling) and Replication (Read Scaling) in PostgreSQL using Spring Boot.

1?? Setting Up PostgreSQL Sharding in Spring Boot

Sharding is the process of distributing data across multiple databases to handle large volumes. We use?Citus?(a PostgreSQL extension) for automatic sharding.

?? Step 1: Install Citus on PostgreSQL

Run the following in the terminal:

sudo apt install postgresql-15-citus        

?? Step 2: Configure Citus Coordinator

Login to PostgreSQL and enable Citus:

CREATE EXTENSION citus;        

Add worker nodes:

SELECT * FROM master_add_node('worker-node-1-ip', 5432);
SELECT * FROM master_add_node('worker-node-2-ip', 5432);        

?? Step 3: Create a Sharded Table

In your Spring Boot database, create a distributed table:

CREATE TABLE orders (
    order_id BIGINT PRIMARY KEY,
    customer_id BIGINT NOT NULL,
    amount NUMERIC NOT NULL
);        

Now, distribute the table across shards:

SELECT create_distributed_table('orders', 'customer_id');        

2?? Setting Up Replication (Read Scaling)

PostgreSQL supports Streaming Replication to distribute read requests to replica nodes.

?? Step 1: Configure Primary (Master) Database

Edit postgresql.conf:

wal_level = replica
max_wal_senders = 10
hot_standby = on        

Edit pg_hba.conf to allow replica connection:

host replication all <REPLICA_IP> md5        

Restart PostgreSQL:

sudo systemctl restart postgresql        

?? Step 2: Configure Standby (Replica) Database

Stop PostgreSQL:

sudo systemctl stop postgresql        

Copy data:

pg_basebackup -h <MASTER_IP> -D /var/lib/postgresql/15/main -U postgres -P -R        

Start replica:

sudo systemctl start postgresql        

3?? Spring Boot Configuration for Sharding & Replication

Now, configure Spring Boot to use sharding (writes to master) and replication (reads from replicas).

?? Step 1: Configure application.yml

spring:
  datasource:
    hikari:
      master:
        jdbc-url: jdbc:postgresql://master-db-ip:5432/mydb
        username: postgres
        password: password
      replica:
        jdbc-url: jdbc:postgresql://replica-db-ip:5432/mydb
        username: postgres
        password: password        

?? Step 2: Create a Custom DataSource

@Configuration
public class DataSourceConfig {

    @Bean
    @Primary
    public DataSource dataSource() {
        AbstractRoutingDataSource routingDataSource = new ReplicationRoutingDataSource();
        
        Map<Object, Object> dataSources = new HashMap<>();
        dataSources.put(DatabaseType.MASTER, masterDataSource());
        dataSources.put(DatabaseType.REPLICA, replicaDataSource());

        routingDataSource.setTargetDataSources(dataSources);
        routingDataSource.setDefaultTargetDataSource(masterDataSource());
        return routingDataSource;
    }

    @Bean
    public DataSource masterDataSource() {
        return DataSourceBuilder.create()
                .url("jdbc:postgresql://master-db-ip:5432/mydb")
                .username("postgres")
                .password("password")
                .build();
    }

    @Bean
    public DataSource replicaDataSource() {
        return DataSourceBuilder.create()
                .url("jdbc:postgresql://replica-db-ip:5432/mydb")
                .username("postgres")
                .password("password")
                .build();
    }
}        

?? Step 3: Use Annotations to Control Reads/Writes

Use @Transactional(readOnly = true) for read queries:

@Service
public class OrderService {

    @Transactional
    public void createOrder(Order order) {
        orderRepository.save(order); // Writes go to Master
    }

    @Transactional(readOnly = true)
    public List<Order> getOrders(Long customerId) {
        return orderRepository.findByCustomerId(customerId); // Reads go to Replica
    }
}        

?? Conclusion

? Sharding (Citus) → Scales write operations across multiple databases

? Replication (Streaming Replication) → Offloads read queries to replicas

? Spring Boot Config → Uses Routing DataSource for automatic master-replica switching

Sagar Kulung

Team Lead at BeyondID

1 个月

Nice Article! CQRS is a great design pattern if you can spin up couple extra resources for the performance boost.

要查看或添加评论,请登录

Sanjay Shah的更多文章

社区洞察

其他会员也浏览了