登录查看更多内容

Sharding

Rajeev Singh

.NET Architect | Delivering Cutting-Edge Software Architecture | Cloud, Microservices, and Enterprise Solutions

发布日期: 2025年1月3日

As you know, any system backbone is data and data grows everyday. According to statistics, by year 2028, the projected worldwide volume of the data can grow upto 394 Zettabytes (reg. see statistics). In simple language, you almost need around 98,500 DVDs to store that many of data.

The reason to giving above information to make you understand, how data is growing and what you can do to handle if you have large volume of data to process through your application. I've collected some good information that explains, how you can achieve scaling in your application; and one of the popular concept I’m going to discuss here called Sharding.

Sharding

Database sharding is a horizontal partitioning strategy where a large database is divided into smaller, more manageable pieces called shards. Each shard is an independent database that contains a subset of the data, allowing the overall system to scale by distributing the load across multiple servers.

Sharding is important when you want to achieve scalability due to large volume of data, improve the performance, manageable and cost effective.

Sharding Strategies

1. Range-Based Sharding:

Data is divided into shards based on ranges of the sharding key (e.g., user IDs 1–1000 go to shard A, 1001–2000 go to shard B).
Pros: Easy to implement and understand.
Cons: Uneven data distribution may lead to hotspots.

2. Hash-Based Sharding:

The sharding key is hashed, and the hash value determines the shard.
Pros: Provides uniform data distribution.
Cons: Harder to re-shard when adding new shards.

3. Geographic Sharding:

Data is divided based on geographical regions.
Pros: Useful for applications with distinct regional usage patterns.
Cons: Inefficient if data is accessed globally.

4. Directory-Based Sharding:

A directory keeps track of which shard contains which data.
Pros: Flexible and dynamic.
Cons: Directory maintenance adds complexity.

Setting up database sharding locally using SQL Server involves creating multiple databases (shards) and distributing the data among them. SQL Server doesn't have built-in sharding support like some other databases, so you'll have to implement sharding logic yourself. Here's a step-by-step guide:

领英推荐

Datetime vs Timestamp datatype in databases - Which…

Arpit Bhayani 1 年前

Data Structures powering our Database Part-3 | B-Trees

Saurav Prateek 2 年前

Weekly Newsletter: SQL Year-End Auditing Support…

IDERA Software 5 个月前

Step 1: Install SQL Server

Download and Install SQL Server:

Download the installer from the official Microsoft website.
Follow the installation instructions to install SQL Server on your local machine.

Install SQL Server Management Studio (SSMS):

Download SSMS from the official Microsoft website.
Follow the installation instructions to install SSMS.

Step 2: Create Shard Databases

Open SQL Server Management Studio (SSMS).
Connect to Your SQL Server Instance.
Create Multiple Databases (shards). For example, create three databases:

CREATE DATABASE Shard1;
CREATE DATABASE Shard2;
CREATE DATABASE Shard3;

Step 3: Create Schema and Tables in Each Shard

USE Shard1;
CREATE TABLE Customers (
    CustomerId INT PRIMARY KEY,
    Name NVARCHAR(100),
    Email NVARCHAR(100)
);

USE Shard2;
CREATE TABLE Customers (
    CustomerId INT PRIMARY KEY,
    Name NVARCHAR(100),
    Email NVARCHAR(100)
);

USE Shard3;
CREATE TABLE Customers (
    CustomerId INT PRIMARY KEY,
    Name NVARCHAR(100),
    Email NVARCHAR(100)
);

Step 4: Distribute Data Across Shards

Implement a sharding strategy to distribute data across the shards. Common strategies include range-based sharding, hash-based sharding, and list-based sharding.

Example: Range-Based Sharding:

Distribute customers based on CustomerId.

-- Insert into Shard1
USE Shard1;
INSERT INTO Customers (CustomerId, Name, Email)
VALUES (1, 'John Doe', '[email protected]');

-- Insert into Shard2
USE Shard2;
INSERT INTO Customers (CustomerId, Name, Email)
VALUES (101, 'Jane Smith', '[email protected]');

-- Insert into Shard3
USE Shard3;
INSERT INTO Customers (CustomerId, Name, Email)
VALUES (201, 'Mike Johnson', '[email protected]');

Step 5: Implement Routing Logic in Your Application

Update your application to route queries to the appropriate shard based on the sharding strategy. Here's a basic example in C#:

using System;
using System.Data.SqlClient;

public class ShardingExample
{
    private static string GetConnectionString(int customerId)
    {
        if (customerId <= 100)
            return "Server=localhost;Database=Shard1;Integrated Security=True;";
        else if (customerId <= 200)
            return "Server=localhost;Database=Shard2;Integrated Security=True;";
        else
            return "Server=localhost;Database=Shard3;Integrated Security=True;";
    }

    public static void InsertCustomer(int customerId, string name, string email)
    {
        var connectionString = GetConnectionString(customerId);
        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open();
            var command = new SqlCommand("INSERT INTO Customers (CustomerId, Name, Email) VALUES (@CustomerId, @Name, @Email)", connection);
            command.Parameters.AddWithValue("@CustomerId", customerId);
            command.Parameters.AddWithValue("@Name", name);
            command.Parameters.AddWithValue("@Email", email);
            command.ExecuteNonQuery();
        }
    }

    public static void Main(string[] args)
    {
        InsertCustomer(1, "John Doe", "[email protected]");
        InsertCustomer(101, "Jane Smith", "[email protected]");
        InsertCustomer(201, "Mike Johnson", "[email protected]");
    }
}

Step 6: Test Your Sharding Setup

Insert Test Data: Run your application to insert test data into the appropriate shards.
Query Data: Verify that data is correctly distributed across the shards.
Check Performance: Measure query performance to ensure that sharding improves scalability and performance.

Conclusion

Setting up database sharding locally using SQL Server involves creating multiple databases to act as shards, implementing a sharding strategy, and updating your application to route queries to the appropriate shard. This approach helps distribute the load and improves performance and scalability.

要查看或添加评论，请登录

Rajeev Singh的更多文章

Your local AI Chat

2025年3月14日

Your local AI Chat

Are you concern about your data security and also wanted to use power of ChatGPT kind of great tool but fearing to give…
Master-Slave Databases

2025年2月1日

Master-Slave Databases

When data grows in your application, the performance deteriorates and that's true. You can't stop data from developing…
How Well Do You Define Your APIs?

2024年12月26日

How Well Do You Define Your APIs?

API stands for Application Programming Interface, but have you ever wondered how the concept originated? Let’s take a…
HTTP Protocol

2024年12月17日

HTTP Protocol

What is HTTP and how it works? Http developed by Tim Berners-Lee when he was in CERN year 1989-1991. Everyone has…

1 条评论
Change Data Capture (CDC)

2024年8月4日

Change Data Capture (CDC)

Change data capture utilizes the SQL Server Agent to log insertions, updates, and deletions occurring in a table. So…
On-Prem Database Deployment (DevOps)

2024年7月28日

On-Prem Database Deployment (DevOps)

Introduction If you are a developer and in the tech world, you must have heard about databases. This is the heart of…

1 条评论
Docker - Easy Way

2024年7月21日

Docker - Easy Way

Definition Docker is a powerful tool that provides a platform to package solutions for deployment. We can think like a…

1 条评论

See all articles

Sharding

Rajeev Singh

.NET Architect | Delivering Cutting-Edge Software Architecture | Cloud, Microservices, and Enterprise Solutions

Sharding

Sharding Strategies

领英推荐

Step 1: Install SQL Server

Step 2: Create Shard Databases

Step 3: Create Schema and Tables in Each Shard

Step 4: Distribute Data Across Shards

Step 5: Implement Routing Logic in Your Application

Step 6: Test Your Sharding Setup

Conclusion

Rajeev Singh的更多文章

社区洞察

其他会员也浏览了

Difference between local and global indexes in DynamoDB

Design a Distributed Priority Queue

Learn PostgreSQL Data Types in Detail

TYPES OF TABLE IN SQL SERVER

AWR REPORTS VS ASH REPORTS

Simplifying time variance in a SQL data warehouse

Mastering DynamoDB Denormalization Techniques

Start Moving Data Within the Hour

Understanding SQL Server Indexes: Types and How They Impact Performance

"Key Concepts, to Master Window Functions"

Sharding

Sharding Strategies

领英推荐

Step 1: Install SQL Server

Step 2: Create Shard Databases

Step 3: Create Schema and Tables in Each Shard

Step 4: Distribute Data Across Shards

Step 5: Implement Routing Logic in Your Application

Step 6: Test Your Sharding Setup

Conclusion

Rajeev Singh的更多文章

Your local AI Chat

Master-Slave Databases

How Well Do You Define Your APIs?

HTTP Protocol

Change Data Capture (CDC)

On-Prem Database Deployment (DevOps)

Docker - Easy Way

社区洞察

其他会员也浏览了

Difference between local and global indexes in DynamoDB

Design a Distributed Priority Queue

Learn PostgreSQL Data Types in Detail

TYPES OF TABLE IN SQL SERVER

AWR REPORTS VS ASH REPORTS

Simplifying time variance in a SQL data warehouse

Mastering DynamoDB Denormalization Techniques

Start Moving Data Within the Hour

Understanding SQL Server Indexes: Types and How They Impact Performance

"Key Concepts, to Master Window Functions"