登录查看更多内容

Design a Distributed Priority Queue

Momen Negm

Chief Technology Officer @ T-Vencubator | Data Scientist, Generative AI | Tech entrepreneur - Engineering leader

发布日期: 2024年5月5日

Today, we'll develop a Distributed Priority Queue, focusing on its implementation using a sharded SQL database. Our discussion will include:

API and message schema
Adding tasks to the queue (Enqueue)
Removing tasks from the queue (Dequeue)
Deploying across specific regions (Regional Deployment)
Expanding the deployment to multiple regions (Multi-regional)

API, Data Structure for Items, and Status of Messages

The message schema outlines the data format that will be entered into and retrieved from the queue.

Namespace: This represents the isolation boundary for different tenant environments.

Topic: Acts as a logical queue; a single namespace can have numerous topics.

Priority: 32-bit integer where a lesser value denotes higher message urgency.

Payload: An immutable binary data block, limited by size.

Metadata: Mutable key-value data intended for ancillary information

Deliver After: The scheduled time for a message to be made available for consumption

Lease Duration: The allocated time frame within which a consumer must acknowledge (ack) or reject (nack) a dequeued message.

Unique ID: A unique identifier.

TTL: Defines the duration a message remains in the queue before automatic deletion

Enqueue operation

Enqueue is an operation to push a message to a Priority Queue. If an enqueue succeeds, the item is persisted and can eventually be dequeued.

Submit to Enqueue Buffer

The serialized item is sent to the Priority Queue system, where it first lands in an enqueue buffer. This buffer acts as a preliminary holding area to manage incoming requests and mitigate spikes in traffic.
The enqueue buffer ensures rate limiting and initial validation, preventing system overload and filtering out malformed requests before they reach the core processing units (enqueue workers).

领英推荐

Advanced SQL Aggregation Methods

Amr Saafan 6 个月前

Understanding the Power of DISTINCT in SQL Server…

Amr Saafan 1 年前

Counting Distinct Values in SQL: Tips and Examples

StrataScratch 1 个月前

Process with Enqueue Workers

Enqueue workers continuously poll the enqueue buffer for new items. Once an item is picked up by a worker, it undergoes further validation, such as checking for correct formatting and ensuring priority levels are within acceptable ranges.
These workers are responsible for the logical handling of items, including determining the appropriate SQL shard based on the item's attributes.

Distribute to SQL Shards

After processing, the item is assigned to a specific SQL shard. The distribution can be based on various factors like load balancing, shard health, and data partitioning strategies
Each SQL shard corresponds to a segment of the overall data structure, designed to scale horizontally and handle high volumes of writes. The enqueue worker inserts the item into the designated shard

Acknowledge and Log

Once the item has been successfully inserted into the appropriate SQL shard, the system generates a unique identifier for the item, combining the shard ID with a unique item identifier within that shard

Dequeue operation

The dequeue API accepts a collection of (topic, count) pairs. For each topic requested, host will return, at most, count items for that topic. The items are ordered by deliver_after and priority, so items with current deliver_after and lower priority will be delivered first. If multiple items are tied for lowest priority, lower deliver_after items will be delivered first.?

Dequeueing items typically involves the following steps:

Request Dequeue: A dequeue request is initiated by a client or a service. This request specifies the number of items to be dequeued and may include priority or other criteria to select the appropriate items from the queue.
Select Items: The Queue evaluates the priority queue to identify the highest-priority items available for dequeuing. This involves sorting or retrieving items based on their assigned priorities.
Lock Items: To prevent race conditions or duplicate processing, the system temporarily locks the selected items. This ensures that no other process can dequeue or modify these items while they are being processed for removal.
Remove Items and Update Queue: Once the items are locked and confirmed for dequeuing, they are removed from the priority queue. The Queue system updates indexes, counters, an other metadata in the corresponding shard
Return Items and Unlock: The dequeued items are packaged and sent back to the client. After the items have been successfully returned, locks are released.

Regional Deployment

During any event leading to the unavailability of a primary replica in Region X, SQL DB can elect secondary replicas from Region 2 as the new primary. After failover, the queue service in Region 1 must send queries to the new primary database in Region 2 where cross-region latencies increase up to hundreds of milliseconds.

In the event of complete network connectivity loss in region 1, the limitations of the regional architecture become more glaring:?

Customers have to explicitly balance their traffic away from an impacted region. Regional installations force all of the disaster recovery and load-balancing complexity to clients.
Pushing load-balancing complexity to customers often results in underutilization of system resources. Additionally, it requires buffer capacity in other regions to accommodate shifted traffic.
Due to an inability for SQL primaries to be picked up outside of a given regional installation, items stored on replicas in the primary regions are susceptible to being “stuck” until connectivity is restored. Regional installations fail to offer access to data with high availability.

Multi Regional Deployment

To enhance latency, we introduce a routing service acting as an intermediary between clients and the queue service. This service conceals the complexities of physical routing from clients and facilitates efficient distribution of messages

Routing service optimizes data allocation according to regional preferences. It uses in-memory mapping to associate logical regional preferences with the nearest SQL shards. The API allows clients to indicate their preferred storage regions.

When processing a request, Routing Service references its in-memory data to identify suitable queue nodes based on the client's regional choice. Consequently, items are stored in the designated physical location or an alternative region overseeing the relevant SQL shards in case of failover. If there's no region preference, items default to the nearest region from where the request originated.

要查看或添加评论，请登录

Momen Negm的更多文章

Key Concepts in Databases: Keys, Joins, Query Optimization, and Normalization

2024年11月10日

Key Concepts in Databases: Keys, Joins, Query Optimization, and Normalization

Save this article, as it covers the core concepts of databases: types of keys, types of joins, query optimization, and…
Database Indexing Essentials in System Design

2024年11月3日

Database Indexing Essentials in System Design

Database Indexing Essentials for System Designers: BTree, Hash, Bitmap, Full text indexing technique deep dive?? ??…

1 条评论
Essential Guidelines for Effective System Design

2024年10月7日

Essential Guidelines for Effective System Design

Creating a robust and scalable system requires thorough planning and attention to multiple factors. Here are some key…
High-Level vs. Low-Level Design: Choosing the Right Approach for System Architecture

2024年9月4日

High-Level vs. Low-Level Design: Choosing the Right Approach for System Architecture

What is a High-Level Design (HLD)? A High-Level Design (HLD) is a crucial document that maps out the overarching…
Designing a Stock Exchange: High-Level Design, Data Model, and Reliability Considerations

2024年7月2日

Designing a Stock Exchange: High-Level Design, Data Model, and Reliability Considerations

At the core of any stock exchange system is its capability to efficiently process and manage orders. An order in this…
Architecture of the School Bus Tracking System

2024年6月26日

Architecture of the School Bus Tracking System

In this article, we will explore the process of building a vehicle tracking system, with a specific focus on school…

2 条评论
Develop a Programming Competition Platform Similar to Leetcode

2024年5月28日

Develop a Programming Competition Platform Similar to Leetcode

The system should efficiently handle and distribute coding challenges with minimal latency, manage simultaneous user…
Youtube Design - Video Publishing and Delivery

2024年4月23日

Youtube Design - Video Publishing and Delivery

"Publishing videos on YouTube is a multifaceted procedure encompassing several stages. This discussion will address:…
Create a Live Video Broadcasting Service

2024年4月16日

Create a Live Video Broadcasting Service

Today, we'll be creating a Live Video Streaming Platform and discussing the key elements: Real-time video ingestion…
System Design: The Principle of Consistent Hashing

2024年3月20日

System Design: The Principle of Consistent Hashing

In today's era, where vast amounts of data are generated daily, it's increasingly challenging for large organizations…

1 条评论

See all articles

Design a Distributed Priority Queue

Momen Negm

Chief Technology Officer @ T-Vencubator | Data Scientist, Generative AI | Tech entrepreneur - Engineering leader

Enqueue operation

领英推荐

Dequeue operation

Regional Deployment

Multi Regional Deployment

Momen Negm的更多文章

社区洞察

其他会员也浏览了

Maximizing SQL Power: A Journey to Achieving 20x Faster Query Speeds

What Are the Steps to Cast INT in SQL for Type Conversion?

How to find duplicates in a table using SQL?

Different types of SQL Commands

SQL Query Optimization: Key Techniques for Boosting Performance at Both the Query and Source Level

SQL QuickStart Guide

Apache Arrow Flight SQL: Revolutionizing Data Transfer ( Flight vs JDBC/ODBC): 4.49x Faster with benchmark and code

Query optimization

Understanding the Execution Cycle of an SQL Query: A Key to Optimizing Performance

Enqueue operation

领英推荐

Dequeue operation

Regional Deployment

Multi Regional Deployment

Momen Negm的更多文章

Key Concepts in Databases: Keys, Joins, Query Optimization, and Normalization

Database Indexing Essentials in System Design

Essential Guidelines for Effective System Design

High-Level vs. Low-Level Design: Choosing the Right Approach for System Architecture

Designing a Stock Exchange: High-Level Design, Data Model, and Reliability Considerations

Architecture of the School Bus Tracking System

Develop a Programming Competition Platform Similar to Leetcode

Youtube Design - Video Publishing and Delivery

Create a Live Video Broadcasting Service

System Design: The Principle of Consistent Hashing

社区洞察

其他会员也浏览了

Maximizing SQL Power: A Journey to Achieving 20x Faster Query Speeds

What Are the Steps to Cast INT in SQL for Type Conversion?

How to find duplicates in a table using SQL?

Different types of SQL Commands

SQL Query Optimization: Key Techniques for Boosting Performance at Both the Query and Source Level

SQL QuickStart Guide

Apache Arrow Flight SQL: Revolutionizing Data Transfer ( Flight vs JDBC/ODBC): 4.49x Faster with benchmark and code

Query optimization

Understanding the Execution Cycle of an SQL Query: A Key to Optimizing Performance