
Exploring the Heart of Real-Time Messaging at Slack

Ankit Shaw

Engineering @ Nomura | Ex - Morgan Stanley | Java | Python | Distributed Systems

Published: October 9, 2023

In this article we explore the architecture Slack uses to send real-time messages at scale. We look at the services that deliver chat messages and other events to online users in real time.

The core services in the system are written in Java, including:

Channel Servers
Gateway Servers
Admin Servers
Presence Servers

Channel Server

Slack uses Channel Servers (CS) to hold the message history of channels. Each CS is mapped to a subset of channels using consistent hashing.

At peak times, each CS host serves about 16 million channels. Every CS host is responsible for receiving and sending messages for the channels mapped to it, so the load for a given channel always lands on one known host.
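To make the channel-to-host mapping concrete, here is a minimal sketch of a consistent hash ring. It is not Slack's implementation: the `HashRing` class, the host names, the MD5-based hash, and the use of virtual nodes are all assumptions chosen for illustration.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a stable integer position on the ring."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Consistent hash ring with virtual nodes (illustrative only)."""

    def __init__(self, hosts, vnodes=100):
        self._vnodes = vnodes
        self._ring = []  # sorted list of (position, host)
        for host in hosts:
            self.add_host(host)

    def add_host(self, host):
        # Each host owns several points on the ring to spread load evenly.
        for i in range(self._vnodes):
            bisect.insort(self._ring, (_hash(f"{host}#{i}"), host))

    def remove_host(self, host):
        self._ring = [(pos, h) for pos, h in self._ring if h != host]

    def host_for(self, channel_id):
        """Return the first host clockwise from the channel's ring position."""
        if not self._ring:
            raise LookupError("no channel servers registered")
        idx = bisect.bisect_right(self._ring, (_hash(channel_id), ""))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["cs-1", "cs-2", "cs-3"])  # hypothetical host names
owner = ring.host_for("C0123")             # hypothetical channel ID
```

The property that matters here is that removing a host only remaps the channels that host owned; every other channel keeps its current CS.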

Gateway Server

Gateway Servers (GS) are stateful and in-memory, and serve as the interface between Slack clients and CSs. They are deployed across multiple geographical regions so that clients can connect quickly. If a region fails, a draining mechanism switches its users to a healthy region.

Admin Server

Admin Servers (AS) are stateless and in-memory. They carry communication between the Webapp backend and CSs.

Presence Servers

Presence Servers (PS) are in-memory and track which users are online, powering the familiar green presence dots in Slack clients. Clients query PS over their websocket connection, with GS acting as a proxy. Presence notifications are delivered only for the users currently visible on the app screen.
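The "only visible users" rule can be sketched as a subscription table keyed by the watched user. This is a toy model under stated assumptions: the `PresenceServer` class, its method names, and the `"active"` / `"away"` status strings are hypothetical, and the GS proxy hop is left out.

```python
from collections import defaultdict


class PresenceServer:
    """Toy presence tracker: notifies only watchers who can see the user."""

    def __init__(self):
        self._online = set()
        self._watchers = defaultdict(set)  # watched user -> watcher ids
        self.notifications = []            # (watcher, user, status), for inspection

    def set_visible_users(self, watcher, visible_users):
        """Replace the watcher's subscriptions with the users now on screen."""
        for watchers in self._watchers.values():
            watchers.discard(watcher)
        for user in visible_users:
            self._watchers[user].add(watcher)

    def set_status(self, user, online):
        """Record a status change and notify current watchers of that user."""
        if online:
            self._online.add(user)
        else:
            self._online.discard(user)
        status = "active" if online else "away"
        for watcher in sorted(self._watchers.get(user, ())):
            self.notifications.append((watcher, user, status))


ps = PresenceServer()
ps.set_visible_users("alice", ["bob"])  # bob is on alice's screen
ps.set_status("bob", True)              # alice is notified
ps.set_status("carol", True)            # nobody is watching carol, no notification
```

Scoping subscriptions to what is on screen keeps the notification volume tied to what each client can actually display rather than to the size of the workspace.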

Slack Client

Each Slack client establishes a persistent websocket connection to Slack's servers to receive real-time events and keep its state up to date. The connection is set up as follows:

When a Slack client boots up, it fetches the user token and the WebSocket connection setup details from the Webapp backend.
Using this information, the Slack client initiates a WebSocket connection to the nearest edge region, which keeps latency low.
The request from the Slack client is forwarded to the Gateway Server (GS) through an edge and service proxy.
GS, upon receiving the request, retrieves the user's information from the Webapp backend, which includes details about all the channels the user is a part of.
After obtaining the user's information, GS sends the first message to the Slack client. This establishes the initial connection and prepares the client for real-time communication.
Finally, GS subscribes asynchronously to all the Channel Servers (CS) that host the user's channels. After this, the client is ready to send and receive real-time messages.
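The ordering in the last few steps (look up the user's channels, send the first message, then subscribe to channel servers) can be sketched with `asyncio`. Everything here is a stand-in: `FakeGateway`, `FakeChannelServer`, the `"hello"` label for the first message, and the dictionary that replaces the Webapp lookup are assumptions, not Slack's code or wire protocol.

```python
import asyncio


class FakeChannelServer:
    """Stand-in for a CS: records which gateways subscribed to which channel."""

    def __init__(self):
        self.subscribers = {}  # channel_id -> set of gateway names

    async def subscribe(self, gateway_name, channel_id):
        await asyncio.sleep(0)  # simulate a network hop
        self.subscribers.setdefault(channel_id, set()).add(gateway_name)


class FakeGateway:
    """Stand-in for a GS handling a new websocket connection."""

    def __init__(self, name, user_channels, channel_server):
        self.name = name
        self._user_channels = user_channels  # replaces the Webapp lookup
        self._cs = channel_server
        self.log = []

    async def handle_connect(self, user_id):
        # Fetch the channels the user belongs to.
        channels = self._user_channels[user_id]
        # Send the first message to the client before any subscription completes.
        self.log.append(("hello", user_id))
        # Subscribe to every channel concurrently instead of one at a time.
        tasks = [
            asyncio.create_task(self._cs.subscribe(self.name, channel_id))
            for channel_id in channels
        ]
        await asyncio.gather(*tasks)
        self.log.append(("subscribed", user_id))


cs = FakeChannelServer()
gs = FakeGateway("gs-1", {"U1": ["C1", "C2"]}, cs)
asyncio.run(gs.handle_connect("U1"))
```

Sending the first message before the subscriptions finish means the client is not held up by the number of channels the user belongs to.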

Sending a message to clients in real-time

Once the client is ready, each message sent in a channel is broadcast to all clients that are online in that channel. The message flows through the system as follows:

The client hits the Webapp API to send a message.
Webapp sends that message to Admin Server.
Using the channel ID in the message, the Admin Server discovers the Channel Server (CS) that hosts real-time messaging for that channel and routes the message to it.
When the CS receives the message, it sends it to every GS across the world that is subscribed to that channel.
Each GS that receives the message sends it to every connected client subscribed to that channel ID.

With this flow, Slack is able to deliver messages across the world in under 500 ms.
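The fan-out path above can be modeled in a few lines. This is an in-memory sketch, not Slack's code: the class and method names (`AdminServer.route`, `ChannelServer.publish`, `GatewayServer.on_message`) and the `channel_server_for` lookup function are hypothetical, and network calls are replaced by direct method calls.

```python
from collections import defaultdict


class Client:
    def __init__(self, user_id):
        self.user_id = user_id
        self.inbox = []

    def deliver(self, channel_id, text):
        self.inbox.append((channel_id, text))


class ChannelServer:
    """Knows which gateways are subscribed to each channel it hosts."""

    def __init__(self):
        self._gateways_by_channel = defaultdict(set)

    def subscribe(self, channel_id, gateway):
        self._gateways_by_channel[channel_id].add(gateway)

    def publish(self, channel_id, text):
        for gateway in self._gateways_by_channel.get(channel_id, ()):
            gateway.on_message(channel_id, text)


class GatewayServer:
    """Knows which of its connected clients care about each channel."""

    def __init__(self, name):
        self.name = name
        self._clients_by_channel = defaultdict(list)

    def attach(self, client, channel_ids, channel_server_for):
        for channel_id in channel_ids:
            self._clients_by_channel[channel_id].append(client)
            channel_server_for(channel_id).subscribe(channel_id, self)

    def on_message(self, channel_id, text):
        for client in self._clients_by_channel.get(channel_id, ()):
            client.deliver(channel_id, text)


class AdminServer:
    """Stateless router: resolves the CS for a channel and forwards to it."""

    def __init__(self, channel_server_for):
        self._channel_server_for = channel_server_for

    def route(self, channel_id, text):
        self._channel_server_for(channel_id).publish(channel_id, text)


cs = ChannelServer()
lookup = lambda channel_id: cs  # single CS; a real lookup would use the hash ring
admin = AdminServer(lookup)
gs_us, gs_eu = GatewayServer("us"), GatewayServer("eu")
alice, bob = Client("alice"), Client("bob")
gs_us.attach(alice, ["C1"], lookup)
gs_eu.attach(bob, ["C1", "C2"], lookup)
admin.route("C1", "hello")  # both alice and bob receive ("C1", "hello")
```

Note the two levels of fan-out: the CS sends one copy per subscribed gateway, and each gateway then sends one copy per connected client, so the CS never needs to know about individual clients.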

Events

Events are special messages: real-time updates that change the client's state. They travel through the system much like chat messages, which keeps the Slack experience dynamic and interactive. Examples include a user reacting to a message, a bookmark being added, or a member joining a channel.

There are also transient events, which are not persisted in the database. An example is the indicator that a user is typing in a channel.
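The distinction between persisted and transient events can be sketched as a single branch in the event handler. The event type names, the `TRANSIENT_TYPES` set, and the `EventPipeline` class are assumptions for illustration; the article does not describe how Slack classifies or stores events internally.

```python
from dataclasses import dataclass, field

# Hypothetical set of event types that are broadcast but never stored.
TRANSIENT_TYPES = {"user_typing"}


@dataclass
class Event:
    type: str
    channel_id: str
    payload: dict = field(default_factory=dict)


class EventPipeline:
    """Broadcasts every event; persists only the non-transient ones."""

    def __init__(self):
        self.stored = []     # stand-in for the database
        self.broadcast = []  # stand-in for delivery to online clients

    def handle(self, event):
        if event.type not in TRANSIENT_TYPES:
            self.stored.append(event)
        self.broadcast.append(event)


pipeline = EventPipeline()
pipeline.handle(Event("reaction_added", "C1", {"emoji": "thumbsup"}))
pipeline.handle(Event("user_typing", "C1"))
# Both events are broadcast; only the reaction is stored.
```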

Conclusion

Slack's real-time messaging system handles tens of millions of channels per host and serves a similar number of connected clients, delivering messages globally within 500 milliseconds. The infrastructure is also designed to scale linearly, so it can accommodate more customers as Slack grows.

#scalability #systemdesign #softwareengineering
