Load Balancing - Discussing Statefulness and need for Consistent Hashing

Saurav Prateek

Engineer @ Google | Ex-SWE @ GeeksForGeeks | Authoring engineering newsletter with 30K+ Subs | 60K+ Linkedin | Content Creator | Mentor

Published: February 23, 2022

Introduction

Load Balancing is one of the most widely used concepts in system design. Almost every system that deals with a considerable number of clients or requests will need Load Balancers at some point.

Suppose you have built a service that you want others to use. You have also set up a payment model that lets your clients pay as they use the service. Since you have only just launched and not many people are aware of it, you might have just a handful of clients. With so few clients, you can run the service on a single server without thinking about Load Balancing. In this initial architecture, every client sends its requests directly to that one server.

But with time your service will become popular as people come to know about it. You have built a flawless service, and your clients like it so much that they tell others as well. Soon you will have a lot of clients willing to use your service and pay for it.

Now you have a lot of requests coming in, and you realise that one machine won't be able to handle this amount of load. With the money you received from your clients, you decide to buy multiple servers to handle the increasing load. With multiple servers at your disposal, you now need something that distributes the incoming client requests evenly across them. This process of distributing incoming requests evenly across multiple machines is known as Load Balancing, and the component that performs it is known as a Load Balancer.

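With a Load Balancer in the picture, clients no longer reach a server directly: every request first arrives at the Load Balancer, which forwards it to one of the backend servers.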

Health Checks by Load Balancers

Load Balancers are also widely used to perform Health Checks on the backend servers to determine their availability. The Load Balancer pings each backend server, and the server replies with its state, indicating whether or not it is available to take requests. Servers that report themselves unavailable, or that do not reply at all, are typically left out of the rotation until they recover.
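To make the idea concrete, here is a minimal sketch of a health check pass in Python. The server names, the `STATUS` table and the `fake_probe` function are made up for illustration; a real Load Balancer would send an actual ping or HTTP request instead.

```python
from typing import Callable, List


def healthy_backends(backends: List[str], probe: Callable[[str], bool]) -> List[str]:
    """Return only the backends whose health probe succeeded."""
    healthy = []
    for backend in backends:
        try:
            if probe(backend):
                healthy.append(backend)
        except Exception:
            # A probe that raises (timeout, unknown host, ...) counts as unhealthy.
            pass
    return healthy


# Hypothetical stand-in for a real ping: a table of reported states.
STATUS = {"server-a": True, "server-b": False, "server-c": True}


def fake_probe(backend: str) -> bool:
    # Raises KeyError for a server that never answers (not in the table).
    return STATUS[backend]


available = healthy_backends(
    ["server-a", "server-b", "server-c", "server-d"], fake_probe
)
print(available)  # ['server-a', 'server-c']
```

Here `server-b` reports itself unavailable and `server-d` does not answer at all, so only the remaining two servers would receive traffic.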

Types of Load Balancers

We will discuss two types of Load Balancers which have been around for quite some time and are popular as well.

L4 Load Balancers or Network Load Balancers: An L4 Load Balancer routes a request on the basis of its address information, such as the source and destination IP addresses and ports. It does not inspect the content of the request. It makes its routing decision from the address information extracted from the first few packets in the TCP stream.
L7 Load Balancers or Application Load Balancers: An L7 Load Balancer routes a request on the basis of its content, such as the requested URL. Dedicated servers can then serve particular kinds of content, like images, graphics and video. For example, you can set up the system so that static content is served by one dedicated server while requests that need a database call are served by another.
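The difference between the two types can be sketched in a few lines of Python. This is only an illustration under assumptions: the pool names, the URL prefixes and the use of CRC32 as the hash are all hypothetical choices, not how any particular Load Balancer is implemented.

```python
import zlib
from typing import List

# Hypothetical mapping of URL prefixes to server pools.
L7_ROUTES = {"/static/": "static-pool", "/api/": "api-pool"}
DEFAULT_POOL = "default-pool"


def route_l4(src_ip: str, src_port: int, servers: List[str]) -> str:
    """L4-style decision: only address information is used, never the payload."""
    key = f"{src_ip}:{src_port}".encode("utf-8")
    return servers[zlib.crc32(key) % len(servers)]


def route_l7(path: str) -> str:
    """L7-style decision: the request content (here, the URL path) is inspected."""
    for prefix, pool in L7_ROUTES.items():
        if path.startswith(prefix):
            return pool
    return DEFAULT_POOL


servers = ["server-a", "server-b", "server-c"]
print(route_l4("203.0.113.7", 51234, servers))
print(route_l7("/static/logo.png"))  # static-pool
print(route_l7("/api/orders"))       # api-pool
```

Note that `route_l4` never sees the path, while `route_l7` ignores the addresses entirely in this sketch.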

Maintaining States in Load Balancing

Suppose your system wants to maintain state across multiple requests. Load Balancers can provide this Statefulness by making sure that requests from a particular client are always routed to the same server. The backend server can then use a Local Cache to store metadata about those requests for later use. (We have discussed Caching in detail in one of our previous editions of this Newsletter. Do check it out.)

One possible way for a Load Balancer to achieve this is to key its routing decision on the IP address of the incoming client request. The Load Balancer can use a fixed Hash Function that computes a hash from the client's IP address and then maps it onto the number of backend servers available at that point in time. We assume that the chosen Hash Function distributes the incoming requests evenly across the backend servers.

Suppose we have M backend servers currently available and a hash function H. The Load Balancer first computes H(IP) from the client's IP address and then takes the result modulo M, the total number of available backend servers. The modulo step turns the hash into an index between 0 and M - 1, which identifies the backend server that receives the request.
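The scheme described here can be sketched as follows. The choice of SHA-256 as H and the function name `pick_server` are assumptions made for this example; any hash that spreads IP addresses evenly would do. Python's built-in `hash()` is avoided on purpose, because for strings it is randomized per process and would break the "same client, same server" property across restarts.

```python
import hashlib


def pick_server(client_ip: str, num_servers: int) -> int:
    """Return the index (0 .. num_servers - 1) of the server for this client IP."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    h = int.from_bytes(digest[:8], "big")  # H(IP): a stable 64-bit integer
    return h % num_servers                 # H(IP) mod M


M = 4
for ip in ["198.51.100.10", "198.51.100.11", "198.51.100.10"]:
    print(ip, "->", pick_server(ip, M))
```

The first and the last line printed show the same server index, since both requests come from the same IP address.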

Hashing on the client's IP address means that requests coming from the same IP address are handled by the same backend server most of the time. This lets the backend server make use of its Local Cache and reduce the response time, which improves the performance of the system.

But what if the number of servers changes frequently? Some servers may fail, or new servers may be added to handle increasing load. In those situations M has changed, so the same Hash Function will route most clients to a different backend server than before. This can make the Local Caches of the servers almost useless, increase the response time of the service and ultimately degrade its performance.
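A small experiment makes the size of the problem visible. The sketch below again assumes SHA-256 as the hash and uses made-up client IPs; it counts how many clients land on a different server when the pool grows from 4 to 5 servers. For a uniform hash, a client keeps its server only when the hash gives the same remainder modulo 4 and modulo 5, which happens for 4 out of every 20 hash values, so roughly 80% of clients should move.

```python
import hashlib
from typing import List


def pick_server(client_ip: str, num_servers: int) -> int:
    """Modulo-based routing: H(IP) mod M, with SHA-256 assumed as H."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_servers


def fraction_remapped(ips: List[str], old_m: int, new_m: int) -> float:
    """Fraction of clients whose server changes when M goes from old_m to new_m."""
    moved = sum(1 for ip in ips if pick_server(ip, old_m) != pick_server(ip, new_m))
    return moved / len(ips)


# 2000 hypothetical client addresses.
ips = [f"10.0.{i // 256}.{i % 256}" for i in range(2000)]
ratio = fraction_remapped(ips, 4, 5)
print(f"{ratio:.0%} of clients were routed to a different server")
```

Adding a single server therefore invalidates the Local Cache entries for the large majority of clients.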

We solve this problem with the help of Consistent Hashing. It ensures that only a small fraction of requests are remapped to a different server whenever a server is added or removed, while all other requests keep going to the server they used before. We have discussed Consistent Hashing in one of our previous editions of this Newsletter. Do check that out to understand how Consistent Hashing works.
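For readers who want a quick feel for it here, the following is a minimal sketch of one common way to do Consistent Hashing: a hash ring with virtual nodes. It is not necessarily the exact design covered in the earlier edition. The class name `HashRing`, the SHA-256 hash, the 100 virtual nodes per server and the server names are all assumptions made for this example, and the sketch does not handle an empty ring.

```python
import bisect
import hashlib
from typing import List, Tuple


def _hash(key: str) -> int:
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    def __init__(self, servers: List[str], vnodes: int = 100) -> None:
        self._vnodes = vnodes
        self._points: List[Tuple[int, str]] = []  # sorted (position, server) pairs
        for server in servers:
            self.add(server)

    def add(self, server: str) -> None:
        # Each server is placed on the ring at several pseudo-random positions.
        for i in range(self._vnodes):
            bisect.insort(self._points, (_hash(f"{server}#{i}"), server))

    def remove(self, server: str) -> None:
        self._points = [p for p in self._points if p[1] != server]

    def lookup(self, client_ip: str) -> str:
        # Walk clockwise to the first point at or after the client's position.
        idx = bisect.bisect_left(self._points, (_hash(client_ip), ""))
        return self._points[idx % len(self._points)][1]


ips = [f"10.0.{i // 256}.{i % 256}" for i in range(2000)]
ring = HashRing(["s1", "s2", "s3", "s4"])
before = {ip: ring.lookup(ip) for ip in ips}
ring.add("s5")
after = {ip: ring.lookup(ip) for ip in ips}
moved = [ip for ip in ips if before[ip] != after[ip]]
print(f"{len(moved) / len(ips):.0%} of clients changed server")
```

Every client that moves ends up on the new server `s5`; nobody is shuffled between the four existing servers, so their Local Caches stay useful.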

Concluding

I hope this article has clarified the concepts behind Load Balancing and the terminology widely used around it. Do check out the other editions referenced in this article to get a clearer picture of Load Balancing. In the meantime, I will come up with more conceptual topics on System Design and Distributed Systems.

Do like and share this edition with your peers, and subscribe to this Newsletter to get notified when I publish more content in the future.

Until next time, Dive Deep and Keep Learning!

Systems That Scale

30,763 followers

Sweta Kumari

Lead Software Engineer at Synechron

10 months ago

nice content.... :)

Gyxi

3 years ago

I think you should mention that if you need statefulness you probably did something wrong in the architecture of your application. In all or almost all cases, your backend should be stateless.

Kuldeep Pal

3 years ago

Loved the explanation. Waiting for Distributed system. Caching and Consistent hashing were also good. For the caching part, we can use Redis, with cache or just the Redis, or that will create some issues?

Sravan Kumar

SDE-2 (MTS) at Salesforce | Ex - F5 | SIH winner | Backend Development

3 years ago

Great content Saurav Prateek
