What are Load Balancers?
Aniket Singh
Founding Engineer at barq | Building innovative fintech solutions for global markets
Over the years, load balancers have become an essential pillar of system architecture, serving as the front-line controllers that intelligently distribute incoming network traffic across multiple servers.
Similar to an air traffic controller coordinating the safe landing of planes by assigning them to available runways, load balancers regulate data flow within a system, ensuring horizontal scalability and optimal resource utilization.
Before we dive deeper, here's a quick cheat sheet summarizing today’s article:
Let’s understand each component in detail:
Functions of a load balancer
Types of load balancers
Types of load balancing algorithms
1. Round Robin
How does it work?
Requests are distributed sequentially across servers, looping back to the first server after reaching the last one.
In Figure 2, requests 1, 2, 3 and 4 go to servers 1, 2, 3 and 1 respectively.
Pros & Cons
✅ Simple to implement and effective when servers have equal capacity.
❌ Doesn't account for server load or processing power.
Example: Kubernetes’ kube-proxy in IPVS mode uses round robin by default when routing traffic to service pods.
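The looping behavior described above can be captured in a few lines. This is a minimal, illustrative Python sketch (not from the article); the class and method names are my own.

```python
from itertools import cycle

class RoundRobin:
    """Hands out servers in a fixed order, wrapping back to the first."""

    def __init__(self, servers):
        self._it = cycle(servers)  # infinite iterator over the server list

    def next_server(self):
        return next(self._it)

rr = RoundRobin(["server1", "server2", "server3"])
# Four requests land on server1, server2, server3, then back to server1,
# matching the Figure 2 example.
```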
2. Sticky Round Robin
How does it work?
The first client request goes to a random server, and all future requests stick to that server unless it's unavailable.
In Figure 3, requests 1 & 2 go to server 1 while requests 3 & 4 go to server 2.
Pros & Cons
✅ Ensures session persistence, ideal for applications that must retain user state.
❌ Not suitable for applications with short-lived sessions.
❌ If a server goes down, session data can be lost unless state is stored centrally.
Example: HAProxy’s cookie-based session persistence pins a client’s requests to the same backend server, keeping session state consistent across requests.
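The "first request random, then stick" rule can be sketched as a simple session map. This is an illustrative Python sketch under my own naming assumptions; real load balancers track stickiness via cookies or connection tables.

```python
import random

class StickyRoundRobin:
    """Pins each client to a randomly chosen server until that server is gone."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.sessions = {}  # client_id -> pinned server

    def route(self, client_id):
        server = self.sessions.get(client_id)
        if server is None or server not in self.servers:
            # First request (or pinned server removed): pick any healthy server.
            server = random.choice(self.servers)
            self.sessions[client_id] = server
        return server

    def mark_down(self, server):
        # Remove a failed server; its clients get re-pinned on next request.
        self.servers.remove(server)
```

Note the con from above in action: when `mark_down` removes a server, any session state held only on that server is lost unless it was stored centrally.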
3. Weighted Round Robin
How does it work?
Servers are assigned weights based on their capacity, with higher-weight servers handling more requests.
In Figure 4, out of 4 requests, 3 requests (0.8 × 4 = 3.2, rounded to 3) go to server 1 and 1 request (0.2 × 4 = 0.8, rounded to 1) goes to server 2.
Pros & Cons
✅ Balances load in proportion to server capacity.
✅ More efficient for systems with servers of varying strength.
❌ Doesn’t factor in real-time load (active connections).
Example: AWS Elastic Load Balancing supports weighted target groups, directing a larger share of traffic to EC2 instances with more capacity and computing power.
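One simple way to implement weighting is to expand each server into the rotation in proportion to its weight. This is an illustrative Python sketch (my own simplification; production balancers like nginx use a "smooth" weighted variant instead).

```python
class WeightedRoundRobin:
    """Round robin where each server appears weight-many times per cycle."""

    def __init__(self, weights):
        # weights: dict of server -> integer weight, e.g. {"s1": 3, "s2": 1}
        self._ring = [s for s, w in weights.items() for _ in range(w)]
        self._i = 0

    def next_server(self):
        server = self._ring[self._i % len(self._ring)]
        self._i += 1
        return server

# With weights 3:1, four requests split 3 to server1 and 1 to server2,
# mirroring the 0.8 / 0.2 split in the Figure 4 example.
```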
4. IP Hash
How does it work?
Calculates a hash value from the client’s IP address and uses it to determine the server to route the request.
In Figure 5, requests 1 & 2 come from the same IP in China and hence go to server 1, where hash(IP) = 0, while requests 3 & 4 come from the same IP in India and go to server 3, where hash(IP) = 2.
Pros & Cons
✅ Ensures consistent routing of requests from the same client to the same server.
✅ Provides session persistence without needing cookies.
❌ Can lead to uneven load distribution if many clients share similar IP addresses (e.g., behind NAT or corporate proxies).
❌ A change in the client’s IP breaks persistence.
Example: E-commerce platforms can use this to ensure that the same server handles all cart requests for a user, such as adding, viewing, or removing items from the cart.
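The hash-then-modulo selection can be shown directly. This is a minimal Python sketch of the idea; the choice of MD5 is my own for illustration (any stable hash works, and real systems often use consistent hashing to limit remapping when servers change).

```python
import hashlib

def pick_server(client_ip: str, servers: list) -> str:
    """Map a client IP to a server deterministically via a stable hash."""
    digest = hashlib.md5(client_ip.encode()).digest()
    index = int.from_bytes(digest[:4], "big") % len(servers)
    return servers[index]

# The same IP always hashes to the same index, so repeat requests
# from one client keep landing on the same server.
```

A design note: plain `hash(IP) % N` remaps almost every client when `N` changes, which is why consistent hashing is preferred when servers are added or removed frequently.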
5. Least Connections
How does it work?
Incoming requests are directed to the server with the fewest active connections.
In Figure 6, all requests go to server 2 because it has the fewest active client connections (50).
Pros & Cons
✅ Balances real-time traffic load efficiently.
✅ Prevents individual servers from becoming overloaded.
❌ Doesn’t account for server capacity or request complexity.
Example: HAProxy’s leastconn algorithm is commonly used in front of PostgreSQL instances (often alongside a pooler like PgBouncer), preventing overload on any single instance.
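Least connections only needs a live counter per server. This is an illustrative Python sketch with my own class and method names; a real balancer would update these counters as connections open and close.

```python
class LeastConnections:
    """Routes each new connection to the server with the fewest active ones."""

    def __init__(self, servers):
        self.active = {s: 0 for s in servers}  # server -> active connection count

    def acquire(self):
        server = min(self.active, key=self.active.get)  # fewest active connections
        self.active[server] += 1
        return server

    def release(self, server):
        # Call when a connection closes so the counter reflects real load.
        self.active[server] -= 1
```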
6. Least Response Time
How does it work?
Incoming requests are routed to the server with the fastest response time.
In Figure 7, all requests go to server 1 because it has the least current response time (1ms).
Pros & Cons
✅ Reduces latency for time-sensitive applications.
❌ May lead to instability if servers experience temporary performance spikes.
Example: Booking platforms (e.g., Expedia) use this at the API gateway/ingress level to route travel-booking requests to the fastest available server.
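To soften the instability problem noted above, response times are usually smoothed rather than taken raw. This Python sketch is my own illustration, using an exponential moving average (EWMA) so one slow sample doesn't immediately dethrone a server; the `alpha` value is an assumption.

```python
class LeastResponseTime:
    """Routes to the server with the lowest smoothed response time."""

    def __init__(self, servers, alpha=0.3):
        self.ewma = {s: 0.0 for s in servers}  # smoothed response time per server (ms)
        self.alpha = alpha                     # weight given to the newest sample

    def pick(self):
        return min(self.ewma, key=self.ewma.get)

    def record(self, server, response_ms):
        # EWMA: blend the new sample into the running average, so a single
        # temporary spike shifts the estimate gradually instead of all at once.
        prev = self.ewma[server]
        self.ewma[server] = (1 - self.alpha) * prev + self.alpha * response_ms
```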
Thanks for reading! If you liked the article, please subscribe to my System Design Newsletter (https://systemdesignnewsletter.substack.com/) to keep receiving more such breakdowns and diagrams.