System Design: Load balancers
Thanks to the original creator: https://medium.com/geekculture/system-design-basics-load-balancer-5aa1c6b0f88d
What is Load Balancing?
Load balancing refers to efficiently distributing traffic across a set of servers, also known as server farms or server pools. A load balancer sits between client devices and backend servers, receiving incoming requests and then distributing them to any available server capable of fulfilling them.
What are Load Balancers?
A load balancer is a device that acts as a reverse proxy and distributes network or application traffic across several backend servers. It increases the concurrent capacity of a distributed system by improving availability and performance. It also improves overall application performance by offloading from the servers the burden of managing and maintaining application and network sessions, and by performing application-specific tasks.
Load balancers are categorized into two groups: Layer 4 and Layer 7. Layer 4 load balancing operates at the network level (Network & Transport Layers), optimizing the flow of packets using protocols such as TCP and UDP. Layer 7 works at the application level, making routing decisions based on HTTP requests, API calls, etc.
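The difference between the two layers can be sketched in a few lines. In this illustration (the function and pool names are ours, not from the article), a Layer-4 decision sees only network/transport fields such as the source IP and port, while a Layer-7 decision can inspect application data such as the HTTP path:

```python
def l4_route(src_ip: str, src_port: int, backends: list[str]) -> str:
    """Pick a backend using only the TCP connection tuple (Layer 4)."""
    return backends[hash((src_ip, src_port)) % len(backends)]

def l7_route(http_path: str, routes: dict[str, str]) -> str:
    """Pick a backend by inspecting the HTTP request path (Layer 7)."""
    for prefix, backend in routes.items():
        if http_path.startswith(prefix):
            return backend
    return routes.get("/", "default-pool")

# A Layer-7 balancer can send API traffic and web traffic to
# different pools; a Layer-4 balancer cannot see the path at all.
print(l7_route("/api/users", {"/api": "api-pool", "/": "web-pool"}))
```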
A load balancer may be: a physical hardware appliance, a virtual appliance, or a software process running on commodity servers.
Load Balancing Algorithms
Some of the algorithms used for load balancing are:
- Round Robin: requests are distributed across the servers sequentially.
- Weighted Round Robin: like round robin, but servers with higher weights receive proportionally more requests.
- Least Connections: each new request goes to the server with the fewest active connections.
- Least Response Time: requests go to the server with the fewest active connections and the lowest average response time.
- IP Hash: the client's IP address is hashed to pick a server, so the same client consistently reaches the same server.
- Random: a server is picked at random.
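Two of the most common algorithms, round robin and least connections, can be sketched as follows (a minimal illustration; the class names are ours):

```python
import itertools

class RoundRobin:
    """Cycle through backends in a fixed order."""
    def __init__(self, backends):
        self._cycle = itertools.cycle(backends)

    def pick(self):
        return next(self._cycle)

class LeastConnections:
    """Send each request to the backend with the fewest active connections."""
    def __init__(self, backends):
        self.active = {b: 0 for b in backends}

    def pick(self):
        backend = min(self.active, key=self.active.get)
        self.active[backend] += 1
        return backend

    def release(self, backend):
        # Call when a request finishes so the count stays accurate.
        self.active[backend] -= 1
```

Round robin is stateless and cheap but ignores how busy each server is; least connections adapts to uneven request durations at the cost of tracking per-backend state.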
Sticky Session
Session stickiness, or session persistence, is a mechanism by which a load balancer couples requests to backend servers. It ensures that all requests belonging to the same session are routed to the same server, so no session information is lost between requests.
The advantage of sticky sessions is that the servers in the distributed system don't need to share session state with each other; each server can work independently. There is also the added benefit of better RAM-cache utilization, which results in better responsiveness.

But this is not without its cons. A single server may become overloaded with too many sessions, and session data may be lost if a server fails or is removed mid-session. There is also extra latency introduced by routing everything through one central load balancer.
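A common way to implement stickiness without storing any per-session state in the balancer is to hash a session identifier (a cookie value or the client IP) to a backend. A minimal sketch, with names of our choosing:

```python
import hashlib

def sticky_backend(session_id: str, backends: list[str]) -> str:
    """Map a session id to a backend; the same id always lands on
    the same server as long as the backend list does not change."""
    digest = hashlib.sha256(session_id.encode()).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]
```

Note that this simple modulo scheme illustrates the data-loss con above: adding or removing a backend changes the modulus and remaps most sessions, which is why production balancers often use consistent hashing or cookies instead.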
Elastic Load Balancers
An Elastic Load Balancer (ELB) can automatically scale load balancers and applications based on real-time traffic. ELB distributes incoming application traffic across multiple targets and virtual appliances in one or more Availability Zones (AZs).
It uses health checks to learn the status of the pool members (application servers), routes traffic only to available servers, manages failover to highly available targets, and automatically spins up additional capacity.

ELBs scale the load balancer itself as traffic increases. The load balancer acts as the single point of contact for all incoming requests and, by monitoring the health of the instances, distributes load among them.
Elastic Load Balancing automatically distributes incoming application traffic across multiple server instances. It enables you to achieve greater fault tolerance in your applications, seamlessly providing the load-balancing capacity required for your traffic.

Elastic Load Balancing detects unhealthy instances and automatically reroutes traffic to healthy instances until the unhealthy instances have been restored. Customers can enable Elastic Load Balancing in a single Availability Zone or across multiple Availability Zones for more consistent application performance.
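The health-check behavior described above — route only to instances whose last check passed, and bring them back once they recover — can be sketched in a few lines (a simplified illustration; class and method names are ours):

```python
class HealthAwarePool:
    """Round-robin over backends, skipping any marked unhealthy."""
    def __init__(self, backends):
        self.healthy = {b: True for b in backends}
        self._i = 0

    def report(self, backend, ok: bool):
        # Called by a periodic health checker with the latest result.
        self.healthy[backend] = ok

    def pick(self):
        live = [b for b, ok in self.healthy.items() if ok]
        if not live:
            raise RuntimeError("no healthy backends")
        backend = live[self._i % len(live)]
        self._i += 1
        return backend
```

When a failed instance passes its health check again, `report(backend, True)` returns it to the rotation with no other reconfiguration.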
ELBs can be configured at three levels in a system:
- Between the users and the web servers
- Between the web servers and an internal platform layer (application servers or cache servers)
- Between the internal platform layer and the database
Happy Learning!