Scaling for Success: The Load Balancing Journey of a Fictional Startup
Zahiruddin Tavargere
Senior Principal Software Engineer @ Dell | Opinions are my own
The Context
We will use the journey of a fictional startup to learn how load-balancing decisions are made at every phase of the growth of a startup.
The journey is oversimplified to cater to a wide variety of readers.
The evolution of the architecture of the startup is out of the scope of this article.
Fictional startup: ProScheduler
Company size: 4
The fictional startup we'll be discussing in this article is a SaaS (Software as a Service) company that specializes in schedule management tools.
The company started small, with a few hundred users and a basic web application that helped individuals and small businesses manage their schedules more efficiently.
However, as the company's user base grew and its product matured, it quickly realized it needed to scale its infrastructure to handle the increased traffic.
At every phase of its growth, the company learned different Load Balancing techniques and scaled its app accordingly.
Phase 1: The Beginning
Traffic: ~300 requests/hour
Load Balancer: None (single server)
In the beginning, the startup had a small team and a basic web application, and they didn't anticipate much traffic. Their application, hosted on a single server, easily supported ~300 requests/hour.
Traffic: ~5,000 requests/hour; spikes on weekends
Load Balancer: DNS-based load balancer
As the number of users grew, they saw occasional spikes on weekends and Mondays, when customers plan for the week ahead.
These spikes were overwhelming the single server, and they needed a way to ensure their customers were not impacted.
They introduced another server to balance the load.
What is load balancing?
Load balancing is a technique used to distribute workloads evenly across multiple servers, in order to optimize resource usage, minimize response time and avoid overloading any single server.
This can be accomplished in several ways, such as DNS-based load balancing, dedicated hardware load balancers, or software load balancers like NGINX.
ProScheduler used a simple round-robin DNS load balancer to distribute traffic between the two servers.
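Round-robin DNS behavior can be sketched in a few lines of Python. The domain and the two IP addresses below are hypothetical stand-ins for ProScheduler's servers; the point is only that a round-robin DNS server rotates the order of the records it returns, so successive clients tend to connect to different servers.

```python
from itertools import cycle

# Hypothetical A records for the startup's two servers.
A_RECORDS = ["203.0.113.10", "203.0.113.11"]

# Rotate the starting record one position per query,
# as a round-robin DNS server would.
_rotation = cycle(range(len(A_RECORDS)))

def resolve(domain: str) -> list[str]:
    """Return the A records for `domain`, rotated one position per query."""
    start = next(_rotation)
    return A_RECORDS[start:] + A_RECORDS[:start]

first = resolve("proscheduler.example")[0]
second = resolve("proscheduler.example")[0]
print(first, second)  # successive lookups lead with different servers
```

Clients usually connect to the first address in the answer, so rotation alone spreads the load roughly evenly, with no coordination between the servers.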
To understand DNS-based load balancing, it is important to understand how a DNS lookup works.
How DNS lookup works
When a user enters the startup's domain name in a browser, the browser first checks its local cache; on a miss it asks a recursive resolver (typically run by the ISP). The resolver walks the DNS hierarchy: it queries a root name server, then the TLD name server, then the domain's authoritative name server, and finally returns the resulting IP address to the client, which connects to that server.
How does the DNS have records of both servers?
In DNS-based load balancing, the DNS server is configured with multiple records for the same domain name, each pointing to a different server.
The DNS server is typically configured by the system administrator or network administrator of an organization.
If a website is hosted by a third-party hosting service provider like GoDaddy, the provider will typically set up and configure a DNS server on the customer's behalf and provide them with the necessary information to point their domain name to the provider's server.
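In a BIND-style zone file, "multiple records for the same domain name" looks like two A records sharing one name. The zone fragment below is hypothetical, just to make the configuration concrete:

```
; Hypothetical zone fragment: two A records for the same name.
; A round-robin DNS server rotates their order across responses.
www.proscheduler.example.  300  IN  A  203.0.113.10
www.proscheduler.example.  300  IN  A  203.0.113.11
```

The 300-second TTL matters: resolvers cache the answer for that long, so a short TTL lets traffic shift between servers faster, at the cost of more DNS queries.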
This load balancer worked well for ProScheduler in the early stages as it was easy to set up and required minimal maintenance.
However, as the startup began to gain traction, it became clear that it needed to scale its infrastructure to handle more traffic.
Phase 2: Scaling up
With a growing user base and increased traffic, the team revisited its load-balancing setup.
It explored different types of load balancers and load-balancing techniques.
What are the different types of load balancers?
Broadly, load balancers come as hardware appliances (dedicated devices such as F5 BIG-IP), software load balancers (such as NGINX or HAProxy running on ordinary servers), and DNS-based load balancers like the one ProScheduler started with. They can operate at Layer 4, routing on IP address and port, or at Layer 7, routing on request content such as URLs, headers, and cookies.
They opted for software-based Layer 4 load balancers, such as NGINX, which could handle more connections and provide basic load-balancing capabilities.
These load balancers work at the network level and can distribute traffic based on IP address and port number.
Traffic: ~500-1,000 requests/second
Load Balancer: Software Load Balancer (NGINX)
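The Layer 4 setup described above can be sketched as a minimal NGINX `stream` configuration. The backend addresses are hypothetical; the `stream` module balances raw TCP connections by IP and port, without inspecting the HTTP requests inside them:

```nginx
# Hypothetical Layer 4 (TCP) load balancing with the NGINX stream module.
stream {
    upstream proscheduler_backends {
        # Default algorithm is round robin across the listed servers.
        server 10.0.0.11:8080;
        server 10.0.0.12:8080;
    }
    server {
        listen 80;   # accept TCP connections on port 80
        proxy_pass proscheduler_backends;
    }
}
```

Because nothing above Layer 4 is examined, this setup is cheap per connection, which is exactly why it suits this phase of growth.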
What is NGINX?
NGINX (pronounced "engine-x") is an open-source, high-performance web server and reverse proxy software.
It is often used as a web server, load balancer, and reverse proxy, and it can also be used as a mail proxy and HTTP cache.
NGINX is designed to handle a large number of concurrent connections, making it a popular choice for high-traffic websites and web applications.
How does NGINX work?
The NGINX architecture can be divided into 3 main components: a master process, which reads the configuration and supervises the workers; a set of worker processes, each running an event loop that handles many connections concurrently; and helper processes such as the cache loader and cache manager.
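The process model above is controlled by a couple of well-known configuration knobs. A minimal sketch (values are illustrative, not tuned for ProScheduler):

```nginx
# One master process supervises `worker_processes` workers;
# each worker multiplexes up to `worker_connections` connections
# on a single event loop instead of using a thread per connection.
worker_processes auto;        # one worker per CPU core

events {
    worker_connections 1024;  # concurrent connections per worker
}
```

This event-driven design is why a single modest machine running NGINX can hold tens of thousands of concurrent connections.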
Phase 3: Growing pains
As ProScheduler's user base grew and demand for its schedule management tool increased, it began to experience growing pains in scalability and performance.
To address these issues, the startup decided to adopt a microservices architecture.
As the number of users and requests increased, the startup began to see heavy traffic on certain services, such as the user management service and the scheduling service.
To ensure that these services could handle the increased traffic, the startup implemented an F5 BIG-IP load balancer.
The F5 BIG-IP load balancer is used to distribute incoming traffic among multiple servers running each service.
By using a Layer 7 load balancer, the BIG-IP is able to route traffic based on the content of the request, rather than just the IP address or port.
This allows the BIG-IP to route traffic to the appropriate service based on the functionality being requested.
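The idea of content-based (Layer 7) routing can be sketched in a few lines of Python. The service names and path prefixes below are assumptions for illustration, not ProScheduler's actual API:

```python
# Hypothetical Layer 7 routing table: path prefix -> backend service pool.
ROUTES = {
    "/users": ["users-svc-1:8080", "users-svc-2:8080"],
    "/schedules": ["sched-svc-1:8080"],
}
DEFAULT_POOL = ["web-1:8080"]

def route(path: str) -> list[str]:
    """Pick a backend pool by inspecting the request path (Layer 7),
    rather than just the destination IP and port (Layer 4)."""
    for prefix, pool in ROUTES.items():
        if path.startswith(prefix):
            return pool
    return DEFAULT_POOL

print(route("/users/42"))       # user management service pool
print(route("/schedules/new"))  # scheduling service pool
```

A real BIG-IP (or NGINX `location` blocks) does the same matching on URLs, headers, or cookies, which is what lets each microservice scale independently.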
To handle the increased traffic, the startup configured the BIG-IP with load-balancing algorithms suited to its workload: Least Connections, which favors the least-busy server, and IP Hash, which keeps a given client pinned to the same server.
Traffic: ~2,000-3,000 requests/second
Load Balancer: Hardware Load Balancer (F5 BIG-IP)
What is F5 BIG-IP?
F5 BIG-IP is a hardware-based load balancer and application delivery controller (ADC) that helps to improve the availability, performance, and security of web applications.
Some terms to understand: a virtual server is the IP address and port on the BIG-IP that clients connect to; a pool is a group of backend servers (pool members) that can serve a given application; and health monitors are the probes the BIG-IP uses to decide which pool members are available.
The basic application delivery transaction is as follows: the client connects to the virtual server on the BIG-IP; the BIG-IP selects a healthy pool member according to the configured load-balancing algorithm; it forwards the request to that member and relays the response back to the client.
What are the different types of load-balancing algorithms?
There are several load-balancing algorithms; the most commonly used are Round Robin (rotate through the servers in order), Weighted Round Robin (rotate, but send proportionally more traffic to larger servers), Least Connections (pick the server with the fewest active connections), and IP Hash (hash the client's IP so the same client keeps landing on the same server).
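The two algorithms ProScheduler configured can be sketched as follows. The server names and connection counts are hypothetical; this is a sketch of the selection logic, not of any vendor's implementation:

```python
import hashlib

SERVERS = ["app-1", "app-2", "app-3"]   # hypothetical backend pool
active = {s: 0 for s in SERVERS}        # current active-connection counts

def least_connections() -> str:
    """Send the next request to the server with the fewest active connections."""
    return min(SERVERS, key=lambda s: active[s])

def ip_hash(client_ip: str) -> str:
    """Hash the client IP so the same client always lands on the same server."""
    digest = hashlib.sha256(client_ip.encode()).hexdigest()
    return SERVERS[int(digest, 16) % len(SERVERS)]

active["app-1"] = 5
active["app-2"] = 1
active["app-3"] = 3
print(least_connections())  # app-2 (fewest active connections)
assert ip_hash("198.51.100.7") == ip_hash("198.51.100.7")  # sticky per client
```

Least Connections adapts to uneven request costs, while IP Hash trades perfect balance for session stickiness, which is useful when state lives on the server.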
The Layer 7 load balancer provided several benefits for the startup.
For one, it allowed them to distribute traffic based on more advanced rules, which helped to improve the application's performance and reduce downtime.
Additionally, it provided more advanced load balancing capabilities, such as content-based routing and rate shaping, which helped to distribute traffic more efficiently.
Phase 4: Scaling globally
As the startup became a popular destination on the internet, it needed to further scale its infrastructure to handle even more traffic.
They implemented a multi-tier load balancing architecture, with a Layer 7 load balancer in front and multiple Layer 4 load balancers behind it, to distribute traffic to multiple web server clusters.
Traffic: ~10,000-15,000 requests/second
Load Balancer: Multi-tier Load Balancer (F5 BIG-IP + NGINX)
This approach allowed them to improve the performance and reliability of their application while also reducing downtime. In the multi-tier architecture, the Layer 7 tier handled content-aware work such as content-based routing and rate shaping, while the Layer 4 tiers behind it fanned requests out cheaply across the web server clusters, so no single balancer became a bottleneck.
Phase 5: Expanding globally
As the startup expanded globally, it faced new challenges in terms of managing and distributing traffic effectively.
To address these challenges, the startup implemented a combination of a traffic manager, a Content Delivery Network (CDN), and regional data centers.
A traffic manager such as F5 BIG-IP Global Traffic Manager (GTM) is a DNS-based load-balancing solution: it distributes traffic globally by answering each client's DNS query with the closest or best-performing data center for that client's location.
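The core decision a global traffic manager makes can be sketched with a small latency table. The region names, data center names, and latency figures below are all hypothetical:

```python
# Hypothetical region -> data center latency table (milliseconds).
# A DNS-based traffic manager resolves the same hostname to a
# different data center depending on where the query comes from.
LATENCY_MS = {
    "eu":   {"dc-frankfurt": 12,  "dc-virginia": 95,  "dc-singapore": 180},
    "us":   {"dc-frankfurt": 90,  "dc-virginia": 10,  "dc-singapore": 210},
    "apac": {"dc-frankfurt": 170, "dc-virginia": 200, "dc-singapore": 15},
}

def resolve_for(region: str) -> str:
    """Return the best-performing (lowest-latency) data center for a client region."""
    candidates = LATENCY_MS[region]
    return min(candidates, key=candidates.get)

print(resolve_for("eu"))  # dc-frankfurt
```

A production GTM also folds in health checks and capacity, so an unhealthy or saturated data center drops out of the answer even if it is the closest one.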
The startup also implemented a CDN (it could also have used a third-party vendor), a network of servers distributed across locations around the world that caches content close to users.
To further improve performance and availability, the startup also set up regional data centers in strategic locations around the world.
This allows the startup to store data and run services closer to users, reducing the latency and improving the performance of the application.
In conclusion, the fictional startup's journey illustrates the importance of load balancing and traffic management in ensuring the availability and performance of web applications and services. The solution has to be dynamic and adaptive, evolving with the growth of the business and the challenges that come along with it.
I write about System Design, UX, and Digital Experiences. If you liked my content, do kindly like and share in your network. And please don't forget to 'follow' for more technical content like this.