Understanding Scalability: A Deep Dive
Credits - Google Gemini

Imagine attending a sold-out music festival featuring your favorite artist. The stage lights up, and the crowd erupts in cheers as the performer takes the stage. But what if the festival organizers underestimated the demand and booked a sound system designed only for a small, intimate gathering? The audio would be muffled and distorted, losing its clarity and ruining the experience for thousands of fans.

Similarly, when software is not designed to scale, it can't handle increased traffic or user load, leading to performance issues, errors, and ultimately, a poor experience. Scalability refers to the ability of a system to efficiently handle growing demands without compromising on performance, ensuring that every user can enjoy the music (or in this case, access the software) with clarity and quality.

Scalability is just as critical to the performance of a system as reliability, which I covered in the last edition.

The Interconnection of Reliability and Scalability

Scalability refers to a system’s ability to handle an increased load effectively. Interestingly, reliability and scalability are interconnected in many ways. For instance, a system that performs reliably for 10k concurrent users may not necessarily maintain the same performance level with 200k concurrent users.

The ‘load’ in a system can be defined by various parameters. In some systems, it could be the number of concurrent users, requests per second, writes per millisecond into a database, or the number of reads on a cache. In others, the load is better characterized by corner-case scenarios.

For example, consider YouTube. A popular channel like Mr Beast’s has 288M subscribers, which means certain parts of the system, such as new-video notifications, must handle this extreme load. In contrast, an average channel may have fewer than 10 subscribers. Similarly, on X/Twitter, posting a tweet may scale linearly, but fanning the tweet out to all followers can be a heavy-load operation.
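
To make the fan-out point concrete, here is a minimal, purely illustrative sketch of fan-out-on-write, where publishing a post triggers one timeline write per follower. This is not how X/Twitter or YouTube actually implement it; the names (`followers_of`, `home_timelines`, `publish_post`) are hypothetical stand-ins.

```python
# Minimal sketch of fan-out-on-write for a post (illustrative only; not
# the real X/Twitter or YouTube design). All names here are hypothetical.
from collections import defaultdict

# Toy in-memory stand-ins for a follower graph and per-user home timelines.
followers_of: dict[str, set[str]] = defaultdict(set)
home_timelines: dict[str, list[str]] = defaultdict(list)

def publish_post(author: str, post_id: str) -> int:
    """Write the post once, then fan it out to every follower's timeline.

    The write itself is O(1); the fan-out is O(number of followers), which
    is why an account with hundreds of millions of followers puts a very
    different load on the system than an average account with a handful.
    """
    fanned_out = 0
    for follower in followers_of[author]:
        home_timelines[follower].append(post_id)
        fanned_out += 1
    return fanned_out

# Usage: an account with 3 followers triggers 3 timeline writes; an account
# with 288M followers would trigger 288M of them.
followers_of["small_channel"] = {"u1", "u2", "u3"}
print(publish_post("small_channel", "post-42"))  # -> 3
```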

Performance Metrics: Percentiles over Averages

Batch systems usually measure performance by throughput, while online systems usually measure it by response time. Performance metrics are typically reported as percentiles rather than averages. They are notated as follows:

  • 50th percentile, notated as P50 (the median)
  • 95th percentile, notated as P95
  • 99th percentile, notated as P99
  • 99.99th percentile, notated as P99.99

The reason for this is to understand outliers, particularly tail latencies like P99. For instance, consider a retail website with 100M active users per week, where an average order (fewer than 10 items) completes with a P99.99 latency of 300ms. That means the slowest 0.01% of users (roughly 10,000 at that scale) experience latency higher than 300ms, and that could mean 2 seconds or even 10 seconds. If those users happen to be heavy buyers, either with large numbers of items in the cart or with a few high-value items (like the diamond rings sold at Costco), then these tail-end latencies directly affect the revenue of the business. Perhaps an additional backend anti-fraud check is causing the latency, but the end-user experience still suffers.
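
To illustrate why averages hide the tail, here is a minimal sketch that computes these percentiles over a synthetic latency distribution. All the numbers below are made up purely for illustration.

```python
# Minimal sketch: why percentiles, not averages, describe latency.
import random
import statistics

random.seed(7)
# Simulate ~100k request latencies in ms: most are fast, and a small tail
# (e.g., requests that hit an extra anti-fraud check) is very slow.
latencies = [random.gauss(120, 30) for _ in range(100_000)]
latencies += [random.uniform(2_000, 10_000) for _ in range(100)]  # slow tail

def percentile(samples: list[float], p: float) -> float:
    """Return the p-th percentile (0 < p < 100) of the samples."""
    ordered = sorted(samples)
    index = min(len(ordered) - 1, int(len(ordered) * p / 100))
    return ordered[index]

print(f"mean   : {statistics.fmean(latencies):7.1f} ms")   # hides the tail
print(f"P50    : {percentile(latencies, 50):7.1f} ms")
print(f"P95    : {percentile(latencies, 95):7.1f} ms")
print(f"P99    : {percentile(latencies, 99):7.1f} ms")
print(f"P99.99 : {percentile(latencies, 99.99):7.1f} ms")  # the slow outliers
```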

Designing for Scalability

Some approaches to design for scalability include:

  • Vertical Scaling: Adding more resources to a single node (a more powerful CPU, more memory, or lower-latency storage such as flash drives, possibly with reduced RAID overhead)
  • Horizontal Scaling: Adding multiple nodes and fronting them with a load balancer to distribute the requests.
  • Hybrid: Keeping a mix of both these strategies
  • Elastic scaling: A system that automatically adds or removes capacity based on the load (a minimal sketch of such a scaling decision follows this list).
  • Predictive scaling: To ensure seamless performance during peak periods, we can proactively add nodes or capacity ahead of anticipated high-traffic events, such as major shopping holidays (e.g., Black Friday), global online sales promotions (like Amazon Prime Day or Singles' Day in China), or popular sporting events (like the World Cup). This provides a level of predictability that elastic scaling alone cannot, offers operational simplicity, and minimizes the risk of last-minute scrambles to scale up during critical periods.
  • Caching: Techniques such as offloading static content to Content Delivery Networks (CDNs), moving processing to the clients, and distributed caching can be used so that compute is spent only where necessary.
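
As referenced above, here is a minimal sketch of a threshold-based elastic-scaling decision. The thresholds and node limits are arbitrary illustrative values; a real setup would use a cloud provider's autoscaling service with smoothing and cooldown windows rather than this toy logic.

```python
# Minimal sketch of a threshold-based elastic-scaling decision
# (illustrative only; thresholds and limits are arbitrary).
from dataclasses import dataclass

@dataclass
class ScalingPolicy:
    min_nodes: int = 2           # never go below this (keeps redundancy)
    max_nodes: int = 50          # budget/architecture ceiling
    scale_out_cpu: float = 0.70  # add a node above 70% average CPU
    scale_in_cpu: float = 0.30   # remove a node below 30% average CPU

def desired_node_count(current_nodes: int, avg_cpu: float,
                       policy: ScalingPolicy) -> int:
    """Decide how many nodes we want, given average CPU utilization (0..1)."""
    if avg_cpu > policy.scale_out_cpu:
        target = current_nodes + 1      # scale out under load
    elif avg_cpu < policy.scale_in_cpu:
        target = current_nodes - 1      # scale in when idle
    else:
        target = current_nodes          # stay put inside the band
    return max(policy.min_nodes, min(policy.max_nodes, target))

# Usage: at 85% CPU on 4 nodes, the policy asks for a 5th node.
print(desired_node_count(4, 0.85, ScalingPolicy()))  # -> 5
```

In practice this decision would be fed to an orchestrator or cloud autoscaling API rather than applied directly.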

It is to be noted that there is no one-size-fits-all solution. Each application requires a unique design, informed by a deep understanding of its users' characteristics, usage patterns, and underlying assumptions. For instance, a streaming service will require a distinct solution from a payment tech system, which in turn differs significantly from a social media application. As our assumptions evolve or are disproven by changing user behaviors or market demands, it is crucial that we remain vigilant and prepared to reevaluate and evolve our architecture accordingly.

So, how do we measure Scalability?

When evaluating the scalability of a system, it is essential to consider reliability alongside it. To achieve optimal results, we must measure and analyze various metrics under both simulated and actual load conditions.

The following quantified metrics are crucial in assessing a system's scalability (a minimal sketch of deriving them from request logs follows the list):

  • Error rates and crash rates: Identifying potential bottlenecks and areas for improvement.
  • Response time for key use cases: Ensuring that critical functions operate efficiently.
  • Requests/transactions per second: Measuring the system's ability to handle high volumes of traffic.
  • Resource utilization: Monitoring CPU, memory, and other resource usage to prevent overload.
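
As a rough illustration, here is a minimal sketch that aggregates error rate, P95 response time, and requests per second from a batch of request records. The record shape is a hypothetical stand-in for whatever your load-test tool or access logs actually emit; resource utilization would come from host metrics rather than request logs.

```python
# Minimal sketch: deriving scalability metrics from request records.
# The record format (timestamp, latency_ms, status) is a hypothetical
# stand-in for real access logs or load-test output.
from dataclasses import dataclass

@dataclass
class RequestRecord:
    timestamp: float   # seconds since the test started
    latency_ms: float
    status: int        # HTTP-style status code

def summarize(records: list[RequestRecord], window_seconds: float) -> dict:
    latencies = sorted(r.latency_ms for r in records)
    errors = sum(1 for r in records if r.status >= 500)
    p95_index = min(len(latencies) - 1, int(len(latencies) * 0.95))
    return {
        "error_rate": errors / len(records),                   # errors/crashes
        "p95_latency_ms": latencies[p95_index],                 # response time
        "requests_per_second": len(records) / window_seconds,   # throughput
    }

# Usage with three fake records over a 1-second window.
sample = [RequestRecord(0.1, 120, 200),
          RequestRecord(0.5, 340, 200),
          RequestRecord(0.9, 80, 503)]
print(summarize(sample, window_seconds=1.0))
```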

In addition to these metrics, we also continuously test and monitor any auto-scaling or pre-scaling models in place. This ensures that our systems adapt effectively to changing demands.

However, not all systems scale linearly, due to various choke points along the request control flow. In such cases, it is essential to understand these limitations thoroughly in order to architect and budget resources accordingly. For instance, if network bandwidth becomes the bottleneck, provisioning more bandwidth may be a relatively cost-effective solution.
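
One classic way to reason about why a system stops scaling linearly is Amdahl's law: if some fraction of each request is serialized behind a choke point (a single-writer database, a shared lock), adding nodes eventually stops helping. A rough sketch, with a purely illustrative 10% serialized fraction:

```python
# Rough sketch of Amdahl's law: speedup from N nodes when a fraction
# `serial` of the work cannot be parallelized. The 10% figure below is
# purely illustrative, not a measured value.
def amdahl_speedup(nodes: int, serial: float) -> float:
    return 1.0 / (serial + (1.0 - serial) / nodes)

for n in (1, 2, 4, 8, 16, 64, 1024):
    print(f"{n:5d} nodes -> {amdahl_speedup(n, serial=0.10):5.2f}x speedup")
# With 10% serialized work, even 1024 nodes top out near 10x:
# past a point, adding capacity buys very little.
```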

By adopting this quantifiable approach to scalability, we can ensure that our systems are not only highly available but also able to handle increased demands with confidence.

This is a perfect point to pause and go to the next topic - Maintainability, which I will cover in my next article.

Further Reading

For more insights into how large-scale systems handle these challenges, check out this blog post on how Walmart handles trillions of Kafka messages.

Also, Meta’s engineering blogs are a pleasure to read for techies. Read this one to understand how they think about their systems - Maintaining Large-Scale AI Capacity at Meta.

Jennifer DiFrancesca

Principal, Technical Data Operations at Dun & Bradstreet - Certified Kanban Practitioner

4 months ago

I was left giggling at the visual used. Thank you! Ha! You are 100% right. I just experienced something similar while watching World Superbike racing streaming this AM. Watching an entire race in chunky pixels was not the experience I was going for. Ha. You nailed it. Hyperscaling and Expectations. Since most of us live in Hyperscaling Environments, we EXPECT Hyperscale Performance. Most companies do not run under a Hyperscaler's Budget. This makes understanding these variables and points VITAL to designing a System that meets SLAs/KPIs. Thanks for the content.
