?? No need to scale out a DB if it can run on one large instance?

?? No need to scale out a DB if it can run on one large instance?

Whenever sharding over SQL databases or distributed SQL databases is brought up, some DBAs argue that horizontal scaling is unnecessary because you can run a single, large database on a single, powerful machine to handle massive workloads.

For instance, on AWS, you can provision an Amazon Aurora instance db.x2g.16xlarge with 64 vCPU, 1024 GB RAM, and 128 TB storage. You can run your massive workload on it, but that doesn't mean you necessarily want to.

Large instances can be quite expensive, and it's not always necessary to provision one upfront just in case of a peak in workload. This can happen due to business activity or a runaway query. Additionally, running such an instance all year round can be costly if your activity only peaks on a monthly or seasonal basis.

In the past, we used to do this on-premises, resulting in expensive bills for software licenses that covered a large number of mostly idle CPU resources. Oracle made a lot of revenue from idle processors because customers had to pre-provision for the highest peak of activity, plus a margin of error due to the fact that capacity planning is not an exact science.

Today, we require resource sharing, virtualization, network storage, and private or public cloud with elasticity to reduce the cost, especially when moving to the cloud.

Scaling is necessary, but do you think it's possible to scale up without scaling out? Of course, you could do it if you paused your application, but nobody wants to pause the database, especially when there's a peak in usage. For example, even with Amazon Aurora, Aurora Serverless may pause connections to migrate to another virtual machine when scaling up, and Aurora Limitless may pause write operations when re-sharding.

You not only need elasticity, but you also require High Availability. When working with a monolithic database, the only option is to provision a new standby or read replica on a larger instance and perform a switchover. However, even a brief switchover results in downtime for the application. If system downtime occurs frequently for cost reduction purposes, it can impact the SLA guarantee by adding to the unavoidable downtimes for system patching or upgrades.

It's crucial to scale your application without any downtime, particularly when the reason for scaling is an increase in high application usage. Having to stop all activity for a minute and switch over to a replica with a cold cache is not the ideal solution at this point.

To summarize, even if scaling vertically seems sufficient, you need horizontal scaling to do it. If your database cannot scale out, your only solutions are permanently provisioning a large instance, which impacts costs, or having frequent downtime to change the instance size, which impacts availability.

Distributed SQL databases are crucial for managing large databases with high throughput that must run on more than one machine. However, even for smaller databases and use cases with seasonal peaks, horizontal scalability is necessary to ensure elasticity and resilience. The idea that "you can run a lot on a single instance" is true, but ignores the cost and availability requirements.


Sikkandar Badusha

Principal Engineer - Database, Hybrid and Multi-cloud Database Strategy Specialist

1 年

Very true, JFYI we scale up (vertical - online) and scale out (requires downtime) the monolithic database using VMware. Usually we run Standby with a lesser number of cores and switchover takes around 2 minutes. This is where applications are required to be paused, but I agree it compromises high availability.

要查看或添加评论,请登录

Franck Pachot的更多文章

社区洞察

其他会员也浏览了