Why serverless makes sense ...

In the good old days, you ordered a server when you wanted to run a new application. You filled in a form with the operating system and version you wanted and the amount of RAM and disk you needed, and after a few days you got a server (unless we had run out of stock and the server had to be ordered). Virtualization changed the form and gave you a just-about-functional UI on which you ordered a virtual machine, and in most cases you got a shiny new VM in a few hours. Of course, if you wanted a bunch of servers to host your new distributed caching solution, you still ordered, then checked a week later with the infrastructure team, and then kept on checking every week, till in about a quarter you got the happy news that the new blades were installed in the data center.

Cloud changed that completely. You now had EC2 (cloud meant AWS in the early days) and if you had the divine admin right, you could get one VM (or more) in a few minutes. Not that EC2 did not bring its own set of problems: anyone who had the divine right misused it, and you got what is now called VM sprawl. All said and done, the ability to get a VM on demand from an elastic infrastructure was liberating for those who were not in the blessed infrastructure team. The ability to auto-scale your cluster based on load was magical; you no longer needed to pre-order hardware for the next 6 months or for the next sale/AppFest.

Which brings us to the question behind this post. If you can get VMs on demand and you can automatically scale your cluster as the load increases, why do you ever need serverless? Lambda, DynamoDB, AppSync, S3 - why do you need any of them? They definitely cost more than setting up your own Tomcat or MongoDB or MySQL on a set of EC2 VMs or EKS containers. Right?

Wrong. The total cost of running a serverless infrastructure will be less than the cost of setting up your own application servers and databases on VMs. You just need to calculate the total cost properly. The total cost needs to include:

  1. The cost of your VMs - obviously; and of the other cluster in a different AZ.
  2. Managing your AMIs and upgrading them to the latest version - oh yes, but it is not that frequent.
  3. Testing your upgrades - yeah, yeah, that is scary. But, again, not that frequent.
  4. Setting up circuit breakers and/or the appropriate timeouts in your code - yeah, that is extra code, but that was fun.
  5. The outage when we realized that we had forgotten to break the circuit in one of the paths, and spent the next two days isolating and fixing it instead of working on the planned items - that was definitely not fun.
  6. The day our MongoDB went down because a new query resulted in a full scan, as we had forgotten to add that secondary index.

and so on, and so forth.
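To make item 4 concrete, here is a minimal sketch of the kind of circuit breaker a team ends up writing and maintaining on a VM-based stack. It is an illustration only: the class name, the thresholds and the injectable `clock` are assumptions made for this example, not code from any particular library.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    """Hypothetical minimal breaker: opens after N consecutive failures."""

    def __init__(self, max_failures=3, reset_after_s=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after_s = reset_after_s
        self.clock = clock          # injectable so the cool-down can be tested
        self.failures = 0
        self.opened_at = None       # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after_s:
                # Fail fast instead of tying up a thread on a sick dependency.
                raise CircuitOpenError("circuit open; failing fast")
            # Cool-down elapsed: allow one trial call (half-open state).
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0           # any success closes the circuit fully
        return result
```

Every outbound path has to be wrapped in something like `breaker.call(fetch_inventory, sku)` (where `fetch_inventory` is a made-up function). Item 5 is what happens when one path is missed.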

When we plan our sprints, we keep separate bandwidth to handle such issues, the ones where we forgot to take care of something. This bandwidth varies, but let us take 20% as a ballpark for adding the extra code and the troubleshooting. That is 20% of the total CTC (cost to company) of the team, and once you add that to the overall cost, you get the full picture.

So, we will say that once more, this time in italics

The total cost of ownership of a VM-based system must also include the cost of writing and maintaining any and all infrastructure-related code, and the time you spend on solving performance issues.
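As a rough worked example of that calculation, the sketch below adds the 20% ballpark to the infrastructure bill. The function name and the monthly figures are made up purely for illustration; plug in your own numbers.

```python
def vm_total_cost(infra_cost, team_cost, overhead_fraction=0.20):
    """Infrastructure bill plus the share of team cost (CTC) that goes
    into infrastructure code and troubleshooting."""
    if not 0.0 <= overhead_fraction <= 1.0:
        raise ValueError("overhead_fraction must be between 0 and 1")
    return infra_cost + team_cost * overhead_fraction


# Hypothetical monthly figures, all in the same currency.
infra_bill = 4_000      # VMs in both AZs, storage, load balancers
team_ctc = 50_000       # total cost to company of the team

total = vm_total_cost(infra_bill, team_ctc)
print(f"VM bill alone: {infra_bill}, full picture: {total:.0f}")
```

With these assumed numbers the hidden 20% is larger than the VM bill itself, so the serverless bill should be compared against `total`, not against `infra_bill`.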

Now why does serverless not need most of this infrastructure cost? In a serverless architecture, a bad request only impacts itself. If a request being processed in a Lambda function gets slow, it does not slow down other requests. If you make a bad query to DynamoDB that leads to a large scan, other good queries are not penalized. Contrast this with the thread dump from your stuck server, where one bad path has taken up all your threads because you did not segregate them properly; or with one bad query to Mongo/MySQL bringing down the DB and impacting all other queries.
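On a server, "properly segregating" threads usually means some form of bulkhead: a cap on how many requests each code path may have in flight, so that a stuck path cannot starve the others. The sketch below is one minimal way to do it, assuming two made-up paths named `report` and `checkout` with arbitrary limits.

```python
import threading


class Bulkhead:
    """Caps the number of concurrent requests on one code path."""

    def __init__(self, limit):
        self._slots = threading.BoundedSemaphore(limit)

    def try_enter(self):
        # Non-blocking: returns False at once when the path is saturated.
        return self._slots.acquire(blocking=False)

    def leave(self):
        self._slots.release()


# Hypothetical paths and limits, chosen only for this example.
bulkheads = {"report": Bulkhead(2), "checkout": Bulkhead(8)}


def admit(path):
    """Return True if a request on `path` may start processing."""
    return bulkheads[path].try_enter()


# Two slow report requests get stuck and never call leave().
admit("report")
admit("report")

print(admit("report"))    # False: the bad path is full and rejects new work
print(admit("checkout"))  # True: the good path is unaffected
```

This is the isolation you have to build, tune and remember to apply yourself on a server; in the serverless case described above, it comes with the platform.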

This is the primary reason why serverless will win except in some niche cases - and btw, this does not include services like Kendra that are like 10X more expensive than what they should be.

So, what are the scenarios where serverless is not a good idea?

  • When the serverless service really does not do what you want to get done. For example, AWS CloudSearch simply does not do the job that we expect Elasticsearch to do.
  • When your context requires really high performance that needs tinkering with infrastructure - live streaming, access to raw filesystems etc.
  • When you are in the business of infrastructure management and you really need your team to be experts in infrastructure - you could still pick and choose in this case.

I have used server-based middleware pretty much my whole life but now I am converted. Serverless should be the default answer; you can make an exception as long as you are clear why you need the exception. And if your reason is cost, double check it, and then check it once more.

Subrat Panda, PhD

CTPO @ AGNEXT | AI Expert | PhD in Computer Science | IIT KGP | Ex - Capillary |Quality Food for Billions

2 years ago

Very well written. Agreed with the thought process.

Zafar Ahmed Ansari

Engineer ? Philosopher ! Writer ?

2 years ago

The assumption here is that serverless somehow automagically takes care of the maintenance pieces like circuit breakers and updates to the runtime. I somehow could not understand how serverless is reducing costs from these aspects, as the article makes no mention of that. These would still exist as added costs in the serverless world, wouldn't they?
