Deciding how frequently to deploy

David Van Couvering

Senior Principal Architect at eBay

Published: October 31, 2024

I was talking with a colleague last week about whether they should increase or decrease their deploy frequency. They were worried that deploying more frequently could cause more bugs. They also were concerned because it takes time and effort to do a deploy.

There is a well-known optimization curve in queueing theory that describes the optimal batch size. You need to balance the holding cost (the cost of holding off on delivering a batch) against the transaction cost (the cost of processing a batch).
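One way to make that curve concrete is the classic economic-order-quantity style model, sketched here in Python. This is only an illustration, and it rests on assumptions that are not part of the original discussion: changes are merged at a steady rate, every deploy has the same fixed cost, and every change that sits waiting costs a fixed amount per day. The function name and all of the numbers are made up.

```python
def daily_cost(batch_size, change_rate, transaction_cost, holding_cost):
    """Average cost per day when each deploy ships `batch_size` changes.

    Hypothetical model: costs are in arbitrary units, rates are per day.
    """
    # Fewer, bigger batches mean fewer deploys per day.
    deploys_per_day = change_rate / batch_size
    transaction_part = transaction_cost * deploys_per_day
    # Changes pile up linearly between deploys, so on average
    # half a batch is sitting unshipped at any moment.
    holding_part = holding_cost * batch_size / 2
    return transaction_part + holding_part


# Made-up inputs, purely for illustration.
CHANGE_RATE = 5.0        # changes merged per day
TRANSACTION_COST = 40.0  # cost of one deploy
HOLDING_COST = 1.0       # cost of one change waiting one day

costs = {
    size: daily_cost(size, CHANGE_RATE, TRANSACTION_COST, HOLDING_COST)
    for size in range(1, 61)
}
best_batch = min(costs, key=costs.get)

for size in (1, 5, 10, 20, 40, 60):
    print(f"batch of {size:2d} changes -> {costs[size]:6.2f} per day")
print(f"cheapest batch size in this toy model: {best_batch}")
```

With these invented numbers the total cost is high at both extremes and bottoms out at a batch of 20 changes, where the transaction and holding parts are equal. Real costs are rarely this tidy, but the U shape is the point.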

A deployment is a batch of code you are shipping.

Our code does not take up space in a warehouse, but there are still a number of significant elements to our holding costs (this is also called the cost of delay):

The cost of uncaptured value - when software isn't shipped, it's not returning any value. The longer you wait, the more money you lose by not getting the benefits. It's also possible that the longer you wait, the smaller the actual benefit will be, because the market is shifting.
The cost of delayed feedback - when you wait a long time to ship, you're also waiting to get feedback. This almost always means that the cost of rework is higher once you realize that what you shipped isn't having the impact you expected.
The cost to quality - the larger the batch size, the harder it is to identify the root cause when a bug occurs. This can impact customers if you aren't able to fix the problem in production, but even if you can roll back quickly, it's a ton more work for your team to figure out what went wrong and why. I love what Adrian Cockcroft from AWS once said at a QCon session I went to: "when your code has a lot of bugs in production, you should be deploying more frequently."
The impact to morale - the longer it takes to get code out, the less engaged your team is. They just don't get the sense that they are having an impact. Also, nobody enjoys manually shepherding a deploy out when they could be slinging code and getting stuff done.

Our transaction cost is about how hard it is to deploy our software. Do we need manual approvals? Do we need to manually evaluate flaky test failures? Does it take an inordinate amount of time to run all our tests? Do we need to set up our environments?

If the cost of doing a deploy is high, you'll need a larger batch size - you'll be forced to have fewer deploys with more code in them.
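To get a feel for how strongly deploy cost drives batch size, here is a minimal Python sketch using the textbook square-root formula for the batch size that minimizes transaction cost plus holding cost. It assumes a steady rate of incoming changes, a fixed cost per deploy, and a fixed cost per change per day of waiting. None of these numbers or names come from a real team; they are hypothetical.

```python
import math


def optimal_batch_size(change_rate, transaction_cost, holding_cost):
    """Batch size minimizing T * r / B + h * B / 2 (hypothetical model).

    r = changes per day, T = cost per deploy, h = cost per change per day.
    Setting the derivative to zero gives B = sqrt(2 * T * r / h).
    """
    return math.sqrt(2 * transaction_cost * change_rate / holding_cost)


CHANGE_RATE = 5.0   # made-up: changes merged per day
HOLDING_COST = 1.0  # made-up: cost of one change waiting one day

for transaction_cost in (2.5, 10.0, 40.0, 160.0):
    batch = optimal_batch_size(CHANGE_RATE, transaction_cost, HOLDING_COST)
    days_between_deploys = batch / CHANGE_RATE
    print(
        f"deploy cost {transaction_cost:6.1f} -> "
        f"batch of {batch:4.1f} changes, "
        f"one deploy every {days_between_deploys:3.1f} days"
    )
```

Under these assumptions, quadrupling the cost of a deploy doubles the optimal batch size, and making deploys cheaper shrinks it just as predictably. That is the lever: lower the transaction cost and smaller, more frequent deploys become the economical choice.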

This is why we need to be so careful about being too careful. Approval gates and slow or manual tests will force you to have larger batch sizes with all their resulting holding costs, including costs to quality.

This is why I always encourage teams to aggressively go after the things that are making their deploys expensive.

Build automated pipelines. Eliminate manual approvals. Reduce the number of flaky and slow tests using techniques such as testing in isolation, testing in parallel, having high priority tests vs additional tests you can run less frequently, and so on. Make your PRs smaller and easier to review and roll out.

Have this audacious goal: you check in some code, it goes to production, and it's a non-event.

Not only is coding more fun that way, but you'll reduce those holding costs and deliver significantly more value for your customers and your business.

David Van Couvering

Senior Principal Architect at eBay

4 months ago

So I'd say: if you have a high transaction cost, then maybe slow down the deploys, but recognize the consequence: you incur higher holding costs. The real thing to prioritize when an error budget is broken is to increase testing while maintaining or reducing transaction costs, rather than just slowing down.

Suneel Saguturu

Experienced Software Engineering Leader

4 months ago

Thanks David Van Couvering for sharing your thoughts on this topic. I still remember when we were discussing deploying master/main branch code to production while we were working at Castlight Health.

Vengada Karthik Rangaraju

Experienced Backend & SRE engineer. Engineering Leader @ Twilio

4 months ago

Nice article DVC! Something I read in the Google SRE book is to decide the frequency of deploys based on error budgets. If frequent deploys cause error rates to go up on a service, it could be one indicator to slow things down. It's also a great way to balance/push back on Product Managers who try to ship way too many features in a short span, and to focus on tech debt.
