
Chasing the elusive Continuous Deployment

Thomas (Tom C) Chmielewski

Vice President of Product Management - Improving Existing Portfolios, and Designing & Launching New Products & Services

Published: June 20, 2018

How many of my Product Management colleagues still deliver releases on a quarterly (or even longer) release cycle? Most of the teams I managed did just that.

Yet a surge of companies have moved to continual releases, outdistancing their competitors. Google, Macy’s, Amazon, Facebook, Etsy, Target, Nordstrom, and Netflix routinely and reliably deploy code into production hundreds, or even thousands, of times per day.

What? Not possible! HOW? THAT CAN’T BE !!

The DevOps Handbook explains how to replicate these incredible outcomes by showing how to integrate Product Management, Development, QA, IT, Ops, and InfoSec to improve delivery and win in the marketplace.

There are three phases to accomplish this:

The principle of FLOW
The principle of FEEDBACK
The principle of CONTINUAL LEARNING AND EXPERIMENTATION

Here is my three-part article summary of the process.

Part one is about the principle of FLOW.

Phase 1=THE PRINCIPLE OF FLOW – the theory of constraints

Most of us start off in our status quo job of long hours, weekend work, a backlog of technical debt, never seeming to catch up, too many requirements, not enough sprints. It seems like we work towards opposing goals, feeling powerless, followed by burnout, with the associated feeling of fatigue, cynicism, and even helplessness and despair.

Let’s talk about HP’s LaserJet Firmware division, which has 400 developers. They weren’t getting much new development done. Marketing/Product Management had hundreds of product ideas each year. Development said ‘we can do two – we only have capacity for two of your ideas.’ Does this sound like a familiar tune? HP went through the DevOps transformation. They moved from spending just 5% of capacity on new features to being able to allocate 40% of capacity to new features. We would all love that delivery velocity.

Let’s start.

Phase 1 - The principle of FLOW

The first step is to map out the entire sequence of events, from identifying a feature or customer request all the way to delivery to the client. (Do NOT stop at deployment; include the client’s implementation and use as well, because deployed code that isn’t used is like unsold merchandise on a shelf: it is not in the customer’s hands yet.) We typically follow the standard process: get ideas from customers, maybe at the annual user conference; some time later write them up and turn them into requirements (Epics); break them down into stories; put them into the backlog; prioritize them; groom them; start to code them and get asked a ton of clarifying questions; then test, merge, test again, and deploy. And then have a fix-it / maintenance release. There are ways to speed this up.
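One way to make that map concrete is to record, for each stage, how long the work is actively touched and how long it sits waiting. Here is a minimal Python sketch. The stage names and day counts are invented for illustration and are not figures from the book.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    work_days: float  # time someone is actively working on the item
    wait_days: float  # time the item sits in a queue before this stage starts


def lead_time(stages):
    """Total elapsed days from idea to customer use."""
    return sum(s.work_days + s.wait_days for s in stages)


def flow_efficiency(stages):
    """Share of the lead time spent on actual work (0.0 to 1.0)."""
    total = lead_time(stages)
    return sum(s.work_days for s in stages) / total if total else 0.0


def longest_wait(stages):
    """The stage with the longest queue: a first place to look for a constraint."""
    return max(stages, key=lambda s: s.wait_days)


# Hypothetical numbers for illustration only.
value_stream = [
    Stage("write requirements", work_days=3, wait_days=30),
    Stage("backlog and grooming", work_days=2, wait_days=45),
    Stage("code", work_days=10, wait_days=5),
    Stage("test and merge", work_days=5, wait_days=10),
    Stage("deploy", work_days=1, wait_days=20),
    Stage("customer adoption", work_days=4, wait_days=15),
]

print(f"lead time: {lead_time(value_stream)} days")
print(f"flow efficiency: {flow_efficiency(value_stream):.0%}")
print(f"longest wait before: {longest_wait(value_stream).name}")
```

Even with rough estimates, a table like this usually shows that most of the elapsed time is waiting, not working, which is where removing constraints pays off first.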

Google had this scenario. They had infrequent code deployments. After the transformation they went from infrequent code changes to 40,000 code commits a day and 50,000 builds a day! And we all know Google stuff works in production every day. We all use it. If they can make the change, you can too.

Step two is to ensure your environment is consistent. To make the process work, you will need production-like environments at every step of the way. QA servers exactly match production servers. Dev matches QA. NO excuses. This isn’t the 1970s; hardware is inexpensive compared to teams of developers and the cost of production failures. Get the team to build scripts which in turn build environments automatically. Servers should be built in 5 minutes (I worked for a company where it took two months to get a server built – insane). Version control is even more important for operations than for development, because there are an order of magnitude more configuration settings. Then take any server fixes (fix-forwards) and always move them back into trunk. It should be easier to build a new server than to fix one. This is the puppies / cattle discussion. Some companies treat a server like a puppy, and try everything they can to get the server right and to keep it right. Other groups treat a server like cattle: just shoot it and build a new one. (No animals were harmed in this discussion, and it’s not my metaphor, so please don’t send complaints to me.) If you have the standard scripts along with version control, it is far easier, quicker, and safer to build a new server than to try to fix one. How long do your servers live? How long does it take to get a new server?
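To illustrate scripted, version-controlled environments, here is a minimal Python sketch. The setting names and values are hypothetical, and in practice a configuration management tool would do this work. The point is that one checked-in definition drives every environment, and any server that differs from it is flagged for rebuild rather than repaired by hand.

```python
# Desired server definition, kept in version control (hypothetical settings).
BASELINE = {
    "os_image": "ubuntu-22.04",
    "python": "3.11",
    "max_connections": 500,
}


def find_drift(baseline, actual):
    """Return {setting: (expected, found)} for every setting that differs."""
    drift = {}
    for key in sorted(set(baseline) | set(actual)):
        expected, found = baseline.get(key), actual.get(key)
        if expected != found:
            drift[key] = (expected, found)
    return drift


def servers_to_rebuild(baseline, fleet):
    """Names of servers whose settings no longer match the baseline."""
    return [name for name, actual in fleet.items() if find_drift(baseline, actual)]


# Dev, QA and production are all compared against the same definition.
fleet = {
    "dev-1": dict(BASELINE),
    "qa-1": dict(BASELINE, max_connections=200),  # someone hand-tuned this one
    "prod-1": dict(BASELINE),
}

print(servers_to_rebuild(BASELINE, fleet))
```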

I get movies through Netflix. The average life of a Netflix AWS server is 24 days, most of them just a week. Netflix routinely kills and replaces production server instances, just to prevent configuration drift. That ensures that the servers are all the same (no snowflake servers; every snowflake is different), and that manually applied changes/fixes aren’t propagated forward and persisted.
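A recycling policy of this kind can be sketched in a few lines of Python. This is not Netflix's tooling. The 24-day limit borrows the figure quoted above purely as an example threshold, the instance names and launch dates are invented, and the real terminate-and-rebuild calls are left out.

```python
from datetime import datetime, timedelta

MAX_AGE = timedelta(days=24)  # example threshold, not a recommendation


def due_for_replacement(launch_times, now, max_age=MAX_AGE):
    """Return names of instances older than max_age, oldest first."""
    expired = [(launched, name) for name, launched in launch_times.items()
               if now - launched > max_age]
    return [name for launched, name in sorted(expired)]


now = datetime(2018, 6, 20)
launch_times = {  # hypothetical fleet
    "web-a": datetime(2018, 6, 15),
    "web-b": datetime(2018, 5, 1),
    "web-c": datetime(2018, 5, 20),
}

for name in due_for_replacement(launch_times, now):
    print(f"replace {name} with a freshly built instance")
```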

Now that you have your environment set, step three is to build a fast and reliable automated test suite. That is what Google had to do. Google has over 120,000 automated test scripts, and they run 75M test cases daily. None of the “log in and play around and see if anything looks funky and make check marks on a spreadsheet and we will see if it is a bug or not and try and fix it” stuff that many companies do. You need Test Driven Development (TDD).

You need to catch errors via automated testing as early as possible. Run the tests quickly, and in parallel if possible. In Test Driven Development you write the automated tests before you write the code. Automate as many of the existing manual tests as possible. You need to integrate performance testing into the test suite as well. As an example, without database indexing, page loads could grow from milliseconds to thirty seconds, and if the code makes multiple database calls, the network traffic could increase tenfold. And, contrary to popular belief, TDD coding is efficient. IBM Almaden Labs determined that TDD code was 60%-90% better in terms of defect density than non-TDD code, while taking only 15%-35% longer. So in the best case, 15% more time buys 90% better code. A big win. Macy’s went from executing 1,300 manual tests every ten days to ten automated tests for every code commit. Yes, Macy’s as in the department store. If their IT shop can make the transformation, your company can too.
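As a small illustration of the test-first order, here is a sketch in Python using the standard unittest module. The function page_count and its rules are a made-up example. The test class is written first, from the requirement, and the function is then written to make it pass.

```python
import unittest


# Step 1: the tests, written first from the requirement.
class PageCountTest(unittest.TestCase):
    def test_exact_multiple(self):
        self.assertEqual(page_count(items=40, per_page=20), 2)

    def test_partial_last_page(self):
        self.assertEqual(page_count(items=41, per_page=20), 3)

    def test_no_items_means_no_pages(self):
        self.assertEqual(page_count(items=0, per_page=20), 0)

    def test_rejects_non_positive_page_size(self):
        with self.assertRaises(ValueError):
            page_count(items=10, per_page=0)


# Step 2: the simplest code that makes the tests pass.
def page_count(items, per_page):
    """Number of pages needed to show `items` rows, `per_page` rows each."""
    if per_page <= 0:
        raise ValueError("per_page must be positive")
    return -(-items // per_page)  # ceiling division


# Run the suite on every change; a build server would do the same per commit.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(PageCountTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Because the tests exist before the code, every later change to page_count is checked automatically instead of by someone clicking around.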

Step four – move from a monolithic code base to a modular code base. I thought we learned this in the 1990s with object oriented programming. A tightly coupled architecture can impede everyone’s productivity and ability to make changes safely. You know the scenario: “if I make a change here, I am not sure what else it will affect, or where.” A loosely coupled architecture with well-defined APIs that enforce how modules connect with each other promotes production safety. I know that if I stay within the API for this module, and with my automated test suite, I can safely make changes without screwing up anything else. THIS is how we get to multiple deployments in a day. Etsy starts their process at 8AM using a chat room for coordination. They run 4,500 unit tests in one minute, and 7,000 automated regression tests in about eleven minutes. Etsy practices continuous development/continuous deployment.
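Here is a minimal Python sketch of what staying within the API can look like. The tax example and all class names are hypothetical. The checkout code depends only on a small published interface, so an implementation can be changed or swapped without touching the caller.

```python
from typing import Protocol


class TaxCalculator(Protocol):
    """The published API: the only thing callers may rely on."""

    def tax_for(self, amount_cents: int) -> int: ...


class FlatRateTax:
    """One implementation; its internals can change freely behind the API."""

    def __init__(self, rate_percent: int) -> None:
        self._rate_percent = rate_percent

    def tax_for(self, amount_cents: int) -> int:
        return amount_cents * self._rate_percent // 100


class NoTax:
    """A second implementation, swapped in without touching the caller."""

    def tax_for(self, amount_cents: int) -> int:
        return 0


def order_total(amount_cents: int, tax: TaxCalculator) -> int:
    """Checkout module: depends on the TaxCalculator API, not on a class."""
    return amount_cents + tax.tax_for(amount_cents)


print(order_total(1000, FlatRateTax(8)))
print(order_total(1000, NoTax()))
```

The team that owns the tax module can rewrite it and deploy on its own schedule, as long as tax_for keeps its contract and the automated tests still pass.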

Step five – finally, when you deploy, do what Facebook does: run a canary test by deploying to a small set of live servers. If it works well for X period of time, then deploy to the rest of the thousands of servers. CSG International, one of the largest bill printing companies in the United States, runs their services hundreds of times a day with realistic data and traffic before going into production. They got their ‘development to production’ time down from two weeks to daily. Eventually, deployments became so routine that the Operations team was playing video games at the end of the day. Production incidents were down 91%, and MTTR was down 80%.
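The canary idea can be sketched as a short Python function. This is an illustration under stated assumptions, not Facebook's implementation. The deploy, rollback, and is_healthy callables are hypothetical hooks you would supply, and a real rollout would watch the canaries over an observation period and not just check them once.

```python
def canary_rollout(servers, deploy, rollback, is_healthy, canary_count=2):
    """Deploy to a few servers first; continue only if they stay healthy.

    Returns True if the new version reached every server, False if the
    canaries failed and were rolled back.
    """
    canaries, rest = servers[:canary_count], servers[canary_count:]

    for server in canaries:
        deploy(server)

    if not all(is_healthy(server) for server in canaries):
        for server in canaries:
            rollback(server)
        return False  # the remaining servers were never touched

    for server in rest:
        deploy(server)
    return True


# A tiny in-memory fleet standing in for real servers.
versions = {f"web-{i}": "v1" for i in range(1, 7)}


def deploy(server):
    versions[server] = "v2"


def rollback(server):
    versions[server] = "v1"


ok = canary_rollout(list(versions), deploy, rollback, is_healthy=lambda s: True)
print(ok, versions)
```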

So, Phase 1 is THE PRINCIPLE OF FLOW – the theory of constraints. Understand your ideation to deployment flow. Remove constraints. Work towards a loosely coupled architecture, with Test Driven Development, and the automation of building of servers. Test early and often, and work towards deploying frequently. I hate to say it but this isn’t rocket science – I have been in this industry for over twenty years – what your management team needs to accomplish this is a conviction to do it, and discipline to execute it.

Phase 2 – the Second Way – The Technical Practices of Feedback is next week.

For more information, see:

2017 State of DevOps report

https://puppet.com/resources/whitepaper/state-of-devops-report?pcnav=off&pctiles=off&ls=Campaigns&lsd=Search&cid=7010f0000017Khn&gclid=CjwKCAjw06LZBRBNEiwA2vgMVVvqja9kNXKMRmnoIOKospZAIJB72iMPF4x7Ei2QhYjNwcWBn8OiehoC9wMQAvD_BwE

Binh NGUYEN

Infrastructure Leader (DataCenter & Cloud Operations, Azure/AWS/GCP/OCI Certified, Security, FinOps, Service Delivery Management, SAP Basis)

6 years ago

I already order this Devops handbook ...waiting with impatience its delivery !


Thomas (Tom C) Chmielewski

Vice President of Product Management - Improving Existing Portfolios, and Designing & Launching New Products & Services

6 years ago

For those of you who haven't taken the time to read the book....
