Why releasing code many times a day is better than a big bang once a month.
Update [28/07/2020]: It’s been a year and change since I wrote this and I’ve been asked a few times how much has changed. Rather than re-write this article, I’ve added an addendum to the end as this is very much a story that we continue to evolve as we grow and learn as an organisation.
We follow a strict policy of shipping code into production regularly; it is not uncommon for us to release new code into production 15 times a day. Deploying code regularly is considered best practice among fintechs like us, but it's absolutely not a race: this is about maintaining code quality and avoiding legacy. For a long time we were able to shape our direction without much outside inspection, but as we've begun working with partners and investors, we've had to explain what continuous deployment means a few times, and to that end I thought I'd set out why we favour releasing code many times a day.
No one starts a business with the objective of shipping code 10 or 15 times a day; this is a capability built up over time. It is a process you have to expect to maintain, in both software and hardware, as the code base increases and the engineering team grows. More people, more code, more releases.
As our business has grown, we've begun to work more closely with incumbents, firms with a significant and respected presence in the industry. Their history is to be admired, but it also means they face the same technology challenge from a different perspective.
Change control is a hot topic in many organisations. It typically requires a dedicated team to work with third-party vendors and internal stakeholders, following a prescribed set of regression tests, before new code can be released into production. That division of labour (first write the code, then hand the work over to be fully tested) means the human effort needed to complete the process is such that change has to be introduced slowly. As the process drags on, more features stack up, increasing the impact on production, and the greater the impact, the greater the risk. Hence the big bang release.
When we started Smart (five years ago today!), we didn't have to navigate a legacy code base; we started with a clean sheet, which afforded us the luxury of writing our own rules with regard to code quality. We also chose to build our applications in-house, giving us full control of our architecture and coding standards. As the first line of Smart code was written, it was submitted to a code repository along with a line of automated test coverage. This methodology is known as Test Driven Development (TDD): you write unit tests as part of writing the associated executable code, and each time you deploy the next new feature you run the existing tests at the same time, ensuring your new code doesn't break the code already running in production. This is known as Continuous Integration (CI): the more code you write, the more test coverage you have integrated with your application. An external CI server hosts your growing test suite and becomes the guardian of quality, allowing or preventing code moving forward into production. This doesn't remove the need for human verification, but it greatly reduces that step.
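To make the red-green loop concrete, here is a minimal, illustrative sketch in Ruby with RSpec; the class, file names and numbers are invented for this example and are not taken from the Smart code base. The failing test is written first, then the simplest code that makes it pass:

```ruby
# spec/fee_calculator_spec.rb: the test is written first and fails ("red").
RSpec.describe FeeCalculator do
  it "charges a flat percentage of each contribution" do
    calculator = FeeCalculator.new(rate: 0.005)
    expect(calculator.fee_for(contribution: 200.0)).to eq(1.0)
  end
end

# lib/fee_calculator.rb: then the simplest implementation that makes the test pass ("green").
class FeeCalculator
  def initialize(rate:)
    @rate = rate
  end

  def fee_for(contribution:)
    contribution * @rate
  end
end
```

In a CI setup, that same suite runs on every push, so a test written on day one still guards its feature years later.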
Having a high level of test coverage means you can deploy to production with confidence more often, but it also requires your engineers to keep the code they are working on current. As you write code and tests alongside other engineers, you need to ensure your branch stays in step with the code adjacent to yours. If another engineer is working on a similar feature branch that is ahead of yours in the process, you may need to rebase your code (update it with the latest changes) to pass the CI suite. To keep those rebases small and rare, it is important to commit changes to the master branch and deploy into production regularly. This little-and-often approach is known as Continuous Deployment (CD).
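As a rough sketch of that loop (the branch name is made up, and the exact commands depend on your git hosting and CI setup), a feature branch might go through something like this before it is merged and deployed:

```
git fetch origin
git rebase origin/master          # replay your commits on top of the latest master
bundle exec rspec                 # run the test suite locally before pushing
git push --force-with-lease origin add-payment-summary
# CI then runs the full suite; once it is green and the change is approved,
# the branch is merged to master and deployed to production.
```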
Front-loading your test coverage like this sounds expensive, but it in fact pays huge dividends later, as your business, and therefore your code base, grows and you want to release features more often. It's also something third-party vendors charge more for; they know they make good money on the consultancy and headcount behind the fixed regression tests that have already been paid for as part of a contract. As contrary as it sounds, it pays them to avoid best practice, thus embracing a path to legacy.
In the early days of Smart, we had a small team of five engineers that quickly doubled to ten. Ten is still a small team, but it is an important milestone in maintaining the CD pipeline: you are still able to lean across a desk and discuss a deploy with a colleague. Your deploy count is therefore relatively low, as your combined engineering and QA output means new releases into production are many a week, not many a day.
As we doubled again to twenty, we moved to shipping code many times a day, and this required more roles: more QAs and more people to manage the teams. This is an interesting test of the team's ability to scale. You need more engineers to write more code, but you also need more QAs to check each feature and people to carry out the acceptance testing. You can't have one without the other, and all links in the chain have to pull in the same direction.
As you put the support structure in place, you are able to grow horizontally, replicating teams by skill set. It wasn't until we passed 30 engineers that we introduced a Head of Engineering. This enabled that horizontal scaling just as we were expanding our engineering team to include two nearshore offices abroad.
I had actually been rather against the concept of farming our code base out to other, non-'Smart' people to work on. Perhaps overly protective of that code quality objective, I didn't want to lose sight of our guiding principles as we grew. So we flipped the model on its head: we flew to each location and individually interviewed, tested and recruited each person who works in our nearshore teams, following the identical process we use when recruiting here in the UK. Once hired, they were flown over to the UK office so they could be onboarded in a practical sense (contract and laptop) but also in a cultural sense. We wanted each new person, regardless of geography, to be part of our business, to think and work as we do, from the daily stand-ups to celebrating achievements in a bar.
This horizontal scaling of approach and management structure allowed the engineering team to double in size again, to 60. That meant we were writing more lines of code, but also more lines of test coverage, pushing the rate of deploys to production up to ten a day.
This CI/CD process also means we can onboard a new engineer and know they can ship quality code into production a few days after they start.
Whilst teams might still lean across a physical (or virtual) desk today to discuss a deploy with a fellow engineer, all deploys (once approved) are automated and monitored via a number of integrated services that we report on. We also use GitPrime to measure the various KPIs we want to track in this collaborative process. This ensures we can measure our progress, but also that all-important code quality goal.
As the business has grown, we have needed more features and more engineers. Each growth spurt has nudged up the per-day deploy count as we've held onto our code quality objective. Today we have more lines of test coverage than lines of executable code, which empowers our CD pipeline and means we can deploy with confidence more often; last time I checked, that was 15 deploys a day.
As I said before, the deploy count is not a race, nor is it a measure of success. It is simply indicative of the number of people we have working on our code base at any one time whilst maintaining a high standard of test coverage.
Last year we won an award for our continuous delivery process (yes, I was as surprised as you; who knew that was a thing!). We beat the likes of Lloyds Bank and Ocado, and I won't deny it was a fun night out with the Smart engineering team. So perhaps you can measure this as a form of success?
What I can say is that we are very fortunate to have been able to start with that clean sheet and a forward-thinking management team that appreciates the benefits of this process. This has been a guiding principle here at Smart from the outset and is one of the things that differentiates us from our nearest competitors.
Update [28/07/2020]: Since this article was written, the code base has grown significantly. It now powers three identical platforms, for Smart, Zurich and New Ireland Assurance (Bank of Ireland), each with local settings and infrastructure but deployed from the same CI/CD pipeline. To achieve this we have also had to grow the engineering headcount, to over 170 people across 16 teams. The increase in engineers working on the same codebase is noteworthy in three major areas.
1- Methodology. We have stuck to our principles; CI/CD remains the same, we just release more often each day, between 25 and 30 times. To keep this ticking over we have also strengthened the management and quality layer: more Engineering Managers have been hired as a direct multiple of teams being formed, and we have introduced Principal Engineers whose job it is to be the guardians of code quality.
2- Domains vs cooks in the kitchen. Our stack is Ruby, with a very strong focus on small units (small methods, in an object-oriented programming pattern). This gives us code that is easy to read, and therefore easy to edit and get into a PR; it's why new starters can deploy into production a few days after joining. Units of code are grouped together into contexts, payment, membership, investment and so on, and each bounded area can be viewed as a domain. But we are a monolith, perfectly wrapped up in test coverage, and there comes a point where you reach critical mass in terms of people versus the master branch. Our focus this year is to move to a Domain-Driven Design (DDD) pattern, which allows us to scale areas of context in a code sense, but also in a teams, and therefore people, sense (a simplified sketch of what this separation can look like follows after these three points). That means more hires, and therefore more deploys, per domain, whilst retaining the principles outlined above. Simply put, as the business scales, so does the headcount and so does the deploy count around a specific domain, without detriment to our methodology.
3- Culture. This final point is perhaps the most important. When Smart was formed, I took pride in hiring and cultivating each new starter. Trust is everything in engineering: if your team trusts you, they will feel empowered to do great things. They will also protect what they have achieved, guarding it against any missteps that take the codebase away from its origins. As we grew, I couldn't do this alone and had to share the responsibility with others, people I was fortunate to find were on the same wavelength as me. Martin Warner, our head of talent, along with his amazing team, knows the heartbeat of our organisation and is fluent in complementing our team with like-minded people. Brad Jayakody, our head of engineering, and I share an identical view on culture. A business can buy technical solutions or third-party integrations, but it cannot buy people who want to deliver at 110% and look after what they have strived to achieve. This is why a good culture is the most important spoke in the wheel of any engineering team that follows CI/CD, TDD and other best practices today. People make all the difference, regardless of how good your automation is.
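To illustrate point 2, here is a minimal, hypothetical sketch of bounded contexts inside a Ruby monolith; the module and method names are invented for this example and are not taken from the Smart code base. The idea is simply that each domain exposes a small public entry point, and other domains call that entry point rather than reaching into its internals, so a team can own a domain's code, tests and deploys without treading on the others.

```ruby
# Illustrative only: one way to express bounded contexts inside a Ruby monolith.
module Payments
  # Public entry point of the Payments domain.
  def self.collect(member_id:, amount_pence:)
    Collection.new(member_id: member_id, amount_pence: amount_pence).call
  end

  # Internal to the Payments domain; other domains never reference this class directly.
  class Collection
    def initialize(member_id:, amount_pence:)
      @member_id = member_id
      @amount_pence = amount_pence
    end

    def call
      # ...charge the member and record the payment...
      { member_id: @member_id, amount_pence: @amount_pence, status: :collected }
    end
  end
end

module Membership
  # The Membership domain depends only on Payments' public entry point.
  def self.enrol(member_id:, first_contribution_pence:)
    Payments.collect(member_id: member_id, amount_pence: first_contribution_pence)
  end
end

# A caller elsewhere in the application:
Membership.enrol(member_id: 42, first_contribution_pence: 10_000)
```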
If you agree with this principle, read Brad’s article on the culture at Smart here.