DevOps for Large Programs - It is Not How the Articles Say
Cliff Berg
Co-Founder and Managing Partner, Agile 2 Academy; Executive level Agile and DevOps advisor and consultant; Lead author of Agile 2: The Next Iteration of Agile
Pretty much every article that one reads about DevOps and continuous delivery (CD) talks about having a “pipeline”. A DevOps pipeline is an automated sequence of build, deployment, and test steps, orchestrated by a tool such as Jenkins or VSTS. While a pipeline is not actually central to DevOps (see my article on that, Continuous Delivery is Not a Pipeline), it is important.
But what is almost always missing in large programs that claim to use continuous delivery is a “pipeline of pipelines” - a set of orchestrated integration tests that pull together all of the pieces of an entire system and test them together, end-to-end. Ideally, this should run on a frequent basis, such as nightly.
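To make the idea concrete, here is a minimal sketch of what the orchestrating job in such a pipeline of pipelines might do each night. The component names, deploy script, and test path are hypothetical placeholders, and in a real program this logic would typically live in the CI tool (Jenkins, VSTS, and so on) rather than in a standalone script:

```python
# A hypothetical nightly "pipeline of pipelines": deploy the latest passing
# build of every component into a shared integration environment, then run
# the end-to-end suite against the assembled system.

import subprocess
import sys

COMPONENTS = ["orders-service", "billing-service", "inventory-service"]  # hypothetical names


def run(cmd):
    print("::", " ".join(cmd))
    return subprocess.run(cmd, check=False).returncode


def main():
    # 1. Deploy each component's latest build that passed its own team pipeline.
    for component in COMPONENTS:
        if run(["./deploy.sh", component, "--env", "integration"]) != 0:
            sys.exit(f"deployment of {component} failed; aborting the integration run")

    # 2. Run the end-to-end integration tests against the whole system.
    sys.exit(run(["pytest", "tests/integration", "--maxfail=20"]))


if __name__ == "__main__":
    main()
```

The essential point is that the job fails loudly if any component cannot be deployed or if the end-to-end tests fail, so integration problems surface every day instead of at the end of a release.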
Instead, consider what I see most often in the organizations that I visit: if they claim to do “DevOps”, what they actually do is team-level continuous integration and program-level waterfall.
A typical large enterprise IT program has a set of teams - often up to twenty in a single program or initiative. Some of the teams exist to support a particular component or subsystem, while other teams work across components - these are “feature teams”. Say there are ten major components or subsystems: there are then at least ten pipelines - one for each component.
That is all well and good, but most organizations use a waterfall-era process for integrating the many components: they have an “integration phase” during which they “go into QA”. In the QA environment, each team carefully deploys its component(s), and then testers poke around, trying things out. In contrast, a true DevOps organization has an automated integration test suite that deploys all of the system’s components and tests them together. Such a suite can be run on a regular basis, providing valuable feedback on integration problems throughout development.
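To illustrate, here is a sketch of one test that might live in that suite. The endpoints and service names are hypothetical, and it assumes the third-party requests package; what matters is that the test exercises a business flow across several components, in the environment where they were all deployed together:

```python
# A hypothetical end-to-end check: place an order through the public API and
# verify that the (invented) orders, billing, and inventory services produced
# a consistent result between them.

import os
import time

import requests  # assumes the third-party 'requests' package is installed

BASE = os.environ.get("INTEGRATION_BASE_URL", "http://integration.example.internal")


def eventually(fetch, ok, attempts=30, delay=2):
    """Poll for an asynchronous outcome - event-driven systems rarely settle instantly."""
    for _ in range(attempts):
        response = fetch()
        if ok(response):
            return response
        time.sleep(delay)
    raise AssertionError("condition not met within the polling window")


def test_order_flow_spans_orders_billing_and_inventory():
    # Place an order through the public API.
    order = requests.post(f"{BASE}/orders", json={"sku": "ABC-1", "qty": 2}, timeout=10)
    assert order.status_code == 201
    order_id = order.json()["id"]

    # Billing should eventually produce an invoice for the order.
    eventually(lambda: requests.get(f"{BASE}/orders/{order_id}/invoice", timeout=10),
               lambda r: r.status_code == 200)

    # Inventory should reflect the reservation.
    stock = eventually(lambda: requests.get(f"{BASE}/inventory/ABC-1", timeout=10),
                       lambda r: r.status_code == 200)
    assert stock.json()["reserved"] >= 2
```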
Having only isolated pipelines is a form of cargo cult DevOps. It is not unlike Bob in the movie What About Bob?, tied to the mast of a sailing ship, shouting to his therapist, “I sail - I’m a sailor!” Imagine him clinging to his pipeline, shouting, “I do CD!”
Well, maybe that is too harsh. In fact, if you have team pipelines, you have taken a huge step, and it is not too big a step to go to the next level: automated integration tests, which make true continuous delivery possible. Getting there requires leadership, though. It likely will not happen on its own; someone with authority usually needs to organize the discussions and decision-making needed to make it happen.
Too often I see that a lot of effort has gone into coordinating the definition of business-level epics, features, and stories, and apportioning those to teams, but not enough effort into technical coordination. The Agile mantra of “the team knows best” and the belief that teams will “self-organize” seem to get in the way of providing the technical leadership that is needed to organize around technical practices such as automated integration testing.
One of the arguments against integration testing is that if one thoroughly tests one’s component-level contracts, then integration testing is superfluous. However, I have never seen that work. Component contract testing is essential, but so is integration testing. Even the best component contracts do not define all of the assumptions that go into code. In addition, today’s architectures use very granular components - microservices and events - and so much of the system’s architecture is in the “outer architecture” - outside of the code, and therefore unspecified by contracts. Much of the behavior is time-dependent, with events flowing in certain sequences. Contracts do not define temporal behavior. It must be tested, and all of the failure scenarios - with events out of sequence or undelivered - must be tested as well. That’s integration testing.
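As a small, hypothetical illustration (the component and event names are invented), here is the kind of integration test that catches temporal problems which per-message contract tests cannot:

```python
# A minimal sketch of a temporal integration test. Each event, taken alone,
# satisfies its contract; the behavior being tested is how the consumer
# handles events arriving out of sequence, or not arriving at all.


class ShipmentProjection:
    """Builds a read model from 'order_placed' and 'payment_confirmed' events,
    which may arrive in either order (or with one missing)."""

    def __init__(self):
        self.orders = {}  # order_id -> {"placed": bool, "paid": bool}

    def handle(self, event):
        state = self.orders.setdefault(event["order_id"],
                                       {"placed": False, "paid": False})
        if event["type"] == "order_placed":
            state["placed"] = True
        elif event["type"] == "payment_confirmed":
            state["paid"] = True

    def ready_to_ship(self, order_id):
        state = self.orders.get(order_id, {})
        return state.get("placed") and state.get("paid")


def test_events_arriving_out_of_sequence():
    projection = ShipmentProjection()
    # Payment confirmation arrives before the order event - a legal race in an
    # event-driven system, but invisible to per-message contracts.
    projection.handle({"type": "payment_confirmed", "order_id": "42"})
    assert not projection.ready_to_ship("42")
    projection.handle({"type": "order_placed", "order_id": "42"})
    assert projection.ready_to_ship("42")


def test_undelivered_event_is_not_treated_as_success():
    projection = ShipmentProjection()
    projection.handle({"type": "order_placed", "order_id": "43"})
    # The payment event never arrives; the order must not become shippable.
    assert not projection.ready_to_ship("43")
```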
Besides integration testing, another important practice of effective continuous delivery is having a way to assess or ensure enough coverage for each category of tests. For unit tests, pipelines routinely measure code coverage. However, while unit tests may be sufficient for a simple Web application, they are not sufficient for a complex system: one needs integration tests at each level of the system hierarchy. For each of those test suites, there needs to be a way to ensure that the test suite is comprehensive enough. (See my four-part article series Real Agile Testing in Large Organizations.)
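As a minimal sketch of what such a gate might look like at the lower levels of the hierarchy - the suite paths, package name, and thresholds are hypothetical, and it assumes pytest with the pytest-cov plugin:

```python
# A hypothetical per-suite coverage gate. Code coverage is only a meaningful
# bar for the lower-level suites; for the higher levels, coverage still has to
# be judged against the requirements (see below).

import subprocess
import sys

SUITES = {
    # test suite path  -> minimum line coverage (%)
    "tests/unit":         80,
    "tests/component":    60,
}

for path, minimum in SUITES.items():
    result = subprocess.run(
        ["pytest", path, "--cov=myapp", f"--cov-fail-under={minimum}"],
        check=False,
    )
    if result.returncode != 0:
        sys.exit(f"{path} failed, or its coverage fell below {minimum}%")
```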
I just had a long conversation about this with one of the developers of the Hygieia dashboard tool. The tool has lots of features for a single pipeline, but it is missing features for the next level - the integration level. Perhaps they will add those next. I hope they do - it is a great tool. It is hard, though, because there are no clear-cut metrics for test coverage at the functional/behavioral level. There is no analog of “code coverage” for integration tests - one can measure code coverage for them (which is worth doing), but high code coverage does not prove that the test suite covers all of the requirements.
What I have done with many teams is to make sure that there is an experienced, Agile-minded test lead on each team, and at the program level as well. Those individuals review the Gherkin test specs that get written and identify additional test scenarios to cover edge cases. This provides a level of quality control that raises the functional coverage to a sufficient level. It does not yield a metric, but it does manage risk - which is the ultimate goal.
Beware of oversimplification of continuous delivery. A colleague of mine calls this “PowerPoint DevOps” - practices explained with nice slides but without real-world nuance. For effective continuous delivery, the pipelines follow the architecture, with test suites at each major interface, and the test strategies are tailored to what is being built. The tech leads collaborate in an orchestrated manner to continually refine their architecture and their development, deployment, and test practices, using dashboard metrics such as feature-level lead time, production defect rate, and integration test cadence to radiate their progress. These critical issues are not left “to the teams” to self-organize; technical leadership continually facilitates the discussions and retrospectives on technical practices, rather than stepping back and hoping that they will happen on their own.
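As a small illustration of one of those metrics, feature-level lead time can be computed directly from when work on a feature started and when it reached production; the feature IDs and dates below are made-up sample values:

```python
# A sketch of one dashboard metric: feature-level lead time. In practice the
# timestamps would come from the work-tracking and deployment tools, not a
# hard-coded list.

from datetime import date
from statistics import median

features = [
    {"id": "FEAT-101", "started": date(2024, 3, 1),  "released": date(2024, 3, 18)},
    {"id": "FEAT-102", "started": date(2024, 3, 4),  "released": date(2024, 4, 2)},
    {"id": "FEAT-103", "started": date(2024, 3, 10), "released": date(2024, 3, 29)},
]

lead_times = [(f["released"] - f["started"]).days for f in features]
print(f"median feature lead time: {median(lead_times)} days")
```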