Unclog your software delivery pipelines
Jason Baker
Leading the delivery of modern cloud services which are essential to the health of our democracy.
Software testing is typically the biggest bottleneck in any continuous delivery process. High-performing software organizations invest significant effort in making testing activities highly automated and repeatable. Traditional software organizations are often encumbered by labor-intensive manual testing which is non-deterministic and error-prone. The reality is that an organization cannot become a high performer without addressing its software testing challenges.
The two key software testing challenges most companies face are making the testing process performant and making it resilient. Testing must be performant because it has a direct impact on software delivery velocity and incident remediation. If you want a fast software delivery pipeline which supports a continuous flow of changes, you can't introduce a significant blockage in the pipeline. Many traditional organizations require a full day to execute an entire test suite, and several days isn't uncommon in my experience. Measured by cost of delay, testing becomes one of the most expensive parts of the entire software development process.
Why are traditional software testing processes so slow? They are slow due to a lack of test automation and a lack of investment in an efficient testing strategy. All high-performing software delivery organizations use automated testing processes in their software pipelines which require no human intervention. In other words, engineers aren't required to initiate software tests or review the results. Tests run automatically as part of the promotion of code through the software delivery pipeline, as in the sketch below.
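To make the idea concrete, here's a minimal sketch of that kind of automated gate, assuming a pytest-based test suite; the promote_build step and artifact name are hypothetical stand-ins for whatever promotion mechanism your pipeline actually uses:

```python
import subprocess
import sys

def run_test_suite() -> bool:
    # Run the entire automated suite; no engineer initiates it or reads a report.
    result = subprocess.run(["pytest", "--quiet"])
    return result.returncode == 0

def promote_build(artifact: str) -> None:
    # Hypothetical promotion step: hand the artifact to the next pipeline stage.
    print(f"Promoting {artifact} to the next stage")

if __name__ == "__main__":
    if run_test_suite():
        promote_build("my-service-1.2.3")
    else:
        sys.exit("Tests failed -- promotion blocked")
```

The point is the shape of the flow: a passing suite promotes the build with no human in the loop, and a failing suite stops the pipeline cold.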
Traditional software organizations incur a large testing performance penalty due to the long wait times associated with software tests. Someone has to "push the button" to alert testing team members that testing is needed. Each testing team member has to schedule time to initiate testing for their particular platform domain. They may need to run test suites multiple times to successfully execute all of the tests. Some tests must be performed manually because the work required to automate them is incomplete. Finally, testing team members must manually review test reports to identify the root cause of test failures.
A lack of automation around software testing is a clear form of technical debt, and it's a type of debt which high-performing software organizations find unacceptable. They require teams to supply automated integration and functional tests along with new feature code. Again, this is because a lack of automated testing impacts the flow of service delivery. Manual and unreliable tests add friction to software delivery pipelines, impeding the flow of change like a drainpipe which accumulates debris over time.
It's a mistake to assume that simply automating all the tests will lead to performant software delivery pipelines. The type of tests used in the delivery pipeline can have a critical impact. Many traditional software organizations make the mistake of focusing their testing efforts on functional tests which drive a web user interface, using something like the Selenium framework. This type of testing is expensive to create, time-consuming to execute, and challenging to maintain because it's inherently fragile. Any minor change in the UI design, or a timing change in an asynchronous process, can wreak havoc with UI-based functional tests.
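To illustrate the fragility, here's a sketch of a typical UI-driven functional test in Python with Selenium; the URL, locators, and expected total are all hypothetical:

```python
import time

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://shop.example.com/checkout")  # hypothetical storefront

# Brittle locator: any redesign of the page layout breaks this XPath.
driver.find_element(By.XPATH, "//div[3]/form/button[text()='Pay now']").click()

# Hard-coded wait: if the asynchronous payment call takes 2.1 seconds, the test fails.
time.sleep(2)

# Business logic validated through the slowest, most fragile interface available.
assert driver.find_element(By.ID, "order-total").text == "$42.00"

driver.quit()
```

Every line above encodes an assumption about page layout or timing that has nothing to do with the business rule actually being tested.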
Focusing testing efforts on UI functional tests eventually leads to the infamous ice-cream cone testing anti-pattern. Companies suffering from this anti-pattern end up with hundreds of unit tests and thousands of functional tests. While the unit tests take seconds or minutes to run, the functional tests take hours or days. This sort of testing imbalance is difficult and expensive to correct, and it becomes a real constraint for organizations hoping to adopt a continuous delivery strategy.
Using a UI testing framework to validate business logic is like taking a car engine off the production line and mounting it in a car just to test it. Can you imagine how expensive this sort of testing process would be for a car manufacturer? Engine manufacturers build test interfaces into engines so that a test harness can be attached while the engine is still on the line. In effect, manufacturers use specially built test interfaces to validate components and speed up manufacturing.
High-performing software organizations prioritize unit tests above all other forms of testing. Unit tests are narrowly scoped and extremely performant. They also provide software developers with fast feedback, aligning well with Lean's shift-left quality assurance philosophy. High-performing organizations minimize the use of UI testing frameworks, instead leveraging API and other middleware-based testing tools. The advantage of API testing is that it's easily scripted, it's generally very performant because it eliminates the overhead of an application client, and it allows teams to validate business logic and API contracts. In many cases these organizations build special API endpoints specifically for testing the software application -- kind of like the test interface on a car engine.
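As a sketch of the contrast, the same business rule can be validated both as a unit test and through an API contract test; the discount rule, endpoint, and field names below are all hypothetical:

```python
import requests

def apply_discount(total: float, code: str) -> float:
    # The business rule under test (hypothetical).
    return round(total * 0.9, 2) if code == "SAVE10" else total

def test_apply_discount_unit():
    # Unit test: validates the rule directly, in microseconds, with no I/O.
    assert apply_discount(100.0, "SAVE10") == 90.0
    assert apply_discount(100.0, "BOGUS") == 100.0

def test_discount_api_contract():
    # API test: validates the same rule plus the contract, with no browser overhead.
    resp = requests.post(
        "https://api.example.com/v1/quote",  # hypothetical test endpoint
        json={"total": 100.0, "discount_code": "SAVE10"},
        timeout=5,
    )
    assert resp.status_code == 200
    assert resp.json()["total"] == 90.0  # contract: field name and value
```

Neither test knows or cares how the checkout page is laid out, which is exactly why both survive a UI redesign that would break a Selenium suite.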
An efficient testing strategy ultimately looks like thousands of unit tests, hundreds of API/functional tests, and dozens of UI tests. The organization can execute the entire test suite in well under an hour. This sort of strategy allows the organization to rapidly promote new features and address critical production defects in a timely manner, without sacrificing quality or incurring more risk.
Non-deterministic tests are another significant risk to any organization attempting to implement a continuous delivery process. What is a non-deterministic test? It's a test which intermittently fails for no obvious reason. The problem organizations face when evaluating these tests is that they cannot be certain whether a failure signals a real defect or is simply noise. In other words, they can derive no information or value from the test -- only risk. If the team dismisses a genuine failure as noise and promotes the code, they might inadvertently introduce a serious defect into production. If the failure was spurious, the organization just wasted time investigating it and slowed down the software delivery process.
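Here's a minimal sketch of what such a test can look like in practice; the scheduling rule is a hypothetical example of a hidden wall-clock dependency:

```python
import datetime

def seconds_until_midnight() -> int:
    now = datetime.datetime.now()
    midnight = datetime.datetime.combine(
        now.date() + datetime.timedelta(days=1), datetime.time.min
    )
    return int((midnight - now).total_seconds())

def test_reminder_scheduling():
    # Passes almost every run, then fails whenever the suite happens to execute
    # within an hour of midnight -- a hidden dependency on wall-clock time.
    assert seconds_until_midnight() > 3600
```

The failure looks random to anyone reading the pipeline report, because nothing in the test's name or assertion hints at the time-of-day dependency buried inside it.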
Imagine you were a software engineer working on the Boeing 737 Max flight control software and some of your test results were non-deterministic. Would you want your family to fly on a plane running that software? Likely not. What if you were a patient in a hospital? Would you want your doctors and nurses to communicate through software which produced non-deterministic test results?
Traditional software organizations try to reassure stakeholders by hiding non-deterministic tests in reports showing that an arbitrary number of software tests passed. For example, their test reports will show that 90% of the software tests passed but say nothing about the tests which failed. Even if 99% of the tests passed, the 1% which didn't could represent the most important functionality in the software platform. Handing software with failing tests to customers requires the organization to make a judgement call. Humans have to decide whether those failing test results were meaningful, and unfortunately humans make mistakes.
High-performing software companies accept nothing less than 100% passing test results in a continuous delivery pipeline. It's not possible to automate away non-deterministic tests, so they must be removed from the testing process. Many organizations implement a two-stage process where non-deterministic tests are first moved to a quarantine test suite which is no longer executed as part of the automated pipeline. The tests in the quarantine suite are carefully analyzed to see if they can be repaired, and quarantined tests which cannot be repaired are quickly discarded.
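One common way to implement the quarantine stage is with test markers; the sketch below assumes pytest and a project-defined "quarantine" marker (registered in pytest.ini), both of which are choices rather than requirements:

```python
import pytest

def compute_total(items):
    return sum(items)

@pytest.mark.quarantine
def test_inventory_sync():
    # Intermittently fails for no obvious reason; pulled from the pipeline
    # while the team decides whether it can be repaired or should be discarded.
    ...

def test_inventory_totals():
    # Deterministic tests remain in the automated suite.
    assert compute_total([2, 2]) == 4
```

The pipeline then runs pytest -m "not quarantine", so the automated suite stays at 100% passing while the quarantined tests are analyzed on the side.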
My experience is that traditional software organizations struggle to implement continuous delivery without making significant changes to their testing strategy. The organization must focus on educating team members about the numerous performance penalties associated with UI-based automated testing. It must be willing to abandon tests which produce non-deterministic results, regardless of the level of personal investment made in creating those tests. And at the end of the day, the success of a new testing strategy ultimately depends on the organization's commitment to fast and continuous software delivery.