To retry or not: that is the question
Flaky tests are a big problem for an effective continuous integration setup. We want our CI to run all tests and continuously provide feedback about changes going into main. But flaky tests disrupt that flow and make developers question the usefulness of CI.
The worst outcome in this situation is to let CI ‘pass’ while allowing failures:
allow_failure: true
It's a slippery slope: developers no longer know whether tests are failing, passing, or flaky, and they start ignoring tests altogether. A slightly less bad outcome is the ‘manual’ retry mindset, i.e. the first instinct of a developer who sees a failure in the pipeline is to hit the retry button instead of figuring out the error.
The focus of this article is automatic retry, i.e. deciding when it is OK to programmatically retry a failed pipeline. We want a test to fail a pipeline only when there is truly a problem, not because of external issues.
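As a rough illustration of that distinction, here is a minimal Python sketch of a retry wrapper. The names `run_with_retry` and `TRANSIENT_ERRORS` are hypothetical, and treating `ConnectionError` and `TimeoutError` as the ‘external’ failures is an assumption; a real setup would pick its own list.

```python
import time

# Assumption: these exception types stand in for "external issues".
# Adjust the tuple to whatever counts as transient in your environment.
TRANSIENT_ERRORS = (ConnectionError, TimeoutError)


def run_with_retry(step, max_attempts=3, delay_seconds=0.0,
                   transient=TRANSIENT_ERRORS):
    """Run `step`, retrying only when it raises a transient error.

    Any other exception, including an AssertionError from a genuine
    test failure, propagates immediately without a retry.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return step()
        except transient:
            if attempt == max_attempts:
                raise  # still failing after the last attempt: surface it
            time.sleep(delay_seconds)
```

The point of the sketch is that the retry is selective: a network blip gets another attempt, while a failed assertion fails the run on the first try, so a real regression is never hidden behind a retry.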
Criteria for flakiness
Tests can be flaky for a number of reasons, but broadly speaking they can be broken down into three aspects.
Retries can be applied in various ways, and the right approach depends on the criteria above. Be mindful of the tradeoffs between test complexity, codebase maintenance (legacy vs. active), and the impact of a flaky pipeline failure.
Recommendations
Conclusion
Like many other software development practices, retrying tests should be based on evaluating tradeoffs and principles. Don’t be blinded by blanket arguments that retry is always a bad pattern.