Load Testing in the Fog of WAR
Executing Load, Performance and Stress tests can be a challenge, especially when there is significant pressure to go live while testing is constantly uncovering serious problems. We all want to see the application we are testing 'take off' - with plenty of 'runway' to spare - but who wants to accelerate down a runway when it is shrouded in fog and the amount of remaining runway is not even visible?
Fog is a useful metaphor during the final stages of testing a critical enterprise application, especially when the overwhelming focus of management attention revolves around the question of "When will the Application be Ready?". Fog obscures what a Performance Tester needs to see, and at the same time it obscures what management needs to see. In the context of this post, "fog" represents the uncertainty that surrounds the Go/No-Go decision, and the question of how one should best proceed in the midst of the Fog of War.
When some load tests are passing but others are failing, visibility of the key capabilities of the System Under Test is often limited. I call this the "Can Work" vs "Will Work" problem. If testing demonstrates that the system CAN meet the performance requirements (in some tests) but repeatedly fails the 'Endurance Test' for various reasons, then we don't really know whether it WILL work for prolonged periods in a production scenario, or whether we are simply experiencing issues related to the test environment. Those wanting to Go-Live will take the good news (CAN work), while the more risk averse will not be happy until testing clearly demonstrates that the application WILL work.
Unfortunately, when a project is beset with serious performance or stability problems, even simplified load tests can fail, and it is clear that the application "CAN NOT" operate as required. Depending on the management approach, this can result in demands for more planning, more testing and more reporting. However, such actions are likely to be counterproductive, as they are directed at the validation process rather than at the teams that are empowered to resolve the underlying technical problems. Running the same test multiple times in the hope that one run may produce a 'pass' is illogical.
As frequent flyers will know, fog sometimes appears unexpectedly at sunrise and throws airline flight schedules into turmoil. However, if the fog is already very thick well before sunrise, it will probably persist long after sunrise, and it will still play havoc with airline schedules. It is a little like the 'uncertainty' that becomes obvious in a project well before testing even commences. If the Load Testing landscape is foggy before testing starts, it will probably remain foggy throughout the entire testing process.
If, for example, a complex enterprise system has incomplete requirements, partially delivered application code, poor provision for a realistic test environment and a very aggressive schedule, then all of these factors conspire to multiply the degree of project uncertainty, making for a very 'foggy' project. The number of lurking 'unknown unknowns' is probably high, and the probability of multiple slips in the 'go-live' date is also high. It is a little like an airline that approves a takeoff at one airport knowing that the destination is very foggy. The airline 'hopes for the best', mitigates some of the risk by loading a couple of extra hours of fuel, burns that fuel while circling near its destination, and is still diverted to an alternate airport - or, even worse, back to its origin!
So, what is the best course of action when one realises that a Load Testing project is suddenly 'fogged in'? I suggest the following:
- Switch from 'Formal' test execution mode back to 'shakeout' mode. The easiest way to do this is to highlight that the 'entry criteria' for load testing are not currently satisfied - even though they may have appeared to be satisfied at an earlier point in time.
- Simplify the tests to reduce script maintenance overhead and total execution time. For example, keep just the 'main happy path' and set up a test scenario that can be executed within an hour, rather than multiple variants of the 'happy path' and a test duration of several hours. Complexity and longer test runs can be progressively added as the system becomes more stable. (A minimal sketch of such a cut-down scenario appears after this list.)
- Consider running tests with lower targets, to quantify the effective capacity of the system. Identifying that the solution can deliver 70% of the target load sends a very different message to a result showing it can only deliver 20%.
- 'Slice and dice' the tests to target very narrow components of the application or infrastructure, to 'screen' for problem areas to focus on.
- Generate a simple Test Run Report for each and every run, that articulates the changes in the application and the environment since the previous run, the results, and any significant observations that may give clues to the root cause of problems. Attempt to release such a report within a couple of hours of completing the test, and make the report widely available.
- Temporarily abandon the original test plan, and re-plan each day's tests on a day-by-day basis, choosing the highest value tests that can be run in the current situation, with a view to identifying the root cause of the most serious issues and validating solutions as they are implemented. Don't schedule tests for the sake of filling the day with test runs; only plan a test if it will add value to the problem resolution process.
- Consider adding special internal 'instrumentation' to test scripts, to provide better insight into problems. For example, rather than simply flagging that a transaction has failed (which is normal for a load test script), add logging so that all the key values relating to a particular transaction before the problem step are recorded for post-test analysis. Such information may reveal a key pattern in the failures. (A sketch of this sort of instrumentation also appears after this list.)
- Consider recommending a formal and methodical Tuning Cycle if the problems appear to be the sort that can be resolved by tuning. Tuning may involve running test scripts and scenarios, but it is very different to testing, as it has a different focus, different drivers and a different methodology.
- If the situation is really bad, and various groups are actively resolving problems, but fixes are several days or weeks away, then consider recommending a 'testing holiday'. This can reduce the 'noise' that can distract key resources.
- Work with relevant stakeholders to determine revised entry criteria for formal load testing, so that the original tests can be executed (and possibly re-assess the merits of the original plan, updating it as required).
- Consider proposing a regular management meeting, where relevant managers can assess the priorities for problem resolution, authorise the re-allocation of resources to address the most pressing issues, and remove any 'road blocks' that are frustrating progress.
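To make the 'simplify the tests' and 'lower targets' suggestions above more concrete, here is a minimal sketch of a cut-down, happy-path-only scenario written for Locust, a Python load testing tool. The endpoint paths, host, user counts and durations are illustrative assumptions only, not taken from any particular project.

```python
# A minimal 'happy path only' scenario, kept short enough to run inside an hour.
# Endpoint paths and the reduced user target are illustrative assumptions.
from locust import HttpUser, task, between


class HappyPathUser(HttpUser):
    # Pace each virtual user with a little 'think time' between requests.
    wait_time = between(1, 3)

    @task
    def main_happy_path(self):
        # Single core journey only - login, view, submit - no edge-case variants.
        self.client.get("/login")
        self.client.get("/account/summary")
        self.client.post("/orders", json={"item": "standard", "qty": 1})


# Run at a reduced target to quantify effective capacity, e.g. 70 users
# instead of a hypothetical 100-user target, for a one-hour window:
#   locust -f happy_path.py --headless -u 70 -r 5 --run-time 1h --host https://test-env.example.com
```

Because the scenario is deliberately small, it is cheap to maintain while the application is changing daily, and the user count can simply be stepped up as stability improves.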
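Similarly, the 'instrumentation' suggestion can be sketched as a task that captures the key values leading up to a failing step, so that failures can be correlated during post-test analysis. The field names, endpoint and CSV output below are hypothetical; the point is recording context, not just a pass/fail flag.

```python
# A sketch of per-transaction 'instrumentation': capture the key values that
# preceded a failing step so patterns can be found in post-test analysis.
# The field names and the /orders endpoint are illustrative assumptions.
import csv
import datetime

from locust import HttpUser, task, between


class InstrumentedUser(HttpUser):
    wait_time = between(1, 3)

    @task
    def submit_order(self):
        # Values accumulated on the way to the problem step.
        context = {
            "timestamp": datetime.datetime.now().isoformat(),
            "session_user": "load_user_42",   # hypothetical test identity
            "basket_size": 3,
            "payment_type": "invoice",
        }
        with self.client.post("/orders",
                              json={"items": 3, "payment": "invoice"},
                              catch_response=True) as response:
            if response.status_code != 200:
                # Record everything known just before the failure, not merely 'failed'.
                with open("failed_transactions.csv", "a", newline="") as f:
                    writer = csv.writer(f)
                    writer.writerow(list(context.values()) + [response.status_code])
                response.failure(f"Order submit failed: {response.status_code}")
            else:
                response.success()
```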
While some projects are 'foggy' because of poor processes and organisational immaturity, some are the result of a special CEO directive. The CEO's "Special Project" tends to crash through the very processes that were set up to ensure project success.
While those operating in the project domain, 'Waiting for the Application to be Ready', experience significant 'battle damage', the CEO is fighting at a whole different level. In the midst of that battle, the CEO is not particularly concerned about the 'precious processes' that are getting stretched out of shape - the CEO is concerned about positioning the organisation for success. I have been surprised to see such applications released into production, even after failing most of the NFRs, only to realise later that the CEO knew the 'shaky' application delivered something much bigger than response time and availability.
As the military theorist Carl von Clausewitz wrote in "On War" (in translation):
War is the realm of uncertainty; three quarters of the factors on which action in war is based are wrapped in a fog of greater or lesser uncertainty. A sensitive and discriminating judgment is called for; a skilled intelligence to scent out the truth.
And that is an interesting summary of the purpose of Load Testing - to 'scent out the truth' of the capability of an application, even when the key factors are 'wrapped in a fog of greater or lesser uncertainty'.