Performance Testing - are you peering into darkness?
Within big IT projects, performance tests often run with few infrastructure monitors in place, if any. It's not unusual for tests to rely solely on transaction response times to determine system performance. There's a familiar mantra: not enough time, security issues, no test accounts, budgetary constraints and limited monitor availability all stand in the way of appropriate infrastructure monitoring.
Never mind though, the project must go on! The test measures application speed with end-to-end "transactions" and, even without monitors, it should surely be clear where performance issues lie.
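For context, here's a minimal sketch of what that response-time-only view typically amounts to. The endpoint URLs and transaction names below are invented for illustration, and a real test would use a load-test tool rather than a script, but the data captured is essentially the same: start a timer, run the transaction, stop the timer.

```python
# Minimal sketch (hypothetical endpoints and transaction names): timing
# end-to-end transactions and reporting percentiles - the only data many
# performance tests ever capture.
import statistics
import time
import urllib.request

TRANSACTIONS = {
    "login":  "https://app.example.com/login",          # hypothetical URL
    "search": "https://app.example.com/search?q=test",  # hypothetical URL
}

def run_transaction(url: str) -> float:
    """Return the end-to-end response time in seconds for one request."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=30) as resp:
        resp.read()  # drain the body so the full response is timed
    return time.perf_counter() - start

def main(iterations: int = 50) -> None:
    for name, url in TRANSACTIONS.items():
        times = sorted(run_transaction(url) for _ in range(iterations))
        p90 = times[int(0.9 * len(times)) - 1]  # rough 90th percentile
        print(f"{name}: median={statistics.median(times):.3f}s p90={p90:.3f}s")

if __name__ == "__main__":
    main()
```

Nothing in that output says *why* a transaction is slow; it only says that it is.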
When these tests find some transactions perform worse than others, or don't scale well, Test Managers wail, Project Managers yell, and dev/support staff are poked and cajoled until they tweak and rejig things enough to improve those end-to-end transaction times. They look into and fix up inefficient code or queries where they can, change configuration options from their default values, and the project goes live. KPIs met, champagne corks pop, bonuses are paid into offshore accounts and everyone's happy. Yay.
Imagine though, that instead of an IT project, we took the same approach to performance-test something in the real world: say, public-transport commutes to work. We'd measure door-to-door time from home to the office for a few different staff members. With enough samples, we should be able to work out which routes perform worst, and what conditions affect them.
Peak hour might be an issue for some, while extreme circumstances (a train drivers' strike, say) might see other commuters "time out" and give up before completing their journey.
So the "test" side of things kind of works, just like the IT project, but to determine how to improve performance we need to understand things in more detail than simply end-to-end response times.
An IT-like response to tweak and fine-tune poor-performing routes might be to improve commuter efficiency (get them to wear running shoes, leave their bag at home and run to the train station), tune or add to the hardware (install high-speed train tracks) and maybe streamline some of the process (don't pick up coffee on the way).
But without monitoring each step along the way and the extra information it provides, we'd have little idea what the biggest contributing factors are, and we'd be fumbling in the dark with educated guesses.
Some monitors that might help:
- A heart monitor. This might suggest our commuters are already operating near capacity, so running to the station might prove problematic, or even catastrophic for some.
- A train speed monitor. This might show top speed is never reached on the poor-performing routes, and that the train often sits idle, waiting for a clear platform or track ahead, so high-speed tracks aren't warranted.
- Coffee-shop queue monitoring. This might show the coffee pickup takes minimal time in the grand scheme of things and doesn't contribute to poor performance under load; almost every commuter does it and most still get to work on time.
The more monitors and the more specific information we have about each step - infrastructure performance under load, capacity and usage levels, etc. - the better informed we are, and the better placed to determine what really drives performance on the journey to work.
Back in the IT world, this of course relates to system monitors for infrastructure components. These are often already in place on shared components, even if they aren't made available to the performance test itself, so their information is never correlated with performance test events or analysed effectively and comprehensively. A traditional silo mentality prevails across hardware, networks, databases, applications, and test and production domains. This is generally where the failing is: often the right information is being monitored, but it isn't made available for analysis and synthesis alongside the performance test itself, so cause and effect can't be correlated across all of the supporting infrastructure.
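When that data is shared, the correlation itself doesn't need to be elaborate. The sketch below assumes two hypothetical exports, a transaction log from the load-test tool and a CPU sample file from a shared database server, and simply lines them up by timestamp; the file names and column names are invented for illustration.

```python
# Illustrative sketch (hypothetical CSV files and column names): lining up
# infrastructure monitor samples with performance-test transactions by
# timestamp, so cause and effect can be inspected side by side.
import pandas as pd

# Transaction log from the load-test tool: timestamp, transaction, response_time_s
txns = pd.read_csv("transactions.csv", parse_dates=["timestamp"])

# Monitor export from a shared component, e.g. the DB server: timestamp, cpu_pct
cpu = pd.read_csv("db_cpu.csv", parse_dates=["timestamp"])

# merge_asof requires both sides sorted on the key; it attaches the
# nearest-preceding CPU sample to each transaction record.
txns = txns.sort_values("timestamp")
cpu = cpu.sort_values("timestamp")
joined = pd.merge_asof(txns, cpu, on="timestamp", direction="backward")

# First-cut question: do response times climb as DB CPU climbs?
print("CPU vs response-time correlation:",
      joined["cpu_pct"].corr(joined["response_time_s"]))
```

Even a crude join like this turns "the search transaction is slow" into "the search transaction is slow while the database server CPU is pegged", which is something the silos can actually act on.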
Performance Testing without monitoring relies on extensive deductive reasoning - sometimes the theories and remediation actions are spot on. Sometimes they're not. It requires inspired subject matter experts, more work than it really should, and often a large slice of luck.
There are specialist tools available to help collate, manage and analyse large amounts of disparate monitored data, but these can only be used if the information is available and shared.
It's imperative that infrastructure monitors are turned on, and that their results and the analysis tools are available to the test effort. Otherwise we're all just peering into darkness.
#performance #testing #monitoring #load #infrastructure #NFT #dark #analysis #end-to-end #performancetesting