Green Light or Red Alert? How do your performance test targets look?

Writing test requirements can be a fairly easy process. We examine the business requirements and write some form of criteria to check whether each one has been met. The test itself is usually a black-or-white decision: it either meets the requirement or it does not. Pass or fail. Simple! Well, maybe not quite that simple (okay, it's a lot more complicated than that), but in general, that's the process.

Performance testing is no different. When things are done correctly, there should be specific, measurable objectives for each of the performance requirements. One of these might start off looking something like this:

  • Response time for transaction x must not exceed two seconds.

That might look perfectly reasonable to end users when they're asked to write performance requirements. End users generally think in terms of single-user response times; as a user, that's what they would expect from the system. From a performance perspective, though, there are some pretty major gaps here. Performance testers think in terms of multi-user response times under differing loads. So, with a bit of coaching, we might get them to rework their requirement into something like:

  • At 100% user and transaction load, transaction x response time must not exceed 2.00 seconds.

That's better, but there's still room for improvement. What does that two-second response actually mean? Does it apply to all responses? Is it an average response time, or the worst case out of ten? So (let's not get into debating averages versus percentiles here), let's take this to the next level and add some more performance-related criteria:

  • At 100% user and transaction load, transaction x response times must not exceed 2.00 seconds at the 90th percentile, with not more than a 1% failure rate.

Now we're getting somewhere. This is specific and measurable, and it reduces ambiguity to something with quantifiable performance test criteria written all over it. Admittedly there are a few more conditions we could throw in, but by now you should be able to see that if we tested the system under load, we could tell whether it passes the old binary pass/fail criteria, and that this version is far more relevant to a performance test than the original requirement.
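
As an illustration (my own sketch, not from the original article), here's a minimal Python example of how that requirement could be checked against a set of measured samples. The sample data, variable names, and thresholds are hypothetical; in practice a load test tool would supply the measurements.

```python
from statistics import quantiles

# Hypothetical samples for transaction x: (response_time_seconds, succeeded)
samples = [(1.2, True), (1.8, True), (2.4, True), (1.1, True),
           (1.6, True), (1.9, True), (3.0, False), (1.4, True),
           (1.7, True), (1.5, True)]

TARGET_SECONDS = 2.00      # 90th percentile must not exceed this
MAX_FAILURE_RATE = 0.01    # no more than 1% of requests may fail

times = [t for t, _ in samples]
# quantiles(n=10) returns the nine decile cut points; index 8 is the 90th percentile
p90 = quantiles(times, n=10)[8]
failure_rate = sum(1 for _, ok in samples if not ok) / len(samples)

meets_requirement = p90 <= TARGET_SECONDS and failure_rate <= MAX_FAILURE_RATE
print(f"90th percentile: {p90:.2f}s, failure rate: {failure_rate:.1%}, "
      f"{'PASS' if meets_requirement else 'FAIL'}")
```

With the invented figures above the check fails, but the point is the mechanism: the requirement is now expressed in terms a script can evaluate unambiguously.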

Unfortunately, in the world of performance testing, things are rarely as simple as black or white, pass or fail. Just like old-school photography, it's the myriad shades of grey that complete the picture and tell the whole story: an image in pure black and white shows far less detail than one with shades of grey. How we capture and interpret those in-between shades is what makes the difference.

I find the best and easiest way to interpret in-between conditions is a traffic light system (and this is where my black-and-white analogy fails), where green is good, amber means proceed with caution, and red is an obvious failure. You can use however many shades you like; the concept stays the same as long as you set simple boundary conditions around where the colour changes.

Using the previous example, where the target transaction response time was 2.00 seconds, I might ask the business to rework their requirement along these lines: if 9 out of 10 responses meet the two-second target, that's good (Green); if only 8 out of 10 achieve the target, it's okay but not great (Amber); and if fewer than 8 out of 10 meet the target, or the error rate exceeds 1%, that's definitely bad (Red).

So the original requirement hasn't changed:

  • At 100% user and transaction load, transaction x response times must not exceed 2.00 seconds at the 90th percentile, with not more than a 1% failure rate.

The requirement (90% of responses must not exceed 2.00 seconds) is still the same, but how it is reported and interpreted will differ. I usually map this (and all the other performance test criteria) in my performance test plan something like this:

  • Green – at least 9 out of 10 responses within 2.00 seconds, and an error rate of no more than 1%
  • Amber – at least 8 out of 10 responses within 2.00 seconds, and an error rate of no more than 1%
  • Red – fewer than 8 out of 10 responses within 2.00 seconds, or an error rate greater than 1%
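
To make that mapping concrete, here's a small illustrative sketch (again mine, not the article's) of how those boundary conditions might be expressed in code. The function name and argument names are assumptions.

```python
def traffic_light(proportion_within_target: float, error_rate: float) -> str:
    """Classify a transaction's result using the boundary conditions above.

    proportion_within_target: fraction of responses at or under 2.00 seconds
    error_rate: fraction of requests that failed
    """
    if error_rate > 0.01 or proportion_within_target < 0.80:
        return "Red"     # definitely bad
    if proportion_within_target < 0.90:
        return "Amber"   # okay, but not great
    return "Green"       # meets the 90th percentile target

# Example: 85% of responses within 2.00s and 0.5% errors -> Amber
print(traffic_light(0.85, 0.005))
```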

I find that using this kind of Green/Amber/Red reporting against measurable performance test criteria helps enormously when results are presented back. Simply reporting a list of passed and failed criteria often adds little value on projects where performance test results are subject to interpretation.

For example, results presented as just pass or fail, like this:

  [Table: results reported as simple pass/fail per test criterion]

are a lot harder to interpret than the same data presented in traffic light form:

  [Table: the same results reported as Green/Amber/Red]

Both tables report exactly the same result set.
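
As a rough illustration of the difference (the transaction names and figures below are invented, not the article's data), the same hypothetical result set could be rendered both ways like this:

```python
# Hypothetical per-transaction results: (name, fraction within 2.00s, error rate)
results = [("Login", 0.95, 0.002), ("Search", 0.88, 0.004), ("Checkout", 0.72, 0.015)]

def traffic_light(within_target, error_rate):
    # Same boundary conditions as the sketch above
    if error_rate > 0.01 or within_target < 0.80:
        return "Red"
    return "Amber" if within_target < 0.90 else "Green"

for name, within, errors in results:
    passed = within >= 0.90 and errors <= 0.01
    print(f"{name:10} pass/fail: {'PASS' if passed else 'FAIL'}   "
          f"traffic light: {traffic_light(within, errors)}")
```

In the pass/fail view, Search and Checkout look equally bad; the traffic light view shows that only Checkout is a clear failure while Search is merely borderline.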

Using the traffic light approach really helps focus attention on results that are definitely bad and cannot be interpreted as anything else. Amber areas with occasional red highlights paint a very different picture from what might otherwise appear to be a sea of red. I find it much easier to home in on true performance issues without the distraction of things that are borderline and (usually argued to be) acceptable.

Let me know what you think. Do you use the traffic light system or something else? What works best for you?
