Key Mathematical Principles for Performance Testers

Members of software development teams, developers, testers, administrators, and managers alike need to know how to apply mathematics and interpret statistical data in order to do their jobs effectively. Performance analysis and reporting are particularly math-intensive. This chapter describes the most commonly used, misapplied, and misunderstood mathematical and statistical concepts in performance testing, in a way that will benefit any member of the team.

Although these concepts matter, many software developers, testers, and managers either lack a strong background in mathematics and statistics or do not enjoy the subjects. The result is frequent misrepresentation and misinterpretation of performance-testing results.


Exemplar Data Sets

This chapter refers to three exemplar data sets for the purposes of illustration:

  • Data Set A
  • Data Set B
  • Data Set C


Data Sets Summary

The following is a summary of Data Sets A, B, and C.


Data Sets

Averages

An average, also known as an arithmetic mean or mean for short, is probably the most commonly used, and most commonly misunderstood, statistic of all. To calculate an average, add up all the values and divide the sum by the number of values. What confounds many people in performance testing is that very different data sets can share the same average: Data Sets A, B, and C each have an average of exactly 4, yet in terms of application response times they have extremely different meanings. Given a response time goal of 5 seconds, all three seem to meet the goal if you look only at the average. Looking at the data, however, shows that none of the data sets is composed only of data that meets the goal, and that Data Set B probably demonstrates some kind of performance anomaly.

Use caution when using averages to discuss response times and, if at all possible, avoid using averages as the only reported statistic. When reporting averages, it is a good idea to include the sample size, minimum value, maximum value, and standard deviation for the data set.
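As a minimal sketch of that reporting advice, the following Python snippet computes the average together with the sample size, minimum, maximum, and standard deviation. The function name summarize and both sample lists are hypothetical illustrations, not the chapter's Data Sets A, B, or C, and the population standard deviation is an assumed choice.

```python
import statistics

def summarize(samples):
    """Return the statistics suggested for reporting alongside an average."""
    return {
        "count": len(samples),
        "mean": statistics.mean(samples),
        "min": min(samples),
        "max": max(samples),
        # Population standard deviation; use statistics.stdev for a sample estimate.
        "stdev": statistics.pstdev(samples),
    }

# Hypothetical response times in seconds (not the chapter's data sets).
steady = [3, 4, 4, 4, 5]
spiky = [1, 1, 1, 1, 16]

for name, samples in (("steady", steady), ("spiky", spiky)):
    print(name, summarize(samples))
```

Both hypothetical lists average 4 seconds, but the maximum and standard deviation make the difference between them obvious.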

Percentiles

A percentile is a straightforward concept that is easier to demonstrate than define. For example, to find the 95th percentile value for a data set consisting of 100 page-response-time measurements, you would sort the measurements from largest to smallest and then count six data points from the top, starting with the largest. The 6th data point value represents the 95th percentile of those measurements, because 95 of the 100 measurements are at or below it. For the purposes of response times, this statistic is read “95 percent of the simulated users experienced a response time of [the 6th-slowest value] or less for this test scenario.”
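The counting procedure described above matches what is often called the nearest-rank method. The sketch below is one possible implementation under that assumption; the function name percentile_nearest_rank and the sample data are hypothetical, and other tools may interpolate between values instead.

```python
import math

def percentile_nearest_rank(samples, pct):
    """Return the smallest value with at least pct percent of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be greater than 0 and at most 100")
    ordered = sorted(samples)
    # 1-based rank counted from the smallest value.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]

# Hypothetical data: 100 measurements of 1 to 100 seconds.
measurements = list(range(1, 101))
p95 = percentile_nearest_rank(measurements, 95)

# Counting six values down from the largest gives the same answer.
sixth_slowest = sorted(measurements, reverse=True)[5]
print(p95, sixth_slowest)
```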

It is important to note that percentile statistics can only stand alone when used to represent data that is uniformly or normally distributed with an acceptable number of outliers. To illustrate this point, consider the exemplar data sets. The 95th percentile of Data Set B is 16 seconds. Obviously, this does not give the impression of achieving the 5-second response time goal. That figure can be misleading in the other direction as well, because the 80th percentile value of Data Set B is 1 second. With a response time goal of 5 seconds, it is likely unacceptable to have any response times of 16 seconds, so in this case neither of these percentile values represents the data in a manner that is useful for summarizing response time.

Data Set A is a normally distributed data set that has a 95th percentile value of 6 seconds, an 85th percentile value of 5 seconds, and a maximum value of 7 seconds. In this case, reporting either the 85th or the 95th percentile value represents the data well: the assumptions a stakeholder is likely to make from either number are likely to hold for the data.

Medians

A median is simply the middle value in a data set when sorted from lowest to highest. In cases where there is an even number of data points and the two center values are not the same, some disciplines suggest that the median is the average of the two center data points, while others suggest choosing the center value that is closer to the average of the entire set of data. In the case of the exemplar data sets, Data Sets A and B have median values of 4, and Data Set C has a median value of 1.
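The two conventions for an even number of data points can be sketched in Python as follows. The helper median_closer_to_mean and the sample list are hypothetical; statistics.median, median_low, and median_high are standard-library functions.

```python
import statistics

def median_closer_to_mean(samples):
    """Pick whichever of the two center values lies closer to the overall mean."""
    low = statistics.median_low(samples)
    high = statistics.median_high(samples)
    mean = statistics.mean(samples)
    return low if abs(low - mean) <= abs(high - mean) else high

# Hypothetical data with an even number of points (not the chapter's data sets).
samples = [1, 2, 4, 9]

averaged = statistics.median(samples)      # average of the two center values
closest = median_closer_to_mean(samples)   # center value closer to the mean
print(averaged, closest)
```

For this list the two conventions disagree, which is why a report should say which one it uses.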

Normal Values

A normal value, which statisticians call the mode, is the single value that occurs most often in a data set. Data Set A has a normal value of 4, Data Set B has a normal value of 3, and Data Set C has a normal value of 1.
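A short sketch of finding the most frequent value in Python, using hypothetical data rather than the chapter's data sets. It uses statistics.multimode (Python 3.8 or later) because a data set can have more than one most-frequent value.

```python
import statistics

# Hypothetical response times in seconds.
single_peak = [1, 1, 1, 4, 16]
two_peaks = [2, 2, 5, 5, 9]

# multimode returns every value tied for most frequent, in first-seen order.
modes_single = statistics.multimode(single_peak)
modes_double = statistics.multimode(two_peaks)
print(modes_single, modes_double)
```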

Standard Deviations

For normally distributed data, approximately 68 percent of all measurements in the data set fall within one standard deviation of the mean; in other words, knowing the standard deviation of your data set tells you how densely the data points are clustered around the mean. Simply put, the smaller the standard deviation, the more consistent the data. To illustrate, the standard deviation of Data Set A is approximately 1.5, the standard deviation of Data Set B is approximately 6.0, and the standard deviation of Data Set C is approximately 2.6.

A common rule in this case is: “Data with a standard deviation greater than half of its mean should be treated as suspect. If the data is accurate, the phenomenon the data represents is not displaying a normal distribution pattern.” Applying this rule, Data Set A is likely to be a reasonable example of a normal distribution; Data Set B may or may not be a reasonable representation of a normal distribution; and Data Set C is undoubtedly not a reasonable representation of a normal distribution.
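The rule quoted above can be checked mechanically. This is a minimal sketch assuming the population standard deviation and hypothetical sample data; the function name is_suspect is illustrative only.

```python
import statistics

def is_suspect(samples):
    """Apply the rule of thumb: standard deviation greater than half the mean."""
    mean = statistics.mean(samples)
    stdev = statistics.pstdev(samples)  # assumed choice: population standard deviation
    return stdev > mean / 2

# Hypothetical response times in seconds (not the chapter's data sets).
steady = [3, 4, 4, 4, 5]   # mean 4, standard deviation about 0.63
spiky = [1, 1, 1, 1, 16]   # mean 4, standard deviation 6.0

print(is_suspect(steady), is_suspect(spiky))
```

A flagged data set is not necessarily wrong; the rule only signals that the data deserves a closer look before it is summarized.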

Uniform Distributions

Uniform distributions ― sometimes known as linear distributions ― represent a collection of data that is roughly equivalent to a set of random numbers evenly spaced between the upper and lower bounds. In a uniform distribution, every number in the data set is represented approximately the same number of times. Uniform distributions are frequently used when modeling user delays, but are not common in response time results data. In fact, uniformly distributed results in response time data may be an indication of suspect results.


Figure 2
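As an illustration of using a uniform distribution to model user delays, the sketch below draws think times evenly between two bounds. The function name think_times, the 2 to 8 second bounds, and the seed are assumptions chosen for the example.

```python
import random

def think_times(count, low, high, seed=None):
    """Draw user delays that are uniformly distributed between low and high seconds."""
    rng = random.Random(seed)  # a private generator keeps runs reproducible
    return [rng.uniform(low, high) for _ in range(count)]

# Hypothetical model: users pause between 2 and 8 seconds between requests.
delays = think_times(10_000, 2.0, 8.0, seed=42)
average_delay = sum(delays) / len(delays)
print(min(delays), max(delays), average_delay)
```

With enough draws, the average settles near the midpoint of the bounds and every sub-range of equal width receives roughly the same number of values.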

Normal Distributions

Also known as bell curves, normal distributions are data sets whose member data are weighted toward the center (or median value). When graphed, the shape of the “bell” of normally distributed data can vary from tall and narrow to short and squat, depending on the standard deviation of the data set. The smaller the standard deviation, the taller and narrower the “bell.” Statistically speaking, most measurements of human variance result in data sets that are normally distributed. As it turns out, end-user response times for Web applications are also frequently normally distributed.


Figure 3
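The relationship between standard deviation and the shape of the bell can be sketched with Python's statistics.NormalDist (Python 3.8 or later). The mean of 4 and the two standard deviations are hypothetical values chosen for illustration.

```python
from statistics import NormalDist

# Two hypothetical normal distributions with the same mean, in seconds.
narrow = NormalDist(mu=4, sigma=0.5)
wide = NormalDist(mu=4, sigma=1.5)

# The curve's height at the mean is larger when the standard deviation is smaller.
peak_narrow = narrow.pdf(4)
peak_wide = wide.pdf(4)

# Share of values within one standard deviation of the mean (about 68 percent).
within_one_sigma = wide.cdf(4 + 1.5) - wide.cdf(4 - 1.5)
print(peak_narrow, peak_wide, within_one_sigma)
```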

Additional Considerations

For results from multiple test executions to be consolidated, both the test and the test environment must be identical, and the test results must be statistically equivalent. One approach to determining whether results are similar enough to be consolidated is to compare results from at least five test executions and apply the following rules:

  • If more than 20 percent (or one out of five) of the test execution results appear not to be similar to the rest, something is generally wrong with the test environment, the application, or the test itself.
  • If a 95th percentile value for any test execution is greater than the maximum or less than the minimum value for any of the other test executions, it is not statistically similar.
  • If every page/timer result in a test execution is noticeably higher or lower on the chart than the results of all the rest of the test executions, it is not statistically similar.
  • If a single page/timer result in a test execution is noticeably higher or lower on the chart than all the rest of the test execution results, but the results for all the rest of the pages/timers in that test execution are not, the test executions are probably statistically similar.
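Two of the rules above, the 95th percentile comparison and the 20 percent threshold, lend themselves to an automated check; the chart-based rules still need a human eye. The sketch below assumes a nearest-rank 95th percentile, and the function names and the five hypothetical test executions are illustrative only.

```python
import math

def p95(samples):
    """Nearest-rank 95th percentile of one test execution."""
    ordered = sorted(samples)
    return ordered[math.ceil(95 * len(ordered) / 100) - 1]

def dissimilar_runs(runs):
    """Return indexes of runs whose 95th percentile falls outside another run's range."""
    flagged = []
    for i, run in enumerate(runs):
        value = p95(run)
        others = [r for j, r in enumerate(runs) if j != i]
        if any(value > max(o) or value < min(o) for o in others):
            flagged.append(i)
    return flagged

# Five hypothetical test executions (response times in seconds).
runs = [
    [2, 3, 3, 4, 4, 5, 6],
    [2, 3, 4, 4, 4, 5, 6],
    [3, 3, 3, 4, 5, 5, 6],
    [2, 2, 3, 4, 4, 5, 6],
    [3, 5, 7, 8, 9, 10, 12],  # a slower execution
]

flagged = dissimilar_runs(runs)
# More than 20 percent dissimilar suggests a problem with the test or environment.
investigate = len(flagged) / len(runs) > 0.20
print(flagged, investigate)
```

Here only the last execution is flagged, which is exactly one in five, so the remaining four could be considered for consolidation.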

Conclusion

Members of software development teams, developers, testers, administrators, and managers alike need to know how to apply mathematics and interpret statistical data in order to do their jobs effectively. Performance analysis and reporting are particularly math-intensive. It is critical that mathematical and statistical concepts in performance testing be understood so that correct performance-testing analysis and reporting can be done.


