How to get a better understanding of your performance test results
Credits to Kevin Tonk for the animation


In my career as a performance engineer, one lesson has resonated with me more than any other: we, as performance engineers, should not base our analysis on results that are aggregated over time or built from averages. To truly understand our test results, we must look at the raw data.

In this article, I will dive deeper into what raw data is and why it is so essential. I will also discuss the downsides of averages and aggregated test results, show you how to export raw data from NeoLoad, and explain how to visualize that export. Please read along and share your thoughts and insights in the comments; your feedback is much appreciated.

What is raw data?

The term raw data is not always well understood within performance engineering and can seem a bit vague. What we mean by raw data is a data set that contains every single collected performance measurement. Usually, these test results can be exported from your load testing tool as a .csv file. Depending on the throughput of your performance test, a raw data export can be substantial, ranging from 1 MB to 100 MB or more.

So why is raw data so important for us performance engineers? Well, when you are trying to solve a mystery, you need all of the clues, or you are left guessing. The same goes for performance problems: to figure out your performance mystery, you need all the data you can get your hands on to help you crack the case and become the hero.

That’s why raw data is so crucial! Without access to the whole picture, we cannot draw accurate conclusions. That means collecting and looking at as many response time measurements and system resource metrics as we can, and storing them in their raw format for as long as possible.

Can we trust averages?

For me to solve a performance mystery, I need all the evidence. Sadly, many tools only offer graphs based on average response times. These are great for getting a rough understanding of how your test performed, or for sharing results with stakeholders.

But averages can be confusing and misleading when taken at face value, and they can lead many performance engineers down a rabbit hole by hiding outliers and response time patterns that could otherwise hint at performance issues. Averages deny you the reality of how the application you are testing is actually behaving. The best way to show the impact of averages is the animation below:

[Animation: the same test shown as an average response time graph and as a raw data view]

In the first part of the above animation, we see a graph of the average response time over time. This type of figure is common in the analysis section of many load testing tools. If this figure is all we are presented with, we are likely to assume that the system is quite stable.

But when the animation switches to the raw data view of the same test, we see entirely different behaviour in the response times that would lead us to conclude that the system is not as stable as the averages graph would want us to believe. By looking at our raw data, we can look past the lies of averages and unmask the actual culprit behind our performance mysteries.
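To make this concrete, here is a minimal Python sketch using assumed illustrative numbers (not data from a real test): a series in which 1% of requests are very slow barely moves the average, while the raw data exposes the outliers immediately.

```python
import statistics

# Hypothetical raw response times in seconds: 99% fast requests
# plus a handful of slow outliers (assumed illustrative data).
raw = [0.2] * 198 + [9.0] * 2

average = statistics.mean(raw)   # the number most tools graph
slowest = max(raw)               # visible only in the raw data

print(f"average: {average:.3f} s, slowest: {slowest:.1f} s")
```

The average sits well under a second and looks perfectly healthy, yet two users waited nine seconds; only the raw measurements reveal that.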

Only by getting a complete and comprehensive view of how our system is performing can we solve our performance issues. That does not mean averages are useless, not at all! Averages do a fantastic job of reflecting the overall performance of your test, so they are perfect for sharing with stakeholders. But basing your analysis solely on figures built from averages is too simplistic an approach.

Why aggregation over performance metrics makes little sense

There is no denying that one of the reasons people choose to aggregate performance metrics within a load testing tool or an APM solution is that raw data test results can take up a massive amount of space. Because your test results are significantly larger in their raw format, they can quickly slow down results loading times and make your test results harder to visualize.

By aggregating our test results into buckets of minutes, hours, days, weeks or even months, we lose the ability to look at our results at a more granular level. Aggregated test results therefore sadly succumb to the same pitfalls as averages, hiding outliers and patterns that would otherwise be incredibly useful to us. On top of that, aggregated results also make it harder to generate meaningful statistics such as percentiles.

[Image: percentiles calculated over aggregated results versus raw data]

When you feed aggregated results into, let’s say, a percentile calculation, many measurements have already been filtered out. To generate genuinely meaningful statistics like percentiles, you have to calculate them over your raw data; otherwise, you run the risk of calculating your metrics over a much smaller sample size.
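This effect is easy to demonstrate. The sketch below, again with assumed illustrative numbers, computes a 95th percentile once over raw samples and once over per-minute averages of the same samples; the aggregated view completely misses the slow requests.

```python
import statistics

# Hypothetical raw samples for a 5-minute window, 60 per minute:
# ~5% of requests each minute are slow (assumed illustrative data).
raw = []
for minute in range(5):
    raw += [0.3] * 57 + [4.0] * 3

# Aggregated view: one average per minute, as many tools store it.
per_minute_avg = [statistics.mean(raw[m * 60:(m + 1) * 60]) for m in range(5)]

# Simple nearest-rank 95th percentile over both data sets.
p95_raw = sorted(raw)[int(len(raw) * 0.95)]
p95_agg = sorted(per_minute_avg)[int(len(per_minute_avg) * 0.95)]

print(f"p95 over raw data:   {p95_raw} s")
print(f"p95 over aggregates: {p95_agg:.3f} s")
```

The percentile over the raw data lands on the 4-second outliers, while the percentile over the five per-minute averages reports roughly half a second: the slow requests have been averaged away before the percentile was ever calculated.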

Without raw data, aggregation can lead you to make poorly informed decisions and report the wrong metrics. I therefore believe that the value of using raw data far outweighs the challenges that considerably larger test results bring to the table. Aggregation makes it harder to see what happened at the exact moment a performance problem occurred during our test. This, in turn, keeps us from seeing the entire picture and misleads us on our quest to solve our performance issues.

How to get raw data out of NeoLoad

Hopefully, by now, I have convinced you why it is imperative to use the raw data format of your test results. In this section, I will guide you through collecting raw data from NeoLoad.

To do this, you need a basic understanding of how NeoLoad records raw data: NeoLoad only records raw data at the transaction level. Therefore, when scripting your performance test, keep in mind that everything you wish to measure needs to be inside a transaction. Within a NeoLoad design, this would look something like this:

[Image: a NeoLoad design with requests encapsulated in transactions]

It is therefore essential that your test is designed with the above image in mind, so that a transaction always encapsulates the requests you want to measure. So how can you export the raw data from NeoLoad? Within NeoLoad, it is quite easy to do! In the animation below, I take you through the steps needed to export your test results in the raw data format:

[Animation: exporting raw data test results from NeoLoad]

Now that you know how to export your raw data manually, it is helpful to know that you can also run your test from the command line and still export your raw data. To do this, you fire off a console command. The basic command looks like the following:

NeoLoadCmd.exe -project path/neoload_script SM -noGUI -exportRaw path/export.csv

The above command can be used on a Windows machine, though you might need to add some additional parameters; more information can be found in the NeoLoad documentation.

How to visualize raw data with powerful BI tools

Now you know how to export a CSV file out of NeoLoad containing all of the measurements you collected during your performance test. You might already be thinking to yourself: what am I going to do with just a flat CSV file? You are right; on its own, a flat CSV file is not much use. We need a solution that helps us create custom graphs quickly and lets us play around with our raw data in a way that promptly yields valuable insights.
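That said, even before reaching for a BI tool, Python's standard library is enough for a quick sanity check of such an export. The snippet below parses a tiny inline sample; the column names are an assumption on my part, so match them against the header of your actual export.

```python
import csv
import io

# A tiny inline sample standing in for a raw NeoLoad export.
# The column names here are assumptions; check your own export's header.
sample = """Elapsed,Transaction,Response time
0.0,Login,0.31
1.2,Login,0.28
2.4,Search,1.90
"""

rows = list(csv.DictReader(io.StringIO(sample)))
response_times = [float(row["Response time"]) for row in rows]

print(f"{len(rows)} measurements, slowest: {max(response_times)} s")
```

In practice you would pass an open file handle to `csv.DictReader` instead of the inline string; the rest of the code stays the same.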

The solution I use for this is Tableau. If you are unfamiliar with this tool, don't worry; it's an unconventional tool in a performance engineer's toolbox, so a quick explanation is more than deserved. Tableau is one of the world's leading business intelligence solutions, allowing its users to create powerful graphs and gain valuable insights into their data with minimal effort.

To show you what is possible with Tableau, I have made the animation below, which quickly shows how to make a simple scatter plot within Tableau.

[Animation: creating a simple scatter plot in Tableau]

Sadly, Tableau is not free, and a license fee is required before you can start using it. In my experience, though, the price point has never been much of a problem, because the value Tableau brings far outweighs the cost.

To get a taste of Tableau, I would highly recommend using their free 14-day trial to try it out yourself. If you need help getting started, you can find a plethora of videos on their website guiding you through the basics. However, Tableau is not the only BI tool on the market, and you can very well get the same benefits from solutions by other vendors.

Alternatively, you could look into using Python's Matplotlib library to create graphs from your raw data, or use the R programming language to visualize your raw data test results. There are numerous ways to achieve the same outcome as with fancy and somewhat expensive BI tools; it is simply a matter of getting creative with the options at your disposal.
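As a minimal sketch of the Matplotlib route, the code below plots one dot per raw measurement, using assumed synthetic data in place of a real export:

```python
import random

import matplotlib
matplotlib.use("Agg")  # render off-screen; no display needed
import matplotlib.pyplot as plt

# Hypothetical raw data (assumed, not a real export): one sample per
# second, with a slow outlier every 50th request.
random.seed(42)
elapsed = list(range(600))
response = [5.0 if i % 50 == 0 else 0.3 + random.random() * 0.2
            for i in elapsed]

fig, ax = plt.subplots()
ax.scatter(elapsed, response, s=4)  # one dot per raw measurement
ax.set_xlabel("Elapsed time (s)")
ax.set_ylabel("Response time (s)")
fig.savefig("raw_scatter.png", dpi=120)
```

Swap the synthetic lists for the columns parsed from your CSV export and you have the same kind of raw data scatter plot the Tableau animation produces.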

No silver bullet

With raw data, we see a far greater number of measurements in our graphs; this enables us to dive deeper into our test results and provide a more comprehensive analysis. In turn, all of this information can also be overwhelming. With proper tooling, we can guard ourselves against this, but it does not entirely eliminate the risk.

Scatter plots based on raw data are great for developers and for us to understand how the system is performing. But for stakeholders, these kinds of figures are way too complicated and provide an overload of information. It is therefore best to base your analysis on the raw data but share figures based on averages with your stakeholders.

This way, you have done an in-depth analysis on all of the data available to you, while still being able to share easily understandable graphs with your stakeholders, based on metrics like averages or percentiles.

Conclusion

Solving performance problems using raw data has become second nature to me. I believe it is one of the most important lessons I have learned as a performance engineer. To learn more about raw data, I recommend you also read Stijn Schepers' articles "Performance Testing: Act like a detective. Use Raw Data!" and "Performance testing is not an average game!".

Further good reads on the topic of raw data:

I am far from the only one talking about the importance of raw data for performance engineers; other amazing peers are writing and speaking on this topic more and more. For your reference, I have linked a few of them below:

Lets Talk About Averages by Stephen Townshend

Too dependent on percentiles? Read this. by Aashish Bajpai

Performance Testing: Act like a detective. Use Raw Data! by Stijn Schepers

Performance testing is not an average game! by Stijn Schepers
