Analyzing Performance Test Results - System Level Performance
RadhaKrishna Prasad
Performance Engineering SME | SRE | Corporate Trainer - Performance Engineering | Cloud Performance Testing | Chaos and Resilience | Observability | DevOps
Analyzing Performance Test Results:
System-level Performance:
At this stage of the analysis, we focus mainly on network, server, CPU utilization, and memory issues at the different layers of the architecture to identify performance problems.
CPU Issues:
The analysis focuses on the level of processor utilization on the Web/App and DB servers. One way of detecting CPU degradation is to check whether CPU utilization spikes as the number of jobs to be processed increases. Using SiteScope graph data, combine CPU Utilization with the Processor Queue. If the processor queue builds up while CPU utilization stays at a high level, there is a performance problem at that layer of the application.
Using the SiteScope graph, monitor the CPU idle time; when the CPU is 0% idle, it is 100% utilized. Correlate CPU Utilization with Response Time and check whether the response time spikes along with CPU utilization.
Correlate CPU Utilization with component-level metrics (such as thread count, connections, and memory utilization) to identify the cause of high CPU.
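The correlation steps above can be sketched in a few lines of Python. This is only an illustration: the sample values, the 85% CPU limit, and the queue limit of 2 are assumptions chosen for the example, not values taken from SiteScope or any other tool.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series has no meaningful correlation
    return cov / math.sqrt(var_x * var_y)

def saturated_samples(cpu_util, queue_len, cpu_limit=85.0, queue_limit=2.0):
    """Indexes where CPU is high AND the processor queue is building up.

    cpu_limit and queue_limit are assumed thresholds; tune them per system.
    """
    return [i for i, (cpu, queue) in enumerate(zip(cpu_util, queue_len))
            if cpu >= cpu_limit and queue > queue_limit]

# Hypothetical samples exported from the monitoring graphs, one per interval.
cpu_util = [35, 48, 62, 78, 91, 97]          # CPU utilization in %
queue_len = [0, 0, 1, 2, 5, 9]               # processor queue length
resp_time = [0.8, 0.9, 1.1, 1.6, 3.2, 5.4]   # response time in seconds

cpu_vs_response = pearson(cpu_util, resp_time)
hot_spots = saturated_samples(cpu_util, queue_len)
print(f"CPU vs response time correlation: {cpu_vs_response:.2f}")
print(f"Intervals with high CPU and a growing queue: {hot_spots}")
```

A correlation close to 1 together with a non-empty list of saturated intervals would support the conclusion that the CPU is the bottleneck at that layer. The same `pearson` helper can be reused against thread count, connections, or memory utilization.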
Bandwidth Issues:
These issues arise when the network between the source and destination systems is fully utilized. The following network graphs can help resolve bandwidth issues.
· Network Delay Time graph - Shows the complete delay between the source and destination path.
· Network Sub-Path Time graph - Shows the delay from the source machine to each node on the path.
· Network Segment Delay Time graph - Shows the delay in each segment of the path itself.
While analyzing network bandwidth, check whether throughput increases along with response time; if it does, there may be an issue with the network bandwidth. Use the “Time to First Buffer” graph to compare network utilization against server utilization. If the connection time is high and the Time to First Buffer is low, it is a clear indication that the network is the cause of the poor performance.
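The connection time versus Time to First Buffer rule of thumb can be written as a small check. This is a minimal sketch: the function name, the millisecond thresholds, and the sample values are hypothetical and would need to be calibrated against a baseline run.

```python
def network_suspected(connection_ms, first_buffer_ms,
                      conn_limit_ms=200.0, first_buffer_limit_ms=500.0):
    """Apply the rule of thumb from the text to one measurement.

    Returns True when connection time is high while Time to First Buffer
    is low. Both limits are assumed values, not tool defaults.
    """
    return connection_ms > conn_limit_ms and first_buffer_ms <= first_buffer_limit_ms

# Hypothetical (connection time, time to first buffer) pairs in milliseconds.
samples = [(40, 120), (350, 180), (420, 210), (60, 900)]

flags = [network_suspected(conn, ttfb) for conn, ttfb in samples]
share = sum(flags) / len(flags)
print(f"Samples pointing at the network: {flags} ({share:.0%})")
```

If a large share of samples is flagged, the network graphs listed above are the next place to look.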
Load balancing issues:
A symptom of load balancing issues is virtual users experiencing fluctuations in response time. VUsers from one host may see low response times while others see high response times. This happens because the load from that host is not balanced across the respective Web/App/DB servers.
While analyzing load balancing issues, confirm the following:
· Check whether the CPU utilization of a server is very high or very low compared to the others.
· Check whether the number of connections a server is handling is too high or too low compared to the others.
· Compare the Web/App/DB server CPU, connection, and thread counts from SiteScope and the Web/App/DB server monitors.
Correlate the “Running VUsers” with the connections the various servers are handling at that point in time. As the VUsers ramp up, the number of connections each server handles should also ramp up; that is the expected behaviour.
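The comparison between servers can be automated once the per-server numbers are exported. The sketch below flags servers whose metric deviates from the median of the group; the server names, the sample values, and the 25% tolerance are assumptions for illustration.

```python
from statistics import median

def unbalanced_servers(metric_by_server, tolerance=0.25):
    """Return servers whose metric deviates from the group median.

    The result maps server name to relative deviation (for example -0.5
    means 50% below the median). tolerance is an assumed cut-off.
    """
    mid = median(metric_by_server.values())
    if mid == 0:
        return {}  # nothing to compare against
    deviations = {name: (value - mid) / mid
                  for name, value in metric_by_server.items()}
    return {name: dev for name, dev in deviations.items()
            if abs(dev) > tolerance}

# Hypothetical connection counts per application server at peak load.
connections = {"app1": 400, "app2": 410, "app3": 90}

flagged = unbalanced_servers(connections)
for name, dev in flagged.items():
    print(f"{name} is {dev:+.0%} away from the median connection count")
```

The median is used here rather than the mean so that a single starved or overloaded server does not shift the reference point for the others. The same function can be applied to CPU utilization or thread counts.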
Connection Issues:
When the total number of connections to the server reaches the upper limit of the web server, we may face connection errors. On the web server, look for the max-connections or server connections setting, change it to a higher number, and restart the server. The symptom in the graph is that the web server connections stop increasing beyond the allocated max connections.
Correlate the “Number of VUsers” (Running VUsers graph) with the “Number of Errors”; once the number of VUsers goes beyond the threshold, the connection errors start to spike.
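Finding the VUser level at which errors begin can be sketched as follows. The data and the `min_errors` cut-off are hypothetical; real values would come from the Running VUsers and error graphs for the same time intervals.

```python
def error_onset(vusers, errors, min_errors=1):
    """Return the first VUser count at which errors reach min_errors.

    Returns None when the error count never reaches the cut-off.
    """
    for users, err in zip(vusers, errors):
        if err >= min_errors:
            return users
    return None

# Hypothetical samples taken at the same intervals during ramp-up.
vusers = [50, 100, 150, 200, 250]
errors = [0, 0, 0, 12, 40]

threshold = error_onset(vusers, errors)
print(f"Connection errors start at about {threshold} VUsers")
```

Comparing this VUser level with the configured connection limit helps confirm whether the limit is what the errors are hitting.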
Threading Issues:
Response times increase as the number of thread connections to the Database/App server grows beyond the limit configured on the server. In the corresponding server resource utilization graph, look at the monitored connection thread pool; if it does not increase beyond a specific level, that is a clear indication that the connection thread pool setting should be increased.
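A pool that stops growing while the load keeps rising can be detected with a simple scan. This is a sketch under assumptions: the series are hypothetical, and requiring three consecutive flat samples is an arbitrary choice.

```python
def plateau_level(load, pool_size, min_samples=3):
    """Return the pool size that stays flat while load keeps rising.

    A plateau is min_samples consecutive equal pool readings during
    which the load increased at every step. Returns None otherwise.
    """
    run = 1
    for i in range(1, len(pool_size)):
        if pool_size[i] == pool_size[i - 1] and load[i] > load[i - 1]:
            run += 1
            if run >= min_samples:
                return pool_size[i]
        else:
            run = 1  # pool changed or load stopped rising; start over
    return None

# Hypothetical readings: running VUsers and active pool threads.
load = [50, 100, 150, 200, 250, 300]
pool = [20, 40, 50, 50, 50, 50]

ceiling = plateau_level(load, pool)
print(f"Thread pool appears capped at {ceiling}")
```

If the reported ceiling matches the configured pool maximum, the setting is a likely candidate for tuning.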
JVM Memory Issues:
When the response time fluctuates from high to low and back to high, it is an indication of a memory issue. Look at the JVM size and the heap utilization, and correlate the response time with heap utilization to analyze the impact. Heap utilization comes down when garbage collection runs.
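The heap versus response time relationship can be examined with a short script. The sketch assumes heap utilization and response time were sampled at the same intervals; the values, the 20-point drop used to spot a garbage collection, and the 80% "high heap" line are all assumptions.

```python
def gc_drop_indexes(heap_pct, min_drop=20.0):
    """Indexes where heap utilization falls sharply versus the previous
    sample, which is treated here as a sign of a garbage collection."""
    return [i for i in range(1, len(heap_pct))
            if heap_pct[i - 1] - heap_pct[i] >= min_drop]

def mean_response(resp, heap_pct, high_heap=80.0):
    """Mean response time for samples at or above high_heap and below it."""
    high = [r for r, h in zip(resp, heap_pct) if h >= high_heap]
    low = [r for r, h in zip(resp, heap_pct) if h < high_heap]
    avg = lambda values: sum(values) / len(values) if values else 0.0
    return avg(high), avg(low)

# Hypothetical samples: heap utilization in % and response time in seconds.
heap = [40, 60, 80, 92, 35, 55, 78, 90, 30]
resp = [1.0, 1.1, 1.8, 3.5, 1.0, 1.1, 1.7, 3.2, 0.9]

gc_points = gc_drop_indexes(heap)
high_avg, low_avg = mean_response(resp, heap)
print(f"Likely GC at samples {gc_points}")
print(f"Mean response: {high_avg:.2f}s at high heap, {low_avg:.2f}s otherwise")
```

A clearly higher mean response time at high heap, with recovery right after each drop, matches the high-low-high pattern described above.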
This article is limited to my experience. Depending on the architecture and the tools used for performance testing, performance issues can have many other causes, such as poorly written code, unoptimized databases, traffic spikes, and poor load distribution. It is important to remember that once you have identified the performance issues, you must eliminate them before they become a real pain for users. Every application is unique, and its issues are different in their own way.
One last point to note:
Most importantly, the faster the product/application, the more revenue it will generate.
Thanks for reading the article! Happy Learning:)