Navigating the Cloud: Early Performance Validation for a Successful Migration - Part VI
Vijayanathan Naganathan
Tech Co-Founder | Driving QE Innovation for Growth-Stage Companies | Customer Success Leader | IIM Kozhikode Alumnus
Hello All!
So far in this series, we have covered performance requirements, strategy, and scripting. In this part, let us look at test execution, monitoring, and results interpretation.
To begin with, let us focus on a few aspects of setting up the performance test environment. It is important to understand how the application is intended to be set up in terms of technical architecture and deployment architecture, and from where and how end users will access the #applications. This understanding helps in setting up the load simulation infrastructure in the cloud: whether ports need to be opened, and from which regions the load should be simulated.
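To make this concrete, here is a minimal Python sketch of how such a multi-region simulation plan might be captured before provisioning load generators; the region names, user counts, and ports are purely illustrative assumptions:

```python
# Hypothetical load-simulation plan: regions, user share per region, and the
# ports that must be reachable from each load generator. Values illustrative.
LOAD_PLAN = {
    "us-east-1":      {"users": 400, "ports": [443]},
    "eu-west-1":      {"users": 350, "ports": [443]},
    "ap-southeast-1": {"users": 250, "ports": [443]},
}

def total_users(plan):
    """Sum the virtual users injected across all regions."""
    return sum(region["users"] for region in plan.values())

if __name__ == "__main__":
    print(f"Total simulated users: {total_users(LOAD_PLAN)}")  # 1000
```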
Once the test setup is complete from a simulation perspective, perform sanity runs to ensure load is injected, monitoring systems capture the server-side metrics, and logs are accessible. It is important to ensure the right logging modes are set up on the application side: info mode during #performancetesting and debug mode only during troubleshooting, as debug mode tends to generate far more logs and adds overhead that distorts performance. It is also worth noting that the right kind of monitoring, intrusive or non-intrusive, should be chosen based on project needs, after assessing the pros and cons of each solution.
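As a simple illustration of the logging point, here is a minimal Python sketch that switches log levels via an environment variable; the LOG_MODE variable name is a hypothetical choice:

```python
import logging
import os

# Sketch: pick the application log level from an environment variable so the
# same build can run in "info" mode during load tests and in the chattier
# "debug" mode only while troubleshooting.
LOG_MODE = os.getenv("LOG_MODE", "info")

logging.basicConfig(
    level=logging.DEBUG if LOG_MODE == "debug" else logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)

log = logging.getLogger("orders")
log.info("Order received")           # emitted in both modes
log.debug("Payload: %s", {"id": 1})  # emitted (with extra overhead) only in debug mode
```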
Once all parts of the load simulation and monitoring setup are working and the sanity run confirms this, run small-scale load tests to ensure the tests run for the desired duration, simulate the desired behaviour, and yield clear indicators from the peak-user-load tests based on the workload models defined. This also requires ensuring that adequate data is available for the load tests. Accuracy of #data and correlation of parameters between different requests fired is also important, as the sketch below illustrates. For instance, in an order management application, if 1000 orders are fired and only 400 are successfully created, that will not create enough load for the #downstreamprocessing.
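Here is a hedged sketch of the correlation idea using Locust, a Python load-testing tool; the /orders endpoints, payload, and response fields are hypothetical:

```python
from locust import HttpUser, task, between

class OrderUser(HttpUser):
    """Creates an order, verifies it succeeded, and reuses (correlates) the
    returned order id in a follow-up request. Endpoints are hypothetical."""
    wait_time = between(1, 3)  # think time between iterations

    @task
    def create_and_fetch_order(self):
        with self.client.post("/orders", json={"sku": "ABC-1", "qty": 1},
                              catch_response=True) as resp:
            if resp.status_code != 201:
                # Failed creations mean less load reaches downstream processing.
                resp.failure(f"order not created: {resp.status_code}")
                return
            order_id = resp.json().get("id")
        # Correlate: feed the id from the first response into the next request.
        self.client.get(f"/orders/{order_id}", name="/orders/[id]")
```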
Based on the workload model defined, involving transactions, wait times, and anticipated response times, arrive at the overall transaction/request count for the given number of users (see the worked example below). Once the test is done, validate results such as the number of transactions completed during the test duration, alongside client-side metrics like average response time, 90th percentile response time, and error %. This gives a fair idea of whether you are achieving the desired transaction count and throughput per the workload model defined.
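A quick back-of-the-envelope check based on Little's Law helps validate the expected transaction count before the test is even run; all numbers below are illustrative:

```python
# Little's Law: concurrent users N = throughput X * (response time R + think time Z),
# rearranged as X = N / (R + Z). All figures below are illustrative.
users = 1000           # concurrent virtual users in the workload model
response_time_s = 2.0  # anticipated average response time per transaction
think_time_s = 8.0     # wait/think time between transactions
test_duration_s = 3600

throughput_tps = users / (response_time_s + think_time_s)  # 100 tx/sec
expected_transactions = throughput_tps * test_duration_s   # 360,000 in one hour

print(f"Expected throughput: {throughput_tps:.1f} tx/s")
print(f"Expected transactions in {test_duration_s}s: {expected_transactions:,.0f}")
```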
Client-side metrics give you performance from the end-user perspective, but they reflect the client-side experience alone. You will need to look beyond them to understand where performance issues arise. Focus on server-side metrics such as CPU utilization and memory utilization across the web, application, and database tiers.
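For illustration, here is a small Python sketch using the psutil library to sample CPU and memory on a single host. Real projects would rely on agent-based or APM monitoring per tier, so treat this only as a pointer to the metrics in question:

```python
import time
import psutil  # third-party: pip install psutil

# Sketch: sample basic server-side metrics on one host during a test run.
def sample_metrics(interval_s=5, samples=3):
    for _ in range(samples):
        cpu = psutil.cpu_percent(interval=1)   # % CPU over a 1-second window
        mem = psutil.virtual_memory().percent  # % physical memory in use
        print(f"cpu={cpu:.1f}% mem={mem:.1f}%")
        time.sleep(interval_s)

if __name__ == "__main__":
    sample_metrics()
```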
A key trait to develop in performance test execution and monitoring is to run a test, observe the client-side and server-side metrics, and dig down into logs and traces to the last method call with solutions like AppDynamics or New Relic whenever response times are high or resource utilization approaches its thresholds. Observability is an often-overlooked skill that adds real value in spotting problems and getting to their root. It is important to repeat these tests until root causes are sighted and analysed. Often, it is good to start from a low base: for instance, if the objective is to run load tests for 1000 concurrent users, it is better to start with 250 users and increment towards the target level, as sketched below. Collaboration with developers and infrastructure support teams, and the ability to engage with architects, are skills to be honed by performance test teams.
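The stepped ramp-up can be expressed directly in Locust via a custom load shape; the step size and durations below are assumptions, not a prescription:

```python
from locust import LoadTestShape

class SteppedRamp(LoadTestShape):
    """Sketch: step the load 250 -> 500 -> 750 -> 1000 users rather than
    jumping straight to the 1000-user target. Durations are illustrative."""
    step_users = 250
    step_duration_s = 600  # hold each step for 10 minutes
    max_users = 1000

    def tick(self):
        run_time = self.get_run_time()
        step = int(run_time // self.step_duration_s) + 1
        users = min(step * self.step_users, self.max_users)
        if run_time > (self.max_users // self.step_users + 1) * self.step_duration_s:
            return None  # stop the test after the final step has been held
        return (users, self.step_users)  # (target user count, spawn rate per second)
```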
Once the peak-user-load tests are done and the results are in line with the expected SLAs, the focus can shift to endurance tests, which help unearth memory leaks over long test durations. Memory leaks can lead to undesirable outcomes such as performance degradation and outages. Typically, endurance tests use a similar workload but run for a prolonged duration. Memory leaks may not always be avoidable, especially in browser-based applications, where memory is often not cleared efficiently on the client side; in such cases, restarting the browser once every 2-3 days may be a workaround (undesirable, but unavoidable at times).
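One simple way to spot a suspected leak in endurance test results is to fit a trend line to periodic memory samples; the sketch below uses illustrative sample data and an assumed threshold:

```python
# Sketch: flag a suspected memory leak from hourly RSS/heap samples taken
# during an endurance run. A steady upward slope is the usual symptom.
def leak_slope(samples_mb):
    """Least-squares slope in MB per sampling interval."""
    n = len(samples_mb)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(samples_mb) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, samples_mb))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

hourly_rss_mb = [512, 540, 569, 601, 633, 660, 694, 721]  # illustrative samples
slope = leak_slope(hourly_rss_mb)
if slope > 5:  # MB/hour threshold is an assumption; tune per application
    print(f"Possible leak: memory growing ~{slope:.1f} MB/hour")
```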
One aspect of performance testing is to focus not only on current or anticipated usage but also on identifying the point of failure, or the point at which performance begins to degrade. Stress tests are a great means of identifying these breaking points. Gauging stress test results can really help drive product go-live or no-go decisions.
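Here is a sketch of how a stepped stress test might search for that breaking point; run_load_step() is a hypothetical hook you would wire to your load tool, and the thresholds are assumptions:

```python
# Sketch of a stepped stress-test driver: keep raising the load until the
# error rate or 90th percentile response time crosses a threshold, then
# report the first load level where performance degrades.
def run_load_step(users):
    """Placeholder: run one load step and return (error_rate, p90_seconds)."""
    raise NotImplementedError("wire this to your load-testing tool")

def find_breaking_point(start=250, step=250, max_users=4000,
                        max_error_rate=0.05, max_p90_s=3.0):
    users = start
    while users <= max_users:
        error_rate, p90 = run_load_step(users)
        if error_rate > max_error_rate or p90 > max_p90_s:
            return users  # breaking point: first degraded load level
        users += step
    return None  # no breaking point found within the tested range
```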
To summarize, in today's modern cloud-based architectures, application performance is often overlooked because of the seemingly infinite processing capacity one can bring in, often in the name of the elastic cloud. That capacity comes at additional cost, and running production on an inefficient, high-cost infrastructure can decide between the success and failure of the business. It is very important that load, stress, and endurance tests (and sometimes volume tests, depending on the business requirement) are carried out before a large-scale cloud implementation.