Test Engineer Perspective-Part 1:

When we build a high-availability, enterprise-scale platform, it is important to measure its availability and reliability before going live.

But the question arises: is it feasible to do so given the constraints on infrastructure and time, and the need for sufficient data points?

Would love to hear opinions from like-minded professionals.

Personally, based on my experience evaluating such platforms, I would follow the approach below:

1. Set up a test environment that mirrors production, with all external interfaces connected to simulators.

2. Set up tools to generate input traffic, such as k6, JMeter, or Locust.

3. Set up APM tools such as Dynatrace, Grafana, ELK, or cloud-native services.

4. Set up alerts for the application stack, network, and server resources (CPU, memory, storage, etc.). Nagios is one tool for this, or one can write custom scripts if feasible.
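For the "custom scripts" option in step 4, here is a minimal Python sketch of a threshold check. The metric names and threshold values are hypothetical and only for illustration; in practice the sample would come from whatever agent or monitoring API is already in place.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    metric: str
    warning: float
    critical: float


# Hypothetical thresholds, in percent of capacity (assumed values).
THRESHOLDS = [
    Threshold("cpu_percent", warning=75.0, critical=90.0),
    Threshold("mem_percent", warning=80.0, critical=95.0),
    Threshold("disk_percent", warning=80.0, critical=90.0),
]


def evaluate(sample, thresholds=THRESHOLDS):
    """Return (metric, severity, value) for every breached threshold.

    sample: dict mapping metric name to its current value.
    Metrics missing from the sample are skipped.
    """
    alerts = []
    for t in thresholds:
        value = sample.get(t.metric)
        if value is None:
            continue
        if value >= t.critical:
            alerts.append((t.metric, "CRITICAL", value))
        elif value >= t.warning:
            alerts.append((t.metric, "WARNING", value))
    return alerts


# Example resource sample (made-up numbers).
alerts = evaluate({"cpu_percent": 93.0, "mem_percent": 78.0, "disk_percent": 85.0})
for metric, severity, value in alerts:
    print(f"{severity}: {metric}={value}")
```

Logging every alert with a timestamp during the test run makes it easier to correlate critical alerts with failures later.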

5. Create a working group of experts with skills in performance testing, network/infrastructure, development, databases, and operations.

6. Identify the metrics to be collected. In this context, we need to collect the following:

a. Uptime

b. Downtime

c. MTBF (Mean Time Between Failures) – average operating time between consecutive failures.

d. MTTR (Mean Time To Restore) – average time taken to restore the platform after a failure.

7. Execute an endurance test over a longer duration, say a week. Observe the platform and its components, and use the tools mentioned above to record the following:

- Number of failures

- Critical alerts

- Failure duration

- Time taken to repair/restore after failure
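One possible way to derive the number of failures and the failure durations is from periodic health-probe results. The sketch below assumes a hypothetical list of (timestamp, is_up) samples; the probe data and the one-minute interval are made up for illustration.

```python
from datetime import datetime, timedelta


def outage_windows(samples):
    """Turn health-probe samples into downtime windows.

    samples: list of (timestamp, is_up) tuples, sorted by timestamp.
    Returns a list of (down_start, up_again) tuples.
    """
    windows = []
    down_since = None
    for ts, is_up in samples:
        if not is_up and down_since is None:
            down_since = ts  # first failed probe opens a window
        elif is_up and down_since is not None:
            windows.append((down_since, ts))  # first good probe closes it
            down_since = None
    if down_since is not None:
        # Still down at the end of the run: close the window at the last sample.
        windows.append((down_since, samples[-1][0]))
    return windows


# Hypothetical probe results, one per minute.
t0 = datetime(2024, 1, 1, 0, 0)
probes = [True, True, False, False, False, True, True, False, True]
samples = [(t0 + timedelta(minutes=i), ok) for i, ok in enumerate(probes)]

windows = outage_windows(samples)
failure_count = len(windows)
downtimes = [end - start for start, end in windows]
```

Note that the measured durations are only as precise as the probe interval, so a shorter interval gives better estimates of the time taken to restore.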

8. The interesting question is: what if the platform does not fail? That sounds great (though it is unlikely), but without failures there are no data points for MTBF and MTTR. So, we need to apply chaos engineering to critical components during the test run and observe the above parameters.
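As an illustration of planning the fault injections, here is a small sketch that spreads a fixed number of injections randomly across the test window. The component names are hypothetical, and the actual fault injection would be carried out by whichever chaos tool or script the team uses; this only decides when and where.

```python
import random
from datetime import datetime, timedelta

# Hypothetical critical components (assumed names, not from a real platform).
COMPONENTS = ["api-gateway", "primary-db", "message-broker", "cache"]


def chaos_schedule(components, start, duration, injections, seed=0):
    """Return a time-sorted list of (inject_at, component) tuples.

    A fixed seed makes the plan reproducible across test runs.
    """
    rng = random.Random(seed)
    total_seconds = int(duration.total_seconds())
    plan = []
    for _ in range(injections):
        offset = rng.randrange(total_seconds)  # second within the test window
        plan.append((start + timedelta(seconds=offset), rng.choice(components)))
    return sorted(plan)


start = datetime(2024, 1, 1, 0, 0)
plan = chaos_schedule(COMPONENTS, start, timedelta(days=7), injections=5, seed=42)
for inject_at, component in plan:
    print(inject_at.isoformat(), component)
```

Keeping the plan alongside the recorded failures helps separate injected failures from organic ones when the metrics are calculated.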

9. Calculate the availability as

Availability (%) = (Uptime / Total Test Duration) * 100

where Uptime = Total Test Duration - ∑ Downtime

10. Calculate the MTBF as

MTBF = (∑ Uptime) / Total Number of Failures

11. Calculate the MTTR as

MTTR = (∑ Downtime) / Total Number of Failures

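12. Calculate reliability as

(MTBF / (MTBF + MTTR)) * 100

Note that with MTBF and MTTR defined as in steps 10 and 11, this ratio works out to the same value as the availability in step 9; it is the steady-state availability expressed in terms of MTBF and MTTR.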
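The calculations in steps 9 to 12 can be sketched in a few lines of Python. The function name and the input numbers (a one-week run with three outages) are hypothetical and only meant to show the arithmetic.

```python
def availability_metrics(total_duration_s, downtimes_s):
    """Compute the metrics from steps 9 to 12.

    total_duration_s: total test duration in seconds.
    downtimes_s: list with the duration of each failure in seconds.
    MTBF, MTTR and the ratio are None when no failure was recorded.
    """
    failures = len(downtimes_s)
    total_downtime = sum(downtimes_s)
    uptime = total_duration_s - total_downtime

    availability = uptime / total_duration_s * 100
    mtbf = uptime / failures if failures else None
    mttr = total_downtime / failures if failures else None
    ratio = mtbf / (mtbf + mttr) * 100 if failures else None

    return {
        "availability_pct": availability,
        "mtbf_s": mtbf,
        "mttr_s": mttr,
        "mtbf_mttr_ratio_pct": ratio,
    }


# Made-up example: one-week run with three outages of 10, 20 and 30 minutes.
WEEK_S = 7 * 24 * 3600
metrics = availability_metrics(WEEK_S, [600, 1200, 1800])
for name, value in metrics.items():
    print(name, value)
```

For these made-up inputs, the MTBF is 200400 seconds, the MTTR is 1200 seconds, and both percentage figures come out at roughly 99.40%.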


Would like to cover more practical aspects in my next post.
