Reliability of Load Testing
With on-demand cloud servers, we can generate a production-sized load for performance testing. It is not uncommon for one or more load-generating instances to misbehave or even crash during a test, spoiling the run. Retries are expensive in performance testing. The more load generators in the test, the more likely something will go wrong. Say one server's reliability is 99%, and you need 20 such servers to generate peak load. When you generate load simultaneously from all 20 servers, and the test needs every one of them to stay healthy, the system's reliability is not 99% but (0.99)^20 ≈ 82%, assuming the servers fail independently. Why is this important?
- When you are scheduling and planning performance tests, build some buffer into your plan, and do not assume every test will complete successfully.
- When choosing a solution to run a load test, choose a system that can generate more load from a single server.
- When building a homegrown system using open-source tools, optimize each server to maximize its output, rather than just adding more servers.
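The arithmetic behind these points can be checked with a minimal Python sketch. It assumes that load generators fail independently and that a test run succeeds only if every generator stays healthy; the function names `fleet_reliability` and `expected_attempts` are hypothetical, chosen only for illustration.

```python
def fleet_reliability(per_server: float, servers: int) -> float:
    """Probability that every server survives the test.

    Assumes independent failures and that one failed server spoils the run.
    """
    if not 0.0 <= per_server <= 1.0:
        raise ValueError("per_server must be a probability between 0 and 1")
    if servers < 0:
        raise ValueError("servers must be non-negative")
    return per_server ** servers


def expected_attempts(success_probability: float) -> float:
    """Average number of runs until one succeeds (geometric distribution)."""
    if not 0.0 < success_probability <= 1.0:
        raise ValueError("success_probability must be in (0, 1]")
    return 1.0 / success_probability


if __name__ == "__main__":
    # Compare the 20-server fleet from the example with a hypothetical
    # fleet of 10 servers that each generate twice the load.
    for servers in (20, 10):
        p = fleet_reliability(0.99, servers)
        print(f"{servers} servers: reliability {p:.1%}, "
              f"expected attempts {expected_attempts(p):.2f}")
```

With 99% per-server reliability, 20 servers give roughly 82% and 10 servers roughly 90%, which is why squeezing more load out of each server improves the odds of a clean run.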