Machine Learning and Load Testing: When Was the Last Time You Ran 60 Load Tests in a Day?
Image generated by OpenAI. I'm sorry, I really have no graphical skills

This article is my review of Akamas (Akamas.io), a solution to optimize your platform configuration, using machine learning. It's not sponsored and contains my opinions and impressions of the free trial. Make sure to check them out if you think you're too busy for tweaking and tuning.

"If it works, don't touch it"

It's a common rule not to disturb a well-defined and well-tested code base or configuration without justification. The main motivation for this rule is usually to prevent the cascading effect of any change brought to the system. It's never "just a line of code" or "just one property". Modern systems and applications are convoluted, and an improvement to one resource may result in a regression in another. An increase in memory size may result in longer garbage collection (GC) pauses. An increased thread count that was supposed to increase your throughput suddenly creates thread contention, slowing your application down. Batching requests improves resource efficiency, but at the cost of end-to-end times. A configuration that works in isolation doesn't necessarily guarantee the same results after integration or deployment on another platform.

Application performance is full of trade-offs like that, and reaching the perfect balance of resource usage and user satisfaction takes a lot of expertise and resources—and mostly, trial and error. These trials and errors come at a cost, and depending on your testing maturity, this cost can get high quickly. Apart from the costs of hardware, the workload required to run each test, and, of course, the time spent analyzing the results and preparing the report, the biggest cost is the time not spent on testing new deliveries, effectively blocking the delivery pipeline.

Prerequisites

Meeting the prerequisites for testing with Akamas is... worth it on its own. The operational requirements are really just a set of good practices, so I'll list them out, just in case you're still testing like it's 2011:

  • Continuous Testing: Whenever you hear "performance" and "continuous" in the same sentence, it typically means an independent, repeatable test that can run at any given time with no human intervention. Cleanup for problems like data drift, storage limits, and bloated database tables needs to be fully automated so they don't affect the next test run. This also means no manual steps: no updating that one property before each test because it was once entered as an attribute, no transferring that one file, no generating new test data by hand. You get the idea: I want to run a test, and it's running within a few minutes. The shorter the time between test runs, the faster your optimization converges. Machine learning requires lots of data, and it's best if you deliver it fast. Consider mocking to minimize the impact of external dependencies on your test results.
  • Continuous Monitoring: This is for monitoring the KPIs and SLAs. You should be able to monitor these and let the Akamas engine parse them. Akamas integrates with major observability providers, particularly Dynatrace, but even if you don't use any, it won't say no to a structured CSV file to gather the metrics, or an exposed Prometheus endpoint.
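To make the CSV option concrete, here is a minimal sketch of what exporting load-test metrics as a structured CSV might look like. The column names (`timestamp`, `component`, `metric`, `value`) and the sample metrics are my own illustration, not the exact schema Akamas expects:

```python
import csv
import io
from datetime import datetime, timedelta, timezone

def export_metrics_csv(samples, out):
    """Write load-test samples as a structured CSV, one row per
    (timestamp, component, metric) sample. The column layout is
    illustrative only -- check the Akamas docs for the real schema."""
    writer = csv.writer(out)
    writer.writerow(["timestamp", "component", "metric", "value"])
    for ts, component, metric, value in samples:
        writer.writerow([ts.isoformat(), component, metric, value])

# Usage: three samples from a hypothetical test run
start = datetime(2024, 1, 1, tzinfo=timezone.utc)
samples = [
    (start, "webapp", "response_time_p95", 210.0),
    (start, "webapp", "error_rate", 0.2),
    (start + timedelta(seconds=30), "webapp", "response_time_p95", 198.0),
]
buf = io.StringIO()
export_metrics_csv(samples, buf)
print(buf.getvalue())
```

The point is simply that the metrics pipeline should be scriptable end to end, so every test run publishes its data with no human in the loop.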


Using Akamas

Once you have your app integrated, you start by creating a study. This is a prime example of enforcing best practices because you can't create a study without defining a goal. Let's cover the basic elements of Akamas.

Main dashboard with my studies and their status

Akamas Study

A Study is an entity holding your experiment's requirements, criteria, and execution. Before you can run any tests with Akamas, you need to define the study goals. The most common ones are minimizing cluster costs, reducing response times, or lowering error rates. It could also be all three. This is probably the most important part of test execution because the parameters will be scored and weighted based on the outcome of these goals.


Setting Up a Study Goal

Goal Constraints

You can specify experiment assertions and flag them as failed if they don't meet certain criteria. These can be either absolute or relative to the baseline. The most common ones would relate to your SLOs, like response times or error rates. You could also specify your own, as long as Akamas has access to the metrics.
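To illustrate the difference between absolute and baseline-relative constraints, here is a toy sketch of how such checks could be evaluated. Akamas does this internally; the function, metric names, and thresholds below are all hypothetical:

```python
def check_constraints(metrics, baseline, constraints):
    """Evaluate experiment metrics against absolute and
    baseline-relative constraints; return a list of failure messages.
    Purely illustrative -- not Akamas's actual evaluation logic."""
    failures = []
    for name, rule in constraints.items():
        value = metrics[name]
        if "max" in rule and value > rule["max"]:  # absolute limit
            failures.append(f"{name}={value} exceeds {rule['max']}")
        if "max_vs_baseline" in rule:              # relative to baseline
            limit = baseline[name] * rule["max_vs_baseline"]
            if value > limit:
                failures.append(f"{name}={value} worse than {limit:.1f}")
    return failures

# Usage: this experiment improved the error rate but regressed latency
baseline = {"response_time_p95": 200.0, "error_rate": 0.5}
metrics = {"response_time_p95": 260.0, "error_rate": 0.3}
constraints = {
    "response_time_p95": {"max_vs_baseline": 1.2},  # no worse than +20%
    "error_rate": {"max": 1.0},                     # absolute SLO
}
print(check_constraints(metrics, baseline, constraints))
```

An experiment that violates a constraint like this gets flagged as failed, so the engine learns to steer away from that region of the parameter space.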


Adding goal constraints based on Metrics



KPIs


The Akamas engine will automatically add your study goals and constraints as KPIs. These KPIs will be the main factors for auto-tuning and will be used to determine the most optimal configuration. You can also add custom KPIs in this step. They serve as a guide for parameter definition.

KPIs populated based on the study goal and goal constraints



Windowing

If you've ever tried comparing multiple test runs, you'll have learned that they won't always look the same. Even if you run the same workload against the same application, external (or internal) factors can cause instability in your load test. Running the same test on a cold vs. a warmed-up environment may yield different results. To overcome this challenge, Akamas can automatically pick the timeframe of the test where the metrics are the most stable. Alternatively, you can trim the metrics yourself if you know the ramp-up in your test is enough to warm up the environment.
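Automatic windowing can be thought of as a sliding-window search for the least variable stretch of the run. This is my own rough analogue of the idea, not Akamas's actual method:

```python
import statistics

def most_stable_window(series, width):
    """Return (start, end) indices of the contiguous window of `width`
    samples with the lowest standard deviation -- a rough analogue of
    automatic windowing, not Akamas's actual implementation."""
    best_start, best_sd = 0, float("inf")
    for i in range(len(series) - width + 1):
        sd = statistics.pstdev(series[i:i + width])
        if sd < best_sd:
            best_start, best_sd = i, sd
    return best_start, best_start + width

# Usage: noisy ramp-up/warm-up, then a stable plateau at the end
latencies = [900, 610, 430, 310, 250, 205, 201, 199, 202, 200]
print(most_stable_window(latencies, 4))  # -> (6, 10), the plateau
```

Scoring only the stable window means cold-start noise doesn't pollute the comparison between experiments.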


Windowing options

Parameters

These are your inputs, and the Akamas engine will try optimizing them with each experiment (iteration). You can specify the exact parameters you want your experiments to focus on, or, if you want to take full advantage of the engine and test out many possibilities, you can just mark them all and let Akamas decide which parameter might help you achieve your goal.
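As a mental model, the parameter space is a mix of numeric ranges and categorical choices the engine can draw candidates from. The option names below mirror common JVM settings, but the ranges and the sampling helper are made up for the example:

```python
import random

# Illustrative JVM parameter domains an optimizer might explore --
# the names mirror common JVM options, the ranges are invented.
parameter_space = {
    "jvm_maxHeapSize_mb": {"type": "integer", "range": (256, 4096)},
    "jvm_newSize_mb": {"type": "integer", "range": (64, 1024)},
    "jvm_gcType": {"type": "categorical",
                   "values": ["Serial", "Parallel", "G1", "ConcMarkSweep"]},
}

def sample_configuration(space, rng=random):
    """Draw one random candidate configuration from the space."""
    config = {}
    for name, domain in space.items():
        if domain["type"] == "integer":
            lo, hi = domain["range"]
            config[name] = rng.randint(lo, hi)
        else:
            config[name] = rng.choice(domain["values"])
    return config

print(sample_configuration(parameter_space))
```

Marking all parameters simply widens this space; the engine then decides which dimensions actually move the goal metrics.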

Parameters selection based on available options from JVM


Running Studies

This is the stage where the magic happens. The convenient part is that you don't have to analyze every single run. The best part is leaving the study running over the weekend and coming back on Monday to the best configuration for your setup. You do, of course, get insights for each test run and can see the impact of each parameter change.

Akamas starts with application baselining: that's your reference run, using default or hinted parameters. Any deviation from the baseline KPIs is included in the study scoring and lets Akamas decide whether to pursue further changes to specific parameters.




Algorithm

At its heart, the Akamas optimization algorithm is an implementation of reinforcement learning. This means more accurate, efficient, and predictable results for multidimensional time-series data inputs. If you really want the details, they've published a paper on the algorithm's behavior:

https://15799.courses.cs.cmu.edu/spring2022/papers/07-knobs2/p1401-cereda.pdf

And a granted patent in the US:

https://patents.google.com/patent/US11755451B2/en

The key addition to the standard reinforcement learning algorithms you might know is adaptability to varying conditions of your workload and environment state. Noisy neighbors, partial environment instability, sudden workload changes: all of these may generate false signals to the engine. You can read all the details in the published papers.
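To give a flavor of the explore/exploit idea behind this kind of optimization, here is a deliberately tiny toy loop over a single parameter. It caricatures the concept only; it is not Akamas's patented algorithm, and the latency curve is invented:

```python
import random

def optimize(evaluate, low, high, step, iterations=30, epsilon=0.3, seed=1):
    """Toy explore/exploit search over one numeric parameter.
    With probability epsilon pick a random value (explore); otherwise
    nudge the best value found so far by one step (exploit).
    A caricature of the idea -- not Akamas's actual algorithm."""
    rng = random.Random(seed)
    best = (low + high) // 2          # "baseline" value
    best_score = evaluate(best)
    for _ in range(iterations):
        if rng.random() < epsilon:
            cand = rng.randrange(low, high + 1, step)            # explore
        else:
            cand = min(high, max(low, best + rng.choice([-step, step])))
        score = evaluate(cand)
        if score < best_score:        # lower score = better (e.g. p95 latency)
            best, best_score = cand, score
    return best, best_score

# Hypothetical latency curve with a sweet spot around heap = 512 MB
def fake_latency(heap_mb):
    return (heap_mb - 512) ** 2 / 1000 + 200

best, score = optimize(fake_latency, 256, 1024, 32)
print(best, round(score, 1))
```

The real engine additionally has to cope with noisy scores, many interacting dimensions, and shifting environment conditions, which is exactly what the paper and patent above address.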

Visual Representation of a 3D path finding algorithm based on reinforcement learning.

Study Results

In my example, Akamas managed to cut response times by 28%, just by modifying my heap size parameters, and it did well tuning parameters I had left undefined. At first glance, the total heap size remained the same, but tweaking the new (young) generation size yielded the best results. For the GC type, it chose Parallel, which is most suitable for small-heap applications. It also tried other collectors, like G1, Serial, and ConcMarkSweep, but looking at the numbers from the experiments, Parallel had the biggest potential for this workload and application. I was very pleased with the progress and accuracy of the test results. Setting a fixed new size is not the first choice of developers and is usually a last resort when dynamic sizing doesn't fulfill the criteria, but based on the end result, it turned out to be a good option.
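For readers less familiar with JVM tuning, a winning configuration of this shape (Parallel GC plus a fixed young generation) would translate into command-line flags roughly like the ones below. The sizes are placeholders, not the values Akamas actually chose in my study:

```python
# Hypothetical java command line for the kind of winning configuration
# described above: Parallel GC plus a fixed young-generation size.
# All sizes are placeholders, not the study's real output.
jvm_flags = [
    "-XX:+UseParallelGC",     # GC algorithm picked by the study
    "-Xms512m", "-Xmx512m",   # fixed total heap
    "-XX:NewSize=192m",       # fixed young-generation ("new") size
    "-XX:MaxNewSize=192m",
]
command = ["java", *jvm_flags, "-jar", "app.jar"]
print(" ".join(command))
```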


Technology Stack Support

Akamas comes with a plethora of optimization packs. I've only tried the JVM one, but the number of supported technologies is listed below. Each optimization pack comes with predefined parameters with already defined constraints. You can also combine optimization packs and set up your study to tune both your application and your Kubernetes cluster for your workload to ensure they fit together. One of the free trial studies showcases the optimization of a JVM application running on Kubernetes.


Optimization Packs Available

It also supports the major Load Testing tools out of the box, as well as the industry-standard monitoring tools.

Risks

Overfitting: Like all machine learning algorithms, reinforcement learning is prone to overfitting. This means that after too many iterations and too much training, the resulting parameters will only work for your test workload and the specific application conditions. The accuracy of the optimization depends greatly on how well your load tests represent your actual production workload. The tests you run also rarely include the "ghost" requests untraceable by APMs, which may contribute greatly to your overall application utilization. Keep that in mind before applying overfitted parameters to your production system. You should also add KPIs that keep some resources in reserve, such as an 80% CPU constraint, to leave a buffer for unforeseen load.
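The headroom idea boils down to a one-line guard. The function name and the 80% ceiling below are just an illustration of the kind of KPI you'd add:

```python
def cpu_headroom_ok(cpu_utilization_pct, ceiling_pct=80.0):
    """Reject configurations that consume too much CPU, keeping a
    buffer for load the test workload didn't model -- the '80% CPU
    constraint' mentioned above. Names and threshold are illustrative."""
    return cpu_utilization_pct <= ceiling_pct

print(cpu_headroom_ok(72.5))  # efficient config with headroom
print(cpu_headroom_ok(93.0))  # fast but saturated config, rejected
```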


Testing in Production

If you don't have a designated performance team or suitable tests to simulate your production workload, Akamas has an option to perform a Live Study: optimization on your live system. You can let Akamas monitor your current production metrics and provide recommendations to be applied manually by your team. Because I don't have a production system at the moment, I couldn't try this option, but here are the details in case you're interested:

https://docs.akamas.io/akamas-docs/using/study/live-optimization-studies


Opportunities

Exploration and Exploitation: Akamas doesn't have "best practices" embedded into its engine. And that's a good thing because it's not biased towards a configuration propagated by all the books and articles you can find online. Why? Because most often, the recommended parameters are for general use cases, and your application might have a specific footprint where standard configurations will not apply. This allows you to easily explore new options and configurations, and if they're proven beneficial, find their absolute limits by exploiting surrounding parameters.


My Impressions

Akamas provides an innovative approach to tuning your application: a sole focus on the KPIs and SLOs (which is all that matters) and a well-defined goal, turning your application into a black box. The more distributed your system, the more resources and costs your experiments can cut. For monolithic applications, the risk of overfitting might push you away from the "best" recommendations toward more conservative ones that are still more beneficial in the long run.

One thing I didn't cover is the CLI support, which is how Akamas is intended to be used in an enterprise setup.

Does it substitute for a domain expert if it yields equal or better results? One thing it doesn't tell you is why an applied parameter changed the behaviour of the application. Understanding the impact of a change helps shape your application's structure for the future. That's why oversight from someone familiar with the domain might still be required to make sure the configuration is also future-proof and will scale properly. On the other hand, you could always run a new study against the new version and ship a completely new configuration with each release.

It's definitely worth trying, especially if you have a lot of exploration tasks in your backlog that you know will take a long time, like trying out a new GC algorithm or simply cutting down the costs of your app. Akamas can do it for you. They have a free trial available here if you want to check them out.



Jakub Dering
Tech Lead for Performance Engineers / Conference Speaker
