My review of the AI-based test generation tools

With the recent advances in generative AI models, many new AI-based test generation tools have emerged. I’d like to share my experience with them so far.

TL;DR: AI-based test generation tools aren’t mature yet, but they have a lot of potential to reduce the inefficiency of software testing. Because their output is probabilistic, a more reliable tool that gives developers complete control can complement them.

Here are the products I have researched:

  • ChatGPT
  • TestPilot, from the team that made GitHub Copilot. It’s in preview for paid Copilot users. It currently works only with JavaScript/TypeScript.
  • CodiumAi. It’s free and supports multiple programming languages.
  • DiffBlue. It supports Java only. It’s the only one that claims not to be based on an LLM (large language model).
  • CodeUtils. It supports multiple programming languages. However, unlike the others, it is not integrated with major IDEs.
  • SapientAi. It supports Java only and is free.

Usability

All of the products I have tried have usability issues. That said, they are a giant leap forward from earlier attempts. For simple use cases, they are impressive: they can predict the test inputs and assertions completely, which is much more efficient than writing unit tests manually.

SapientAi failed to launch in the IntelliJ IDE.

CodiumAi still requires copying and pasting the generated tests and manually fixing errors in them (such as imports). Generating tests also takes a while.

Correctness

I tested CodiumAi on my Python project, going beyond simple use cases.

The generated tests are not always correct, especially for more complex functions. For example, it generated an incorrect target string for a patch statement and made up class names.
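To illustrate the patch problem, here is a minimal sketch using Python’s standard unittest.mock. The function config_exists and the file path are hypothetical, made up for illustration; they are not from my project or from CodiumAi’s output. patch takes a dotted string naming the object to replace, so a target that is off by even one character fails as soon the patch is entered, before the code under test runs.

```python
import os.path
from unittest.mock import patch


def config_exists(path):
    """Hypothetical function under test: reports whether a config file exists."""
    return os.path.exists(path)


def check_with_correct_target():
    # "os.path.exists" names a real attribute, so patch can replace it.
    with patch("os.path.exists", return_value=True) as fake_exists:
        result = config_exists("/no/such/config.ini")
        fake_exists.assert_called_once_with("/no/such/config.ini")
    return result


def check_with_wrong_target():
    # "os.path.exist" (missing the final "s") stands in for the kind of
    # near-miss target a generated test might contain. Entering the patch
    # raises AttributeError because the attribute does not exist.
    try:
        with patch("os.path.exist", return_value=True):
            config_exists("/no/such/config.ini")
    except AttributeError as exc:
        return exc
    return None
```

A wrong target that names a nonexistent attribute at least fails loudly. A target that exists but is not the name the code under test actually looks up is harder to spot, because the patch is applied but never takes effect.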

My empirical finding is consistent with a paper describing TestPilot. The authors “evaluated it on 25 npm packages with a total of 1,684 API functions to generate tests for” and report that “11.8%–74.4% of the tests generated by TESTPILOT are passing tests, with an overall median of 47.1% across all packages.”

As with other LLM-based tools, the generated tests are a “best guess”. Developers shouldn’t blindly trust them, and in many cases additional debugging and editing are required. Here is an example of ChatGPT’s weaknesses.

One exception is DiffBlue: according to an interview with its CEO, its generated tests will always pass. I haven’t seen enough data to assess how useful the generated tests are, though.

Update of generated tests

None of the tools seems to offer a way to manage or update the generated tests. As generating tests gets easier and their number grows, the ability to manage them easily becomes more important.

Control

The tools are convenient since they generate tests with minimal input. However, there are times when developers want more control. For example, they may want to test a specific input combination, mock out one dependency while using real instances for the others, or verify the parameters passed to mocked calls.
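Here is a minimal hand-written sketch of that kind of control, using Python’s standard unittest.mock. All names in it (PriceCalculator, OrderService, the mailer’s send method) are hypothetical and exist only for illustration. The test picks one specific input combination, keeps the real calculator, mocks only the mailer, and checks the exact arguments passed to the mocked call.

```python
from unittest.mock import Mock


class PriceCalculator:
    """Hypothetical real dependency: cheap and deterministic, so it is not mocked."""

    def total(self, unit_price, quantity):
        return unit_price * quantity


class OrderService:
    """Hypothetical class under test with two dependencies."""

    def __init__(self, calculator, mailer):
        self.calculator = calculator
        self.mailer = mailer

    def place_order(self, email, unit_price, quantity):
        total = self.calculator.total(unit_price, quantity)
        self.mailer.send(email, f"Your total is {total}")
        return total


def run_controlled_test():
    mailer = Mock()  # mocked dependency: no real email is sent
    service = OrderService(PriceCalculator(), mailer)  # real calculator

    # A specific input combination chosen by the developer.
    total = service.place_order("a@example.com", 5, 3)

    assert total == 15
    # Verify the exact parameters passed to the mocked call.
    mailer.send.assert_called_once_with("a@example.com", "Your total is 15")
    return total
```

Each of these choices (which inputs, which dependency to mock, what to assert about the mocked call) is a decision a developer may want to make explicitly rather than leave to a generator.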

CodiumAi, for example, gives some limited options and offers free-form prompts. The free-form prompts didn’t work for me in my tests.

Privacy

All the tools except DiffBlue require transferring the code under test to remote servers.

My test tool TestScribe and AI-based tools can complement each other

TestScribe is a free, open-source tool. It runs locally, gives you complete control over the tests, always generates correct tests, and supports easy updates of generated tests.

Use AI-based tools for simple cases if they work for you, and use TestScribe for the cases AI-based tools can’t handle well. You may also feed the AI-suggested inputs into TestScribe.

In the future, it may be possible to use AI to generate the inputs used by TestScribe too.


Justin Strong

Founder | Software Engineer | Community Builder

1 year ago

This is a great summary of some of the tools available in the space! I've also been developing a tool for this called DeepUnit which generates JavaScript and TypeScript tests. We always generate passing tests too! Ruigo, I'd love for you to be able to try it out and will offer you and anyone else interested a free GPT-4 Pro Plan for product feedback

Maurus Spescha

Developer of MyITest4U

1 year ago

If you are not impressed by the AI-based test generation tools you could try MyITest4U (https://myitest4u.com/). MyITest4U is able to generate real Selenium based tests for Web applications and test templates for Desktop applications. The generated tests for Web applications can fill out forms, test all cell values of tables and much more. The use of the test generators of MyITest4U does not need any programming skills and they are user friendly.

LiBin Lu

Project Manager | Quality Assurance | Product and Service Delivery

1 year ago

Thank you for sharing Ray Yang
