Automated Test Case Generation - LLM-Based Software Engineering
(Source: https://gennaspiritedaway.wordpress.com/2017/11/19/chihiro-and-the-boiler-man/, Spirited Away)

First there was Dev. Just Dev.

Some years passed, and this changed to DevOps. Dev + Ops. Two worlds, but one person inhabiting them both.

Now, we've got DevSecOps. Dev + Security + Ops. Three worlds, and one person scurrying from one to the other at all times. With two-pizza teams and the tech downturn, developers are stretched to capacity. It reminds me of one of the characters from Hayao Miyazaki's film Spirited Away: Kamaji. Kamaji is the boiler man serving the bath house where most of the film is set. He is a yōkai[1], half spider, half man.

If anyone is in desperate need of help, it is the developer. Can GenAI help?

A paper authored by engineers from Meta hopes so. The paper is titled Automated Unit Test Improvement using Large Language Models at Meta[2]. Taking care to remind readers that the aim is to augment human developers, not replace them, the authors lay out a novel use of GenAI in the software development lifecycle. Enhancing unit tests.

Yes, unit tests, that highly necessary but often involved aspect of software development. You'll hear many a developer (including me) say that nowadays, with mocks, spies and the like, good, thorough unit tests sometimes take longer to write than the actual code.

This paper feeds in the class under test and the existing unit test as prompts to the LLM, requesting that the test class be enhanced. Out comes the enhanced test class. Just as distributed systems designers have trained themselves to design assuming failure, AI application developers are encouraged to design assuming hallucinations.
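The prompting step described above can be sketched in a few lines. This is only an illustration: the function name build_prompt and the prompt wording are hypothetical and are not taken from the paper.

```python
def build_prompt(class_under_test: str, existing_test: str) -> str:
    """Combine the two inputs the article mentions into one prompt.

    The wording below is an assumption made for illustration; it is not
    the prompt used in the paper.
    """
    return (
        "Here is a class under test:\n"
        f"{class_under_test}\n\n"
        "Here is its existing unit test class:\n"
        f"{existing_test}\n\n"
        "Write an extended version of the test class that adds test cases "
        "for behaviour the existing tests do not cover."
    )
```

Whatever text the model returns is treated as a candidate only, which is where the filtering comes in.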

Here, taking direct aim at hallucinations, the authors propose a three-phase filtration process.

The generated test class must:

1. Compile correctly

2. Not be flaky (i.e. if run N times, it passes all N times)

3. Improve test coverage

If the generated test class meets all three criteria, it is presented as a diff, ready to be reviewed and merged into the code base.
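The three filters can be pictured as a short chain of checks. The sketch below is a minimal illustration, not the paper's implementation: the names filter_candidate and FilterResult are hypothetical, the build, test-run and coverage steps are passed in as plain callables because they depend on the project's tooling, and the default of 5 runs simply stands in for N.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class FilterResult:
    accepted: bool
    reason: str


def filter_candidate(
    candidate: str,
    compiles: Callable[[str], bool],    # hypothetical hook: does it build?
    run_tests: Callable[[str], bool],   # hypothetical hook: one test run
    coverage: Callable[[str], float],   # hypothetical hook: coverage with candidate
    baseline_coverage: float,
    runs: int = 5,                      # placeholder for the article's N
) -> FilterResult:
    # Filter 1: discard candidates that do not compile.
    if not compiles(candidate):
        return FilterResult(False, "does not compile")
    # Filter 2: run the tests N times; any single failure rejects the candidate.
    for _ in range(runs):
        if not run_tests(candidate):
            return FilterResult(False, "fails or is flaky")
    # Filter 3: keep the candidate only if coverage rises above the baseline.
    if coverage(candidate) <= baseline_coverage:
        return FilterResult(False, "no coverage gain")
    return FilterResult(True, "passed all three filters")
```

Ordering the checks from cheapest to most expensive means a hallucinated test class that does not even build is thrown away before any test runs are spent on it.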

This could pair nicely with a code-specific foundation model (such as the new IBM Granite granite-34b-code-instruct[3]).

Kamaji would be happy to have such help!

[1] Yōkai are a class of supernatural entities and spirits in Japanese folklore.

[2] https://arxiv.org/abs/2402.09171

[3] https://www.ibm.com/products/watsonx-ai/foundation-models

