week46 - Evaluating Large Language Models in Detecting Test Smells and Code Review Patterns and Anti-patterns
Evaluating Large Language Models in Detecting Test Smells
Test smells are coding issues in test code that typically arise from inadequate practices, a lack of knowledge about effective testing, or deadline pressure to complete projects. Their presence can harm the maintainability and reliability of software. Tools based on advanced static analysis or machine learning can detect test smells, but they often demand considerable effort to set up and use. This study evaluates how well Large Language Models (LLMs) can detect test smells automatically. We assessed ChatGPT-4, Mistral Large, and Gemini Advanced against 30 types of test smells, drawn from the literature, across codebases in seven different programming languages. ChatGPT-4 identified 21 types of test smells, Gemini Advanced identified 17, and Mistral Large detected 15. In conclusion, the LLMs showed potential as a valuable aid in identifying test smells.
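To make the target of such detection concrete, below is a minimal, hand-written Python/pytest sketch of two well-known smells from the test smell literature, "Magic Number Test" and "Assertion Roulette". The snippet is illustrative only; it is not taken from the study's codebases or prompts.

# Illustrative only: two common test smells of the kind such detectors aim to flag.

def apply_discount(price: float, rate: float) -> float:
    return price * (1 - rate)

# "Magic Number Test": unexplained literal values obscure the test's intent.
def test_discount_magic_numbers():
    assert apply_discount(200.0, 0.25) == 150.0  # why 200, 0.25, and 150?

# "Assertion Roulette": several assertions without messages, so when one fails
# it is hard to tell which expectation broke.
def test_discount_assertion_roulette():
    assert apply_discount(100.0, 0.0) == 100.0
    assert apply_discount(100.0, 0.5) == 50.0
    assert apply_discount(100.0, 1.0) == 0.0

A detector, whether a static analysis tool or an LLM given this file, would ideally flag both tests; the study compares how many of the 30 smell types each model recognizes.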
Effective Teaching through Code Reviews: Patterns and Anti-patterns
Code reviews are a ubiquitous and essential part of the software development process. They also offer a unique, at-scale opportunity to teach developers in the context of their day-to-day development activities, rather than in a more removed and formal setting such as a class. Yet there is little research on effective teaching through code reviews, that is, on learning for the author and not just changes to the code. We address this gap through a case study at Google: interviews with 14 developers revealed 12 patterns and 15 anti-patterns in code reviews that affect learning. For instance, explanatory rationale, sample solutions backed by standards, and a constructive tone facilitate learning, whereas harsh comments, excessive shallow critiques, and non-pragmatic reviewing that ignores authors' constraints hinder it. We validated our qualitative findings through member checking, interviews with reviewers, a literature review, and a survey of 324 developers. This comprehensive study provides empirical evidence of how social dynamics in code reviews impact learning. Based on our findings, we offer practical recommendations on how to frame constructive reviews and create a supportive learning environment.