week46 - Evaluating Large Language Models in Detecting Test Smells and Code Review Patterns and Anti-patterns
Evaluating Large Language Models in Detecting Test Smells
Test smells are coding issues in test code that typically arise from inadequate practices, a lack of knowledge about effective testing, or deadline pressure to complete projects. Their presence can harm the maintainability and reliability of software. Tools based on advanced static analysis or machine learning can detect test smells, but they often demand considerable effort to set up and use. This study evaluates how well Large Language Models (LLMs) can detect test smells automatically. We assessed ChatGPT-4, Mistral Large, and Gemini Advanced against 30 types of test smells, drawn from the literature, across codebases in seven different programming languages. ChatGPT-4 identified 21 types of test smells, Gemini Advanced identified 17, and Mistral Large detected 15. In conclusion, the LLMs showed potential as a valuable aid in identifying test smells.
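To make the target of such detection concrete, below is a minimal, hand-written Python/pytest sketch of two well-known smells from the test smell literature, "Magic Number Test" and "Assertion Roulette". The snippet is illustrative only; it is not taken from the study's codebases or prompts.

# Illustrative only: two common test smells of the kind such detectors aim to flag.

def apply_discount(price: float, rate: float) -> float:
    return price * (1 - rate)

# "Magic Number Test": unexplained literal values obscure the test's intent.
def test_discount_magic_numbers():
    assert apply_discount(200.0, 0.25) == 150.0  # why 200, 0.25, and 150?

# "Assertion Roulette": several assertions without messages, so when one fails
# it is hard to tell which expectation broke.
def test_discount_assertion_roulette():
    assert apply_discount(100.0, 0.0) == 100.0
    assert apply_discount(100.0, 0.5) == 50.0
    assert apply_discount(100.0, 1.0) == 0.0

A detector, whether a static analysis tool or an LLM given this file, would ideally flag both tests; the study compares how many of the 30 smell types each model recognizes.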
Effective Teaching through Code Reviews: Patterns and Anti-patterns
Code reviews are a ubiquitous and essential part of the software development process. They also offer a unique, at-scale opportunity to teach developers in the context of their day-to-day development activities, rather than in a more removed and formal setting such as a class. Yet there is little research on effective teaching through code reviews, that is, on learning for the author and not just changes to the code. We address this gap through a case study at Google: interviews with 14 developers revealed 12 patterns and 15 anti-patterns in code reviews that affect learning. For instance, explanatory rationale, sample solutions backed by standards, and a constructive tone facilitate learning, whereas harsh comments, excessive shallow critiques, and non-pragmatic reviewing that ignores authors' constraints hinder it. We validated our qualitative findings through member checking, interviews with reviewers, a literature review, and a survey of 324 developers. This comprehensive study provides empirical evidence of how social dynamics in code reviews impact learning. Based on our findings, we offer practical recommendations on how to frame constructive reviews and create a supportive learning environment.