ChatGPTDD

Test Driven Development with ChatGPT

Context

Recently, I set out to solve a fairly complex coding problem. I've been using ChatGPT as my coding assistant for a while, but the difference this time was that I wanted to approach this using Test Driven Development (TDD).

This was inspired by several conversations with Dave Hudson and Cyrus Nouroozi, both of whom have been investigating how Large Language Models (LLMs) can be leveraged to build tools that generate code.

With this inspiration in mind, I wanted to see how well ChatGPT could handle a challenging problem using TDD.

The Problem

The task I set out to solve was deceptively simple at first glance, but in practice, it was full of complexity. The challenge: given a C# type (for example, typeof(byte), typeof(List<>), typeof((string, Guid)), etc.), implement a method that outputs the correct C# type declaration for that type. However, the format of the type declaration had to be configurable via a flags enum, with the following rules:

  1. No flags: If no flags are set, the type declaration should be formatted using its simple type name.
  2. UseNamespaceQualifiedTypeNames flag: If this flag is set, the type declaration should use the namespace-qualified type name, if available, and fall back to the simple name if not.
  3. UseAliasedTypeNames flag: This flag formats the type using its C# type alias (if it has one). This supersedes the namespace-qualified name rule but falls back to that if no alias is found.
  4. UseNullableShorthandTypeNames flag: When set, nullable value types should be unwrapped from their Nullable<T> wrapper and formatted with a "?" suffix. This flag takes precedence over any generic type rules.
  5. UseGenericTypeArguments flag: If enabled, the type declaration should include its generic argument types.
  6. UseValueTupleSyntax flag: This formats ValueTuple types with more than one argument using tuple syntax (e.g., (T1, T2) instead of ValueTuple<T1, T2>). It overrides the UseGenericTypeArguments rule.
  7. UseValueTupleNames flag: When set, it formats ValueTuple types using tuple syntax and includes an element name for each argument (e.g., (T1 Item1, T2 Item2)). This flag overrides both the tuple syntax and generic type argument rules.

Additional Considerations

The method, GetTypeDeclaration, must also handle nested types recursively. For instance, the same rules should apply to generic argument types within a type. While the rules are well defined, as you can imagine, this is not a trivial problem.
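To make the precedence and recursion concrete, here is a minimal sketch of rules 1 to 5. It is not the implementation from this article: the enum name TypeDeclarationFlags, its bit values, the class name TypeDeclarationSketch, and the partial alias table are all assumptions for illustration. It leaves out the two ValueTuple rules and does not handle arrays or nested classes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical enum name and bit values; only the member names come from the rules above.
[Flags]
public enum TypeDeclarationFlags
{
    None = 0,
    UseNamespaceQualifiedTypeNames = 1 << 0,
    UseAliasedTypeNames = 1 << 1,
    UseNullableShorthandTypeNames = 1 << 2,
    UseGenericTypeArguments = 1 << 3,
    UseValueTupleSyntax = 1 << 4, // declared but not handled in this sketch
    UseValueTupleNames = 1 << 5,  // declared but not handled in this sketch
}

public static class TypeDeclarationSketch
{
    // Deliberately partial alias table; a full version would list every C# keyword alias.
    private static readonly Dictionary<Type, string> Aliases = new Dictionary<Type, string>
    {
        [typeof(byte)] = "byte",
        [typeof(int)] = "int",
        [typeof(long)] = "long",
        [typeof(bool)] = "bool",
        [typeof(string)] = "string",
        [typeof(object)] = "object",
    };

    public static string GetTypeDeclaration(Type type, TypeDeclarationFlags flags)
    {
        // Open generic parameters (the T in List<>) are printed by name only.
        if (type.IsGenericParameter)
            return type.Name;

        // Rule 4: nullable shorthand is checked first, so it wins over the generic rules.
        var underlying = Nullable.GetUnderlyingType(type);
        if (underlying != null && flags.HasFlag(TypeDeclarationFlags.UseNullableShorthandTypeNames))
            return GetTypeDeclaration(underlying, flags) + "?";

        // Rule 3: an alias supersedes the namespace-qualified name when one exists.
        if (flags.HasFlag(TypeDeclarationFlags.UseAliasedTypeNames)
            && Aliases.TryGetValue(type, out var alias))
            return alias;

        // Rules 1 and 2: simple name, or namespace-qualified name when available.
        bool qualify = flags.HasFlag(TypeDeclarationFlags.UseNamespaceQualifiedTypeNames)
            && !string.IsNullOrEmpty(type.Namespace);
        string name = qualify ? type.Namespace + "." + type.Name : type.Name;

        if (!type.IsGenericType)
            return name;

        // Reflection names generic types like "List`1"; drop the arity suffix.
        int tick = name.IndexOf('`');
        if (tick >= 0)
            name = name.Substring(0, tick);

        if (!flags.HasFlag(TypeDeclarationFlags.UseGenericTypeArguments))
            return name;

        // Rule 5, applied recursively so the same flags govern every nested argument.
        var arguments = type.GetGenericArguments().Select(a => GetTypeDeclaration(a, flags));
        return name + "<" + string.Join(", ", arguments) + ">";
    }
}
```

Under these assumptions, calling GetTypeDeclaration on typeof(Dictionary<string, List<int?>>) with UseAliasedTypeNames, UseNullableShorthandTypeNames and UseGenericTypeArguments combined returns "Dictionary<string, List<int?>>", while the same call with no flags returns just "Dictionary". Ordering the checks from highest to lowest precedence is one simple way to encode the "supersedes" and "falls back" wording of the rules.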

The Solution

Although I’m not dogmatic about TDD, it was definitely the right approach for this task, so I started by writing a failing test. The initial test suite included about 50 test cases, enough to cover most of the main code paths and edge cases.
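For illustration, a failing-first test might look something like the sketch below. The article does not say which test framework was used, so xUnit is an assumption here, as are the names TypeDeclarationFlags, TypeDeclarations, and the test class. The expected strings are my reading of the rules listed earlier, not cases taken from the real suite. The stub throws on purpose, so every case starts out red.

```csharp
using System;
using Xunit; // assumption: the xUnit package; the original framework is not stated

// Hypothetical enum name, trimmed to the members this sketch uses.
[Flags]
public enum TypeDeclarationFlags
{
    None = 0,
    UseNamespaceQualifiedTypeNames = 1 << 0,
    UseAliasedTypeNames = 1 << 1,
    UseNullableShorthandTypeNames = 1 << 2,
}

public static class TypeDeclarations
{
    // Stub only: throwing keeps every test red until a real implementation exists.
    public static string GetTypeDeclaration(Type type, TypeDeclarationFlags flags)
        => throw new NotImplementedException();
}

public class GetTypeDeclarationTests
{
    // Each InlineData row is one test case: input type, flags, expected declaration.
    [Theory]
    [InlineData(typeof(byte), TypeDeclarationFlags.None, "Byte")]
    [InlineData(typeof(byte), TypeDeclarationFlags.UseNamespaceQualifiedTypeNames, "System.Byte")]
    [InlineData(typeof(byte), TypeDeclarationFlags.UseAliasedTypeNames, "byte")]
    [InlineData(typeof(int?),
        TypeDeclarationFlags.UseAliasedTypeNames | TypeDeclarationFlags.UseNullableShorthandTypeNames,
        "int?")]
    public void FormatsTypeAccordingToFlags(Type type, TypeDeclarationFlags flags, string expected)
    {
        Assert.Equal(expected, TypeDeclarations.GetTypeDeclaration(type, flags));
    }
}
```

A data-driven layout like this keeps each case to a single line, which makes it cheap to grow a suite from dozens of cases to hundreds and easy to paste failing rows into a prompt.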

The Lightbulb Moment

Once I had my failing test cases, I had a sudden realisation: Why not just feed all of this — the test, test cases, the flags enum, and a description of the problem — into ChatGPT and see what it generates?

To clarify, this approach differs from my usual process, where I typically describe the problem, receive some code, refine it, and then proceed to testing.

To my pleasant surprise, ChatGPT produced a working implementation that passed around 80% of the initial 50 test cases. This was a solid foundation, but there were still gaps. So I fed the failing test cases back into ChatGPT, explaining why each one failed. I also added more test cases as I uncovered additional edge cases, continuously refining the code.

By the end of the process, I had a suite of 571 test cases, all of which passed!

Was it Perfect?

No. I’m not going to pretend that ChatGPT wrote the most elegant or efficient code. In fact, the code wasn’t particularly clean, and in some areas, it was downright messy. However, if we follow Kent Beck’s mantra — “Make it work. Make it right. Make it fast” — ChatGPT achieved the first step: it worked.

I spent another hour or two manually refactoring the code into something more beautiful, efficient, and elegant. While beauty in code is subjective, the final solution was something I was satisfied with. Most importantly, running all 571 test cases took just 21 milliseconds, so the solution was not only correct but fast.

Conclusion

This was my first time using ChatGPT in a full TDD cycle. While I’ve used ChatGPT for development tasks before, this process felt different. There was a lot of back-and-forth interaction between ChatGPT and my IDE. While the workflow wasn’t as smooth as I would have liked, the overall outcome was a success.

Using ChatGPT in a TDD context worked surprisingly well. It helped automate a significant portion of the grunt work and sped up the process of iterating on my code. I’ll admit, the code wasn’t perfect straight out of ChatGPT, but it provided a functional foundation that I could refine.

For developers curious about pairing TDD with ChatGPT, I’d say it’s definitely worth experimenting with. Even if the code isn’t immediately production-ready, ChatGPT can handle the heavy lifting, letting you focus on the polish and optimisation afterward.

By the end of the process, I had a complex problem fully solved, 571 passing test cases, and an efficient piece of code, all thanks to ChatGPT-driven development.

Would I use ChatGPT for TDD again? Absolutely.

Christopher Diggins

Entrepreneur, Computer Scientist, Public Speaker, and Instructor

1 month ago

This is a very interesting approach. Automating it via the ChatGPT API would be an interesting next step!