Some Notes on Using AI for Coding

I've been coding more recently -- and using AI a LOT. I don't have a magic bullet or simple recommendation beyond: Use it! Use many tools! Stay up to date!

Here is a somewhat rambling compendium of useful experiences and techniques.

Introduction: The Role of AI in Coding

AI is now an indispensable asset in coding. And it gets better every month – this evolution has been swift and stunning; so much so that I hesitate to write advice that is sure to be obsolete by Christmas.

In late 2023, using AI for programming felt like hiring an intern: helpful but clumsy, often needing close supervision and constant rewrites. Today, AI is a capable collaborator, more like an untiring, chipper, mid-level developer who produces code that’s functional and efficient from the get-go.

Since I am (or was?) an experienced architect and designer, it lets me in particular focus on design instead of grunt work. AI is far from infallible, its quirks can be exasperating, and the sheer number of tools is overwhelming; but when used wisely, AI vastly accelerates and improves my work. It even allows me to code a bit without ramping up extensively on each new technology I don't have time to learn.

Capabilities and Limitations of AI in Coding Tasks

Strengths in Code Generation

AI produces code quickly and efficiently, conforming to a wide array of best practices, from idioms to proper use of language features - many of which I was not aware of. Recently, I used Vercel v0 to build a fully functional, visually pleasing UI layout in under an hour. Vercel v0 goes beyond snippets or corrections: it is integrated with an immediately deployed rendering, so you can see what the UI looks like, closing the feedback loop without a cumbersome deploy, review, fix, cut/paste cycle. It was very much a "v0," and I (or, actually, a colleague) needed to make extensions to hook it up and complete it.

Right off the bat, I had to use another AI (Claude) to refactor the monolithic file; but the GUI looks great, and I got there despite my lack of GUI coding chops.

Tools like GPT-4o and Claude are now an even more integral part of my workflow, and they are accurate enough that their generated code works on the first pass. Both are like skilled junior to mid-level developers, rapidly generating code that works, though often without an eye toward structure or insightful design themes. Great code needs to be anchored in a few guiding principles or themes, and figuring out those themes is currently beyond the AI tools.

Because the tools incorporate such a wide array of best practices, they are also good at adding solid error handling, structure, formatting, and cleaning up code when asked. (When not asked, they tend, like the aforementioned junior developer, to produce code that works but lacks some of these extras.)
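
To illustrate (my own sketch, not actual model output): unprompted, you tend to get something like the first function below; explicitly asking for robust error handling tends to yield something closer to the second.

import json
from pathlib import Path

# Typical "just make it work" output: no validation, no error handling.
def load_config(path):
    with open(path) as f:
        return json.load(f)

# What asking for solid error handling tends to produce: explicit checks
# for the common failure modes (missing file, malformed JSON).
def load_config_safely(path: str) -> dict:
    config_path = Path(path)
    if not config_path.is_file():
        raise FileNotFoundError(f"Config file not found: {config_path}")
    try:
        return json.loads(config_path.read_text())
    except json.JSONDecodeError as e:
        raise ValueError(f"Malformed JSON in {config_path}: {e}") from e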

Comparison of Specific AI Models

Claude 3.5 vs. GPT-4o

Claude 3.5 and GPT-4o each have strengths that make them suited for different coding scenarios. Claude excels in speed, synthesis, and best-practice application. When a quick fix is all I need, Claude is reliable, finding most solutions without delay. It’s particularly adept with syntactic accuracy and formatting, which adds a layer of polish to quick fixes and small functions.

GPT-4o, meanwhile, has an edge when it comes to architecture and cohesion, but seems to miss more details. For example, when I had a .tsx import issue that stumped GPT-4o, Claude fixed it almost immediately. Still, GPT-4o remains my go-to for larger, structural suggestions—it grasps the interplay between modules and classes, suggesting designs that keep code elegant and adaptable.

Another AI failure came when I was trying to get some components into a Vercel-generated GUI using Claude 3.5. Claude kept generating vast amounts of code that was actually almost identical to the popular shadcn components I was already using. It should have advised me on getting new and better components, not tried to re-invent the wheel (which it is capable of doing!). If I had not caught and corrected it, I'd have ended up with far too much AI-generated code rather than a small set of dependencies on a standard library.

These examples highlight the value of having a “toolbox” of AIs: each model brings its own edge, and some surprises.

It depends….

So no clear winner among the AIs. At least to me, right now. A bag of anecdotes, and a landscape of competing, shifting tools to sort through anew every month as they evolve. Sorry!

If I were to “score” them, I’d give Claude points for agility and correctness and GPT-4o points for structural insight, with each adding unique value that makes it worth trying both when you’re not sure which tool is ideal for a job, or when one tool is flailing.

Specific IDEs: Cursor vs. GitHub Copilot

Cursor has taken the lead as my preferred tool over GitHub Copilot. Cursor handles multi-line edits deftly, recalls my recent work as it watches me type across files, and brings in contextually relevant suggestions from files other than the one I’m working on - e.g., suggesting new methods from one file while I’m editing another. For reasons I am a little unclear on (or have forgotten), Cursor outshines Copilot, likely because it pulls in more context and makes edits across lines throughout a file, rather than only at the exact spot I’m editing.

And this may be wrong by the time you read it: AI in coding is evolving at breakneck speed, and new features such as ChatGPT’s file upload and “code” options may already have challenged Cursor’s lead.

Feedback loops—like Vercel’s instant rendering and potentially AWS Q’s options (which I haven’t tested yet)—suggest we’re heading toward more specialized, interactive AI designed to iterate, test, and respond in real time. These targeted AIs promise to make programming even faster and easier, if you can guide them well.

Debugging Challenges and AI Limitations

Common Issues in AI Debugging

For all its strengths, AI sometimes still struggles with debugging, most of all when hitting an issue that is not already widely documented and present in the AI’s training content. AI models are superb at repeating and consolidating conventional patterns but struggle with novel or non-standard issues; sometimes it seems I’m getting the “greatest hits” of error responses rather than a tailored solution that reasons through the failed approaches already tried.

A challenge I encountered repeatedly with the popular AI frameworks LangChain and LlamaIndex, for instance, is that the AI often suggested outdated imports, demonstrating how rapidly changing libraries can confound even the latest models’ training. (This observation is already 2-3 months old, however.) Metaphorically, the existing code for both frameworks was put into a semantic blender (the LLM training data), and the resulting code-smoothie includes bits and bobs from different versions.
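
To make the code-smoothie concrete, here is the shape of the problem as I hit it with LangChain (a sketch; the exact import paths depend on which version you are pinned to):

# What the AI would often suggest - valid for older LangChain releases,
# but stale after the library was split into separate packages:
#   from langchain.chat_models import ChatOpenAI             # old layout
#   from langchain_community.chat_models import ChatOpenAI   # transitional

# The layout after the langchain-openai package split:
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o")
print(llm.invoke("Say hello in one word.").content)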

On a different note, Cursor recently introduced a debugging feature, which I haven’t tested yet. My impression is that it helps with the dynamic process of debugging, keeping us, the humans, in the loop, rather than statically analyzing code and error responses for issues.

More AI Tools and Specific Features

ChatGPT and Claude with Code Capabilities

As noted above, both OpenAI’s gpt-* models and Claude are effective for code generation and debugging, and Cursor currently wins out over Copilot. They all do a fantastic job, and their strengths and quirks differ.

Other Tools to Explore

Beyond these, which are in my toolbox this month, a spate of new tools is already here.

The GPT o1-preview model has worked better for me when an overall approach or high-level architecture is required, while ChatGPT’s “code” option and file upload feature have provided contextual guidance in my limited initial use. I have not yet configured or tried Cursor’s AI-assisted debugging, but if it is human-assisted dynamic observation of the program, that sounds like a brilliant idea. So keep your eyes open and try your own combinations.

Multi-File Editing Techniques

Combining Code for Better Context

One technique or trick that’s served me well is combining multiple files into a single, consolidated file for the AI to process. By merging a set of related files into one file organized into blocks, each prefixed with its original filename, I can give the AI a broader yet focused context that yields good recommendations. Adding separators and other organizational cues helps the AI “see” relationships across files, improving its grasp of interdependencies.
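
Here is a minimal sketch of the trick (the separator format and the example file names are my own invention, not any standard):

from pathlib import Path

def combine_files(paths, out_path="combined_for_ai.txt"):
    # Merge several source files into one block-structured file,
    # prefixing each block with its original filename so the AI can
    # "see" cross-file relationships.
    sections = []
    for p in map(Path, paths):
        header = "=" * 60
        sections.append(f"{header}\nFILE: {p.name}\n{header}\n{p.read_text()}")
    Path(out_path).write_text("\n\n".join(sections))
    return out_path

# Example usage (hypothetical project files):
# combine_files(["models/user.py", "services/auth.py", "api/routes.py"])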

My guess is that GPT canvases and Claude Projects already do a better job at making a group of files available to the AI, but it will be a while before I can evaluate the differences.

Conclusion: Current State and Future of AI in Coding

Overall, the progress AI has made in coding, architecture and debugging is remarkable. In most cases, AI now writes code that works the first time, which I can then refine as needed. As the field races ahead, I expect the current frontier - issues with complex structures and debugging subtleties - to be addressed as well.

Despite an occasionally narrow rather than holistic view, AI is already an absolute game changer.

Knowing how and when to use AI, how to phrase the prompts, and how to use and compare tools is perhaps the most important coding skill today. Too often I see people struggle with a problem when they should instead be thinking primarily of how to describe the problem to the right AI tool.

A meta-note on how this blog was written (with AI help, of course)

The overall process was

write -> structurally refine -> rewrite -> tweak style and language        

And the more specific steps, that alternated between my work and AI help were:

Write (human) -> extract outline (AI) -> revise outline (human) 
  -> rewrite (AI) -> writing style guidance and rewrite (human) -> rewrite (AI) 
    -> final edit (human).        

It seems long, but was actually an organic back and forth that was quick and easy.

This process illustrates a form of chain-of-thought reasoning: by structuring the outline, refining it, and then rewriting based on a solid framework, each step logically prepared the ground for the next. It was also a human-in-the-loop approach, which let me guide the AI by focusing on structural questions first, without diving into specifics that might have distracted or confused us both. Finally, it broke the process into simpler, more focused stages, separating structural organization from writing, and writing from language and style tweaking.

A detailed history of writing this blog:

In detail, then:

I wrote the first version pretty quickly - getting the ideas and experiences on the screen.

I then shifted to structural work, first asking AI to extract the existing outline, even if it was a little jumbled. It was rough enough that I asked AI for an initial restructuring of the outline (not the text!).

I reviewed and manually tweaked the revised outline (so, still in the structural phase) to ensure it emphasized my themes and priorities. I went beyond editing a traditional “outline” and treated it as a long instruction prompt to the AI about how to restructure the original content, so I would sometimes add non-outline hints or comments such as “(be sure to tie this back to how fast AI is changing).”

Structural and organizational work done, I then asked AI to rewrite the original text using the new outline.

Predictably, the output was anodyne and corporate. (E.g. currently AI will NEVER say “anodyne.”) So I prompted it to rewrite using the ideas I have learned from Steven Pinker’s excellent “The Sense of Style” writing treatise: be direct, use interesting words when they help rather than harm, repeated structures with three claims and similar cadence, etc.

The results of that were ok, but also (again predictably) somewhat awkward and corporate. I suppose AI has been trained on so much corporate, academic and otherwise stilted writing that it just can’t help it. The AI-generated text was ok, but also sounded like a sophomore trying to get a better grade by using a thesaurus to find and insert odd words.

A final rewrite was sorely needed (plus this post-blog meta note) and now:

“Claude’s your uncle!”

