Imitation Game: LLM Code Generators and LQ
Image Source: Mark Seery with Midjourney

Introduction: A Game of Imitation

Every time we throw a new level of complexity at LLMs, we get unpredictable results and experience a learning curve. A year ago the challenge was reliably formatted output; now it is software components interacting with each other. In using one of the more recent environments, Windsurf, I have discovered something else: wrangling the default behaviors of the environment, which sometimes conflict with what I want. In addition, even my previous instructions to the environment can box me into a corner that is hard to get out of.

For the first time using LLMs, I experienced fear about what runaway LLMs might lead to. When their actions are driven by a desire to "help", that help can be hidden, hard to understand, and the cause of a working application being totally screwed up. As we throw more complexity at LLMs, the suppliers of LLMs will, in my opinion, need to expose the intended default behavior, along with the interfaces to influence and control it. When an LLM gets on a roll of multiple changes, it can quickly get a system into a state that is hard to understand; worse, the LLM can go into debugging loops that chew up time and money.

While Alan Turing is popularly remembered for a test of whether a machine acts like a human, he is also remembered for the store, executive unit, and control paradigm of a machine, and for an idea distilled in the movie named after the game in his famous paper, The Imitation Game: the right question is not whether machines think, but how they think.

The entire world is adjusting from a computing paradigm that follows the exacting, rules-based approach of programming languages to one that is sometimes unpredictable and increasingly driven by behaviors we might not be aware of. The task of a modern coder, or a modern coding environment, is to work out how the LLMs they are working with think, and to adjust to that. Not being able to do so has significant consequences for the individual coder, the industry, and perhaps even humanity itself.

In this article I discuss some of the ways I managed to tame the "beast" a little. Not perfectly, but enough to help me feel more in control and less the victim of random acts of kindness.

There Is No Intelligence Magic That Eliminates Specs

The term "AI" may lead us to believe we can do complex projects without specs, validation tests, unit tests, integration tests and other historical development practices. Hey, the AI is smart, it will work it out. NOOOOOO. If you cannot get an AI to adhere to a spec, then in the middle of a perfectly working app, it will create some new code, decide the rest of the code does not interface with the new code well, and change all the parts of the App that were already working - followed by an infinite loop of debugging exercises that makes the whole codebase difficult to undersand and impossible to work together.

However, it is not as simple as that: specs have to be in a form that LLMs can easily, accurately, and repeatably digest. Those who are well down the path of pipeline automation may already have this, but it is a point worth making: the question is what will work best for LLMs.

Far from eliminating the need for specs and contracts between components, I would say they are even more essential with LLMs, which may lack some of the experience and intuition of a human. You can, of course, always get the LLM to create the specs and review them yourself. The trick then is to get the LLM to consistently adhere to them.
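To make that concrete, here is a minimal sketch of what an LLM-digestible spec can look like: an interface contract expressed as code, plus a test that any implementation must pass. The OrderStore component and its methods are hypothetical, invented purely for illustration; the point is that a contract the environment can execute is much harder for an LLM to silently drift away from than prose.

```python
from typing import Protocol


class OrderStore(Protocol):
    """Contract: the interface other components (and the LLM) must honor.

    A hypothetical component, for illustration only.
    """

    def add(self, order_id: str, amount_cents: int) -> None:
        """Store an order. amount_cents must be >= 0."""
        ...

    def total(self) -> int:
        """Return the sum of all stored amounts, in cents."""
        ...


def check_contract(store: OrderStore) -> None:
    """Executable spec: any implementation, human- or LLM-written, must pass."""
    store.add("a-1", 250)
    store.add("a-2", 750)
    assert store.total() == 1000, "total() must sum stored amounts"


class InMemoryOrderStore:
    """Reference implementation the LLM is told not to break."""

    def __init__(self) -> None:
        self._orders: dict[str, int] = {}

    def add(self, order_id: str, amount_cents: int) -> None:
        if amount_cents < 0:
            raise ValueError("amount_cents must be >= 0")
        self._orders[order_id] = amount_cents

    def total(self) -> int:
        return sum(self._orders.values())


if __name__ == "__main__":
    check_contract(InMemoryOrderStore())
    print("contract holds")
```

With something like this checked in, "adhere to the spec" stops being a vibe and becomes a command the environment can run after every change.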

Knowing When Enough is Enough

When you do get into one of those debugging rabbit holes with an LLM, you need to know when you have spent enough of your life on an issue that the LLM is never going to resolve.

Sometimes I will say something like "Step back, and consider the fundamental issues". LLM-based coding environments can get so fixated on minutiae that they never look at the bigger picture unless you ask them to. Sometimes this works and sometimes it does not. You may have to delete the function or service and start again. If you don't have enough architecture, design, and specs in place, you may even have to delete the app and start again.

Whatever your approach, you have to recognize when your life is disappearing before your eyes on an issue that is never going to be resolved. You have to be disciplined about giving the LLM a reasonable number of tries, and about changing direction when it is clear that is not going to work.
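One way to impose that discipline is to make the budget explicit instead of counting attempts in your head. The sketch below assumes two hypothetical hooks, ask_llm_for_fix and tests_pass, standing in for whatever your environment actually provides; the point is the hard stop, not the plumbing.

```python
# A minimal sketch of a fix-attempt budget. ask_llm_for_fix() and
# tests_pass() are hypothetical stand-ins for your environment's hooks.

MAX_ATTEMPTS = 3  # a reasonable number of tries before changing direction


def ask_llm_for_fix(issue: str, attempt: int) -> str:
    """Hypothetical: ask the coding environment to propose a fix."""
    return f"proposed fix #{attempt} for: {issue}"


def tests_pass() -> bool:
    """Hypothetical: run your validation suite and report the result."""
    return False  # stands in for a stubborn, unresolved issue


def debug_with_budget(issue: str) -> bool:
    for attempt in range(1, MAX_ATTEMPTS + 1):
        fix = ask_llm_for_fix(issue, attempt)
        print(f"attempt {attempt}: {fix}")
        if tests_pass():
            return True
    # Budget exhausted: stop digging and change direction.
    print("Budget spent. Step back: delete and rewrite, or revisit the spec.")
    return False


if __name__ == "__main__":
    debug_with_budget("intermittent failure in the payments component")
```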

The Challenge With Advanced Behaviors

LLMs cannot think like humans, yet. The newer coding environments are clearly implementing some type of default behaviors, via system prompts, memories, and mechanisms unknown to me. All I know is that sometimes I have to wrangle with them.

It is really nice to have a helper that goes ahead and checks current files and does a whole bunch of other things without asking you. Really nice, that is, until the moment you realize the LLM has run riot based on a bad interface implementation it created, or has done something else you did not know it was going to do. It is at times like this that you go into angry parent mode: "Don't do anything without asking me first!!!" That is not sustainable for a parent or a manager. In addition, you box yourself into a corner that bites you later. You have to find a middle ground. I give my LLMs directions like "XYZ is a core source of truth, don't change it without asking me". Guardrails around the important stuff.

There is never any guarantee an LLM won't lose the plot, but you can limit how often it happens and limit the damage when it does: enough control, and enough autonomy, that on net your overall productivity is better.
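Guardrails also work better when they are enforced mechanically rather than merely stated in a prompt. Here is a minimal sketch of that idea: a check that fails whenever a "core source of truth" file shows up in the uncommitted diff, so unsolicited edits get caught before they land. The protected paths are hypothetical, and the sketch assumes you are working in a git repository.

```python
# A minimal sketch: mechanically enforce "don't change it without asking me"
# by failing whenever protected files appear in the uncommitted diff.
# The protected paths are hypothetical; assumes this runs inside a git repo.

import subprocess
import sys

PROTECTED = {
    "specs/api_contract.yaml",   # hypothetical core source of truth
    "db/schema.sql",             # hypothetical: schema changes need sign-off
}


def changed_files() -> set[str]:
    """List files modified in the working tree, according to git."""
    out = subprocess.run(
        ["git", "diff", "--name-only"],
        capture_output=True, text=True, check=True,
    )
    return {line.strip() for line in out.stdout.splitlines() if line.strip()}


def main() -> int:
    touched = changed_files() & PROTECTED
    if touched:
        print("Protected files changed without review:")
        for path in sorted(touched):
            print(f"  {path}")
        return 1  # nonzero exit: block the commit or stop the agent run
    return 0


if __name__ == "__main__":
    sys.exit(main())
```

Run something like this as a pre-commit hook or after each agent session: the prompt-level instruction sets the expectation, and the check catches the times the LLM forgets.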

Why Are You Doing That!!!

Sometimes frustration sets in when an LLM keeps doing something you have told it not to do. In those moments, when you realize you have told it a million times without impact, you have to dig for deeper answers.

When working with humans, it is sometimes helpful to know why someone is acting in a certain way, and the same is true with LLMs. Often, when I ask "why", I get one or more rounds of apologies, to which I respond, "I don't want an apology, I want to know why ..." Those moments can be very revealing. For example, a development environment might differentiate between a memory and a persistent memory, or you might learn something about the default behavior installed in a development environment and how you are fighting it, and, more importantly, how you might override it now that you know about it.

Don't forget to ask "why"; there is much to be learned when you do, whether dealing with humans or LLMs.

Conclusion

The success of LLMs is leading to their being used for increasingly complex requests: not just code snippets anymore, but multiple components working together, perhaps an entire application or system. That means a new learning curve, new dangers, and a new period of uncertainty until we tame the LLM-based environments designed to be "helpful".

The question is not whether LLMs think, but how they think. While the old adage that LLMs are just "next token predictors" may be useful for those engineering foundation models, it is not the experience the rest of us will increasingly be having. We will be experiencing LLMs that automatically do many things in response to one prompt, and that have default behaviors embedded in them.

In the human world, we recognize some people as being good at working with other humans; we refer to them as having high emotional intelligence (EQ). In the new world of coding, I am betting that people with a high LQ (LLM quotient) will be the most productive users of LLM-based development environments.
