Imitation Game: LLM Code Generators and LQ
Image Source: Mark Seery with Midjourney

Introduction: A Game of Imitation

Every time we throw a new level of complexity at LLMs, we get unpredictable results and experience a learning curve. A year ago the challenge was reliably formatted output; now it is software components interacting with each other. In using one of the more recent environments, Windsurf, I have discovered something else: wrangling the default behaviors of the environment, which sometimes conflict with what I want. In addition, even my previous instructions to the environment can box me into a corner that is hard to get out of.

For the first time using LLMs, I experienced fear about what runaway LLMs might lead to. When their actions are driven by a desire to "help", that help can be hidden, hard to understand, and the cause of a working application being totally screwed up. As we throw more complexity at LLMs, the suppliers of LLMs will, in my opinion, need to expose the intended default behavior, along with the interfaces to influence and control it. When an LLM gets on a roll of multiple changes, it can quickly get a system into a state that is hard to understand; worse, the LLM can go into debugging loops that chew up time and money.

While Alan Turing is popularly remembered for a test of whether a machine acts like a human, he is also remembered for the store, executive unit, and control paradigm of a machine, and for an idea distilled in the movie named after the game in his famous paper, The Imitation Game: the right question is not whether machines think, but how they think.

The entire world is adjusting from a computing paradigm that follows the exacting, rules-based approach of programming languages to one that is sometimes unpredictable and increasingly driven by behaviors we might not be aware of. The task of a modern coder, or a modern coding environment, is to work out how the LLMs they are working with think, and to adjust to that. Not being able to do so has significant consequences for the individual coder, the industry, and perhaps even humanity itself.

In this article I discuss some of the ways I managed to tame the "beast" a little. Not perfectly, but enough to help me feel more in control and less the victim of random acts of kindness.

There Is No Intelligence Magic That Eliminates Specs

The term "AI" may lead us to believe we can do complex projects without specs, validation tests, unit tests, integration tests and other historical development practices. Hey, the AI is smart, it will work it out. NOOOOOO. If you cannot get an AI to adhere to a spec, then in the middle of a perfectly working app, it will create some new code, decide the rest of the code does not interface with the new code well, and change all the parts of the App that were already working - followed by an infinite loop of debugging exercises that makes the whole codebase difficult to undersand and impossible to work together.

However, it is not as simple as that: specs have to be in a form that LLMs can easily, accurately, and repeatably digest. Those who are well down the path of pipeline automation may already have this, but it is a point worth making: the question is what will work best for LLMs.

Far from eliminating the need for specs and contracts between components, I would say they are even more essential with LLMs, which may lack some of the experience and intuition of a human. You can, of course, always get the LLM to create the specs and review them yourself. The trick then is to get the LLM to consistently adhere to them.
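To make that concrete, here is a minimal sketch of what an LLM-digestible spec can look like: an interface contract expressed as code, plus a test that any implementation must pass. The OrderStore component and its methods are hypothetical, invented purely for illustration; the point is that a contract the environment can execute is much harder for an LLM to silently drift away from than prose.

```python
from typing import Protocol


class OrderStore(Protocol):
    """Contract: the interface other components (and the LLM) must honor.

    A hypothetical component, for illustration only.
    """

    def add(self, order_id: str, amount_cents: int) -> None:
        """Store an order. amount_cents must be >= 0."""
        ...

    def total(self) -> int:
        """Return the sum of all stored amounts, in cents."""
        ...


def check_contract(store: OrderStore) -> None:
    """Executable spec: any implementation, human- or LLM-written, must pass."""
    store.add("a-1", 250)
    store.add("a-2", 750)
    assert store.total() == 1000, "total() must sum stored amounts"


class InMemoryOrderStore:
    """Reference implementation the LLM is told not to break."""

    def __init__(self) -> None:
        self._orders: dict[str, int] = {}

    def add(self, order_id: str, amount_cents: int) -> None:
        if amount_cents < 0:
            raise ValueError("amount_cents must be >= 0")
        self._orders[order_id] = amount_cents

    def total(self) -> int:
        return sum(self._orders.values())


if __name__ == "__main__":
    check_contract(InMemoryOrderStore())
    print("contract holds")
```

With something like this checked in, "adhere to the spec" stops being a vibe and becomes a command the environment can run after every change.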

Knowing When Enough is Enough

When you do get into one of those debugging rabbit holes with an LLM, you need to know when you have spent enough of your life on an issue that the LLM is never going to resolve.

Sometimes I will say something like "Step back, and consider the fundamental issues". LLM-based coding environments can get so fixated on minutiae that they never look at the bigger picture unless you ask them to. Sometimes this works and sometimes it does not. You may have to delete the function or service and start again. If you don't have enough architecture, design, and specs in place, you may even have to delete the app and start again.

Whatever your approach, you have to recognize when your life is disappearing before your eyes on an issue that is never going to be resolved. You have to be disciplined about giving the LLM a reasonable number of tries, and about changing direction when it is clear that is not going to work.
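One way to impose that discipline is to make the budget explicit instead of counting attempts in your head. The sketch below assumes two hypothetical hooks, ask_llm_for_fix and tests_pass, standing in for whatever your environment actually provides; the point is the hard stop, not the plumbing.

```python
# A minimal sketch of a fix-attempt budget. ask_llm_for_fix() and
# tests_pass() are hypothetical stand-ins for your environment's hooks.

MAX_ATTEMPTS = 3  # a reasonable number of tries before changing direction


def ask_llm_for_fix(issue: str, attempt: int) -> str:
    """Hypothetical: ask the coding environment to propose a fix."""
    return f"proposed fix #{attempt} for: {issue}"


def tests_pass() -> bool:
    """Hypothetical: run your validation suite and report the result."""
    return False  # stands in for a stubborn, unresolved issue


def debug_with_budget(issue: str) -> bool:
    for attempt in range(1, MAX_ATTEMPTS + 1):
        fix = ask_llm_for_fix(issue, attempt)
        print(f"attempt {attempt}: {fix}")
        if tests_pass():
            return True
    # Budget exhausted: stop digging and change direction.
    print("Budget spent. Step back: delete and rewrite, or revisit the spec.")
    return False


if __name__ == "__main__":
    debug_with_budget("intermittent failure in the payments component")
```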

The Challenge With Advanced Behaviors

LLMs cannot think like humans, yet. The newer coding environments are clearly implementing some type of default behaviors, via system prompts, memories, and mechanisms unknown to me. All I know is that sometimes I have to wrangle with them.

It is really nice to have a helper that goes ahead and checks current files and does a whole bunch of other things without asking you. Really nice, that is, until the moment you realize the LLM has run riot based on a bad interface implementation it created, or has done something else you did not know it was going to do. It is at times like this that you go into angry parent mode: "Don't do anything without asking me first!!!" That is not sustainable for a parent or a manager. In addition, you box yourself into a corner that bites you later. You have to find a middle ground. I give my LLMs directions like "XYZ is a core source of truth, don't change it without asking me". Guardrails around the important stuff.

There is never any guarantee an LLM won't lose the plot, but you can limit how often it happens and limit the damage when it does: enough control, and enough autonomy, that on net your overall productivity is better.
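Guardrails also work better when they are enforced mechanically rather than merely stated in a prompt. Here is a minimal sketch of that idea: a check that fails whenever a "core source of truth" file shows up in the uncommitted diff, so unsolicited edits get caught before they land. The protected paths are hypothetical, and the sketch assumes you are working in a git repository.

```python
# A minimal sketch: mechanically enforce "don't change it without asking me"
# by failing whenever protected files appear in the uncommitted diff.
# The protected paths are hypothetical; assumes this runs inside a git repo.

import subprocess
import sys

PROTECTED = {
    "specs/api_contract.yaml",   # hypothetical core source of truth
    "db/schema.sql",             # hypothetical: schema changes need sign-off
}


def changed_files() -> set[str]:
    """List files modified in the working tree, according to git."""
    out = subprocess.run(
        ["git", "diff", "--name-only"],
        capture_output=True, text=True, check=True,
    )
    return {line.strip() for line in out.stdout.splitlines() if line.strip()}


def main() -> int:
    touched = changed_files() & PROTECTED
    if touched:
        print("Protected files changed without review:")
        for path in sorted(touched):
            print(f"  {path}")
        return 1  # nonzero exit: block the commit or stop the agent run
    return 0


if __name__ == "__main__":
    sys.exit(main())
```

Run something like this as a pre-commit hook or after each agent session: the prompt-level instruction sets the expectation, and the check catches the times the LLM forgets.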

Why Are You Doing That!!!

Sometimes frustration sets in when an LLM keeps doing something you have told it not to do. In those moments, when you realize you have told it a million times without impact, you have to dig for deeper answers.

When working with humans, it is sometimes helpful to know why someone is acting in a certain way, and the same is true with LLMs. Often, when I ask "why", I get one or more rounds of apologies, to which I respond, "I don't want an apology, I want to know why ..." Those moments can be very revealing. For example, a development environment might differentiate between a memory and a persistent memory, or you might learn something about the default behavior installed in a development environment and how you are fighting it, and, more importantly, how you might override it now that you know about it.

Don't forget to ask "why"; there is much to be learned when you do, whether dealing with humans or LLMs.

Conclusion

The success of LLMs is leading to their being used for increasingly complex requests: not just code snippets anymore, but multiple components working together, perhaps an entire application or system. That means a new learning curve, new dangers, and a new period of uncertainty until we tame the LLM-based environments designed to be "helpful".

The question is not whether LLMs think, but how they think. While the old adage that LLMs are just "next token predictors" may be useful for those engineering foundation models, it is not the experience the rest of us will increasingly be having. We will be experiencing LLMs that automatically do many things in response to one prompt, and that have default behaviors embedded in them.

In the human world, we recognize some people as being good at working with other humans; we refer to them as having high emotional intelligence (EQ). In the new world of coding, I am betting that people with a high LQ (LLM quotient) will be the most productive users of LLM-based development environments.
