AI–Driven Development
"AIDDer" (C) Peter Merel 2024 with a little help from Dall-E 3 & Runway-ML

AI writes software faster, cheaper, and better than humans, but it also hallucinates and misinterprets us. Agile teams remain essential to keep AI on the road even when they no longer drive.

There's no such thing as straight and level. [While] things seem to be going perfectly, you don't take your eyes off the road. You must always be prepared to move a little this way, a little that way. Sometimes a completely different way. -- Kent Beck

Human developers hallucinate too. We just call their hallucinations bugs. That's why Beck's test-first, test-driven, continuous integration became the critical enabler for high-performing Agile development. Without his XP practices, bugs drove the costs of change exponential on Agile projects. With them, these costs went linear. That math was what truly made "Agile" agile.


TDD & BDD linearize the costs of change

The Three Amigos + 1

XP pioneered Test-Driven Development, which turned the traditional Analyze-Design-Build-Test workflow on its head: Analyze-Test-Build-Design. In TDD, Analysis generates executable Tests, developers Build the simplest solution that passes these, then Design by refactoring all their code to form the simplest system that still passes all old and new executable tests. Beck called the rinse-and-repeat cycle of this workflow "Continuous Integration".
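To make the Analyze-Test-Build-Design order concrete, here's a minimal sketch in Python. The shopping-cart example and every name in it (check_cart_total, cart_total_first_pass, cart_total) are hypothetical, invented purely for illustration. The point is only the ordering: the executable test exists first, the first build is deliberately naive, and the refactor has to pass the same test.

```python
# Test: analysis is captured as an executable check before any solution exists.
# Items are (name, unit_price, quantity) tuples; this shape is an assumption.
def check_cart_total(cart_total):
    assert cart_total([]) == 0
    assert cart_total([("tea", 3, 2)]) == 6
    assert cart_total([("tea", 3, 2), ("milk", 2, 1)]) == 8


# Build: the simplest code that passes the check.
def cart_total_first_pass(items):
    total = 0
    for _name, unit_price, quantity in items:
        total += unit_price * quantity
    return total


# Design: refactor to a simpler form, guarded by the very same check.
def cart_total(items):
    return sum(unit_price * quantity for _name, unit_price, quantity in items)


# Both versions must satisfy the test that was written first.
check_cart_total(cart_total_first_pass)
check_cart_total(cart_total)
```

Because the check is written against behavior rather than implementation, the refactor in the Design step can be as aggressive as we like without raising the cost of the next change.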

A decade after XP, Behavior Driven Development was coined by Dan North and popularized in John Ferguson Smart's books. BDD elevated the TDD workflow from code functions to product behaviors. It introduced Gherkin, a simple natural language format that made business specs executable as tests – along with tools to automatically visualize the compliance of the specs with actual system behaviors. BDD also introduced the famous "Three Amigos" workflow where analysts, testers, and developers collaborate on Gherkin acceptance criteria to drive new product behaviors.
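Here's a rough sketch of what "executable business specs" means in practice. The scenario, the discount rule, and the tiny runner are all assumptions made up for this illustration. Real teams would use an established BDD tool rather than this toy, which stands in only to show how plain-language steps bind to code.

```python
import re

# A Gherkin scenario as the amigos might agree on it (hypothetical business rule).
FEATURE = """\
Scenario: Loyal customers get a discount
  Given a customer with 3 previous orders
  When they place an order worth 100 dollars
  Then they are charged 90 dollars
"""

STEPS = []  # (compiled pattern, handler) pairs, filled in by the @step decorator


def step(pattern):
    """Register a handler for every step sentence matching the pattern."""
    def register(handler):
        STEPS.append((re.compile(pattern), handler))
        return handler
    return register


def price_order(amount, previous_orders):
    """Hypothetical solution code under test: 10% off from the third order on."""
    return amount * 90 // 100 if previous_orders >= 3 else amount


@step(r"a customer with (\d+) previous orders")
def given_customer(ctx, orders):
    ctx["previous_orders"] = int(orders)


@step(r"they place an order worth (\d+) dollars")
def when_order(ctx, amount):
    ctx["charged"] = price_order(int(amount), ctx["previous_orders"])


@step(r"they are charged (\d+) dollars")
def then_charged(ctx, expected):
    assert ctx["charged"] == int(expected), ctx


def run_scenario(text):
    """Run each Given/When/Then line through its handler; return steps executed."""
    ctx, executed = {}, 0
    for line in text.splitlines():
        match = re.match(r"\s*(Given|When|Then|And|But)\s+(.*)", line)
        if not match:
            continue  # the Scenario title and blank lines are not steps
        sentence = match.group(2)
        for pattern, handler in STEPS:
            found = pattern.fullmatch(sentence)
            if found:
                handler(ctx, *found.groups())
                executed += 1
                break
        else:
            raise LookupError(f"no step implementation for: {sentence}")
    return executed


run_scenario(FEATURE)  # raises AssertionError if behavior drifts from the spec
```

The scenario text is what analysts and testers read and edit. The step handlers are what developers (or an AI) implement and what humans review.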

Analyst, Tester, Developer, and AI ...

Gherkin makes a natural way to align technical and non-technical people with AI too. We still need the human amigos to review BDD test-step implementations, solution code and integration choices to make sure they meet business intents. But Gherkin enables humans and AI to align on product behavior in ordinary human language, and evolve this alignment iteratively while maintaining linear costs of change.

AIDD Workflows

Three Amigos + 1

Each step of this AIDD workflow involves collaboration between AI and BDD's human amigos. The humans still collaborate with each other too. While every step in the Three Amigos workflow could in theory be generated by an AI agent alone, without humans to reality-test AI outputs and collaborate on product design decisions these loops turn into a hallucinatory positive-feedback generator. Like the howl of a speaker plugged into its own microphone ... Conway's Law gone mad.

But this first picture doesn't cover the really massive potential of AIDD to surpass BDD workflows, because it doesn't consider meta-level collaboration between the human amigos and AI agents. That is to say, each of these humans can have an AI agent of their own, which they constrain with specs on HOW to produce Gherkin specs relevant to their function. We still need the humans in the loops for the same reasons as above, but this way the pairing of amigos with AI agents becomes enormously more productive.

Three Amigos + 4

In later articles in this series we'll see how we can focus development teams on ever-changing throughput constraints through breadth-first, scientific design patterns pioneered by the XSCALE Alliance. These patterns economically derive per-feature acceptance criteria from business constraints by way of the critical Agile/XP principle of YAGNI: You Aren't Gonna Need It. And this is how we break the mad self-reinforcement AI loop - by forcing all the AI agents to fit their work to a growing body of Three Amigos Gherkin constraints.
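One way to picture "forcing the AI agents to fit their work to a growing body of constraints" is as a gate that only admits candidate code passing every spec accumulated so far. The sketch below is an assumption-laden toy: SpecSuite, the shipping-fee rule, and both candidate functions are hypothetical names, and each spec is reduced to a simple predicate rather than a real Gherkin scenario.

```python
from typing import Callable, Dict, List

Candidate = Callable[[int], int]     # e.g. an AI-generated shipping-fee function
Spec = Callable[[Candidate], bool]   # an executable constraint on a candidate


class SpecSuite:
    """A growing body of named constraints that every candidate must satisfy."""

    def __init__(self) -> None:
        self._specs: Dict[str, Spec] = {}

    def add(self, name: str, spec: Spec) -> None:
        self._specs[name] = spec

    def failures(self, candidate: Candidate) -> List[str]:
        """Return the names of all specs the candidate violates."""
        failed = []
        for name, spec in self._specs.items():
            try:
                ok = spec(candidate)
            except Exception:
                ok = False  # a crashing candidate counts as a failure
            if not ok:
                failed.append(name)
        return failed


def hallucinated_fee(total: int) -> int:
    return 0  # plausible-looking output that ships everything for free


def reviewed_fee(total: int) -> int:
    return 0 if total >= 50 else 5


suite = SpecSuite()
suite.add("orders of 50 or more ship free",
          lambda fee: fee(50) == 0 and fee(120) == 0)
early = suite.failures(hallucinated_fee)   # nothing catches the hallucination yet

suite.add("orders under 50 pay a flat fee of 5",
          lambda fee: fee(49) == 5 and fee(1) == 5)
later = suite.failures(hallucinated_fee)   # the new constraint now rejects it
```

The hallucinated candidate slips through while the suite is thin and is rejected once the amigos add the missing constraint, which is why the body of specs has to keep growing under human review.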

AIDD for Legacy Systems

Even when legacies lack adequate documentation and tests, AI can rapidly generate Gherkin regression suites from the history of the legacy codebase, correlated with historical logs of system behavior and with the kanban and email archives that track the human conversations behind them. While actually saying that out loud sounds mind-boggling to us olde-timey developers from the 2010s, such analysis is clearly within the capabilities GPT demonstrates today.

IQ 152, passes the legal bar, gets top marks in medical boards, etc., etc. And does it all in 10 seconds flat.

Where such logs and histories aren't available or reliable, AI can still rapidly and cheaply generate probes into live systems to capture both UI transactions and the resulting database transactions en masse. By correlating these with each other in a RAG LLM it can generate Gherkin specs that express the key invariant business relationships between the behaviors of these probes.
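To give a feel for the correlation step, here's a deliberately simplified sketch. It swaps the RAG LLM for plain set intersection, so it shows only the idea of finding what stays invariant across probes, not how an AI would actually do it. The probe records, table names, and function names are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical probe captures: a UI action and the database effects seen with it.
PROBES = [
    {"ui": "submit_order", "db": {"orders:insert", "stock:update", "audit:insert"}},
    {"ui": "submit_order", "db": {"orders:insert", "stock:update"}},
    {"ui": "cancel_order", "db": {"orders:update", "stock:update", "audit:insert"}},
    {"ui": "cancel_order", "db": {"orders:update", "stock:update"}},
]


def invariant_effects(probes):
    """For each UI action, keep only the DB effects observed in every probe."""
    grouped = defaultdict(list)
    for probe in probes:
        grouped[probe["ui"]].append(set(probe["db"]))
    return {action: set.intersection(*effects)
            for action, effects in grouped.items()}


def to_gherkin(action, effects):
    """Render the invariant effects of one action as a draft Gherkin scenario."""
    lines = [f"Scenario: {action} always has the same database effects",
             f"  When the user performs {action}"]
    for index, effect in enumerate(sorted(effects)):
        table, operation = effect.split(":")
        keyword = "Then" if index == 0 else "And"
        lines.append(f"  {keyword} the {table} table receives an {operation}")
    return "\n".join(lines)


drafts = {action: to_gherkin(action, effects)
          for action, effects in invariant_effects(PROBES).items()}
```

Note that the audit insert drops out of both drafts because it wasn't seen on every probe. Whether that's noise or a real business rule that only fires sometimes is exactly the kind of question the human amigos have to answer.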

Performing such analysis is prohibitively expensive for humans. While we can clearly train AI to do it, the qualities of AI execution for such purposes remain largely untried. In any case, these generated Gherkin specs will require systematic review by our human amigos to ensure they cover all business intents and unhappy-path behaviors without AI hallucinations getting baked into the spec-step implementations Ken-Thompson-style.

Such human reviews will be much more expensive than AI generation of Gherkin regression suites, but still enormously cheaper than continuing to employ humans to maintain legacy codebases. Enabling AIDD to systematically replace these with modern test and solution code effectively linearizes their costs of change too. Indeed this same regression-test automation is applied to commercial AI models themselves to keep their behaviors within ethical and functional guidelines - despite the fact that we have zero documentation or human-readable code for them.

From AIDD to Continuous Alignment

The rightmost step in the AIDD workflows in the diagrams above automates Continuous Integration of the work products of interdependent teams. As this includes our executable but still human-readable Gherkin specs, this seems to offer a way to continuously align these interdependent teams without face-to-face communication between them.

And that's a problem.

While AI-driven alignment of work on the codebase is essential to efficient collaboration, without also aligning the human members of different teams with each other this risks generating huge confusion. We want the humans to make and moderate rational decisions about system behaviors without constantly tripping over each other's different ideas. We can't rely on AI to do this, but mainstream Agile methods can't keep up with the AI flows either.

3 Amigos + 4 as workflow. Right to left flows are AI-generated, left to right are human-moderated.

In upcoming articles in this series we'll explore a range of Agile Alignment methods to solve this problem. Part of the way such continuous alignment can work is extending the shared fabric of Gherkin tests to form executable contracts for workflows between interdependent teams at every level of the organization – root and branch, not just at operational leaves.

And that's why the next article in this series will start by integrating AIDD with Leadership as a Service. If you'd like to help us keep this series of articles honest, come join in at LinkedIn's new AI & Agile Alignment group.

Tom Gilb

Inventor of 'Planguage', Consultant, Methods Inventor, Textbook Writer, Keynote Speaker and Teacher to many international organizations

9 months

OK, Peter, you have at least one big problem with your thinking and that is, you keep treating the problem as though it is a coding problem, rather than a systems and business/organizational problem. This was the major fault of agile manifesto guys, programmer mentality, not a systems/organizational mentality. You MUST move to 'SYSTEMS' & 'MULTIPLE VALUES', OR YOU WILL NOT SOLVE THE PROBLEM. See for example: Value Agile 2020 BOOK: leanpub.com/ValueAgile (PAID) BOOK (FREE) https://tinyurl.com/ValueAgileBook VIDEO https://www.gilb.com/blog/Agile-Tools-for-Value-Delivery-by-Tom-Gilb and see https://tinyurl.com/ValueManagementFolder. See also PRODUCT ENGINEERING: The Use of AI to Improve the Use of the Systems Engineering Method 'Planguage', for development of AI Products and Services. 13th November 2024 AIM in Oslo, 15 Minute talk https://www.dropbox.com/scl/fo/125yvnaxtb418ss90kt28/AMYd5IFE_AtOrTMjjAoAndQ?rlkey=chat6m2kz24fckrygnsj5j99y&dl=0

Mihail S.

Principal Consultant @ Pelsi Group | MBA

1 year

"... without humans to reality-test AI outputs and collaborate on product design decisions these loops turn into a hallucinatory positive-feedback generator." and "We still need the humans in the loops for the same reasons as above, but this way the pairing of amigos with AI agents becomes enormously more productive." These two statements may be signalling danger: as humans are pressed to produce more and more, they will be pressed to transfer more and more responsibilities to the AI in order to increase their productivity. There's a real possibility that someone may take themselves completely out of the "pairing of amigos with AI agents" and leave everything to AI. Here's "hallucinatory positive-feedback generator" in action. Looking forward to your future posts Peter Merel, I'm sure you've got interesting thoughts coming up on this subject. Many thanks!

Masa K. Maeda, Ph.D.

C-Suite Leadership Coach and Mentor with focus on the balance between leadership efficacy, the human factor and enterprise business excellence.

1 year

A challenge for “Agile teams remain essential to keep AI on the road” is that agile teams aren’t necessarily good navigators. Furthermore, the navigators’ navigators (managers and product owners) might not be good navigators either. Meaning that if misinterpretations exist without AI, not knowing how to interact with AI will make the negative impact grow exponentially if we factor hallucinations in. Add to that the fact that there’ll be navigators who will trust the AI blindly and the result will be a set of problems (what Russell Ackoff calls “a mess”).

Wolfram Müller

Founder of the open DolphinUniverse community, helping organizations worldwide leverage expertise to build highly agile, productive teams, resulting in Fun & Flow—our goal: making this knowledge accessible to all.

1 year

AI helps to address a constraint (human lifetime) ... so a very valuable article ... by the way (a little off topic), we have had our own GPT with our own knowledge for nearly a year, and it is programmed not to hallucinate --> https://DolphinGPT.ai ... It is an expert on #Lean, #Agile and #TOC. We use it to write our books: we develop the table of contents and some keywords and details --> the AI writes the text in any language and style we need. The interesting part is that it looks very similar to the workflow you described, and it helped us to get clarity about our content. The fun part was that the AI has "average understanding", so we learned how an average reader understands our thoughts, and we learned a lot about formulating our ideas in a way that not just the AI understands but also the average human.
