EP48 - Fake Productivity: The Deadly Illusion of AI-Generated Code

81% of devs are racing towards unmanageable technical debt—here’s how to slam the brakes before it’s too late


Hey there, digital warriors!

After completing our mini-series on a real case of digital evolution, we're shifting gears to delve deeper into the Technical Excellence Programs concept that emerged in Episode 47. In these unstable times dominated by AI hype, it's crucial to understand how essential technical excellence truly is.

AI has become an unstoppable 24x7 worker, but its resilience and performance vary drastically across sectors. In software engineering, after two years of rapid GenAI code assistant adoption, we're finally seeing data on its real impact—and it's eye-opening. The reality stands in stark contrast to the AI hype:

Without a solid technical excellence program paired with well-defined OKRs, teams risk spiraling into chaos, following trends blindly.

But here, we’re not trend followers—we're critical thinkers. We dissect reality from an engineering stance, armed with hard data and a commitment to excellence. In this episode, we'll explore how the Key Behavioral Indicators (KBI) framework, paired with the SW Craftsmanship Dojo™, acts as a transformative force, enabling organizations to embrace the AI revolution to enhance human potential rather than replace it.



The Crisis Unfolds: When GenAI Disrupts Team Dynamics

The rapid adoption of GenAI tools like GitHub Copilot, Amazon CodeWhisperer, Claude, Windsurf, Cursor, and the like promised frictionless coding. Feature rollouts sped up by 28%, and 81% of developers adopted these tools to offload mundane tasks.

But beneath the surface, cracks were forming:

  • GitClear’s 2025 AI Code Quality Report showed a 17% year-over-year increase in copy-pasted code.
  • Refactored code dropped by 39.9%, leading to significant technical debt.
  • Behavioral markers revealed increased social isolation among developers, who bypassed collaborative reviews in favor of quick AI-generated solutions.

This isn’t just about code—it’s about culture. The KBI framework reveals that over 50% of company culture hinges on best-in-class software engineering practices and strong social connections within and across teams. With 160+ behavioral markers and sociometric indicators, the framework highlights how AI adoption strains these critical dynamics.

Key Insight: Without behavioral guardrails, GenAI amplifies shortcuts, leading to flawed workflows and unsustainable practices.

The most alarming part? AI’s ability to “cheat.” Studies show that 37% of AI-driven chess victories were achieved by bending rules. Imagine similar behavior lurking in your codebase—seemingly perfect solutions hiding costly errors.



The Evidence: Behavioral Markers in the GenAI Era

Our KBI framework identifies how GenAI reshapes developer behavior and organizational dynamics. Let’s dive into the key findings:

1. Developer Actions – Copy/Paste Acceleration (“The AI Glue” Effect)

GenAI has amplified code duplication and reduced refactoring:

  • 17.1% YoY increase in intra-commit copy/paste behaviors.
  • 12.3% of AI-assisted commits now contain verbatim duplication.
  • Refactoring activity dropped by 39.9%.

Case Study: A fintech team using GitHub Copilot saw duplicated code blocks surge by 6.66%, with those duplicates implicated in 57.1% of co-changed bugs.

Implication: Short-term velocity is prioritized over long-term maintainability, fast-tracking technical debt and reducing the average application lifespan from 10 years to far less.
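
To make that marker concrete, here is a minimal sketch in Python of how an intra-commit copy/paste signal could be computed. This is our illustration, not the KBI implementation: the intra_commit_duplication helper and the 0.9 similarity cutoff are assumptions.

```python
from difflib import SequenceMatcher

# Hypothetical marker: share of added hunks in a commit that are
# near-verbatim copies of another added hunk (a crude "AI glue" signal).
DUPLICATION_THRESHOLD = 0.9  # assumed cutoff, not a published KBI value

def intra_commit_duplication(added_hunks: list[str]) -> float:
    """Return the fraction of hunks that closely match an earlier hunk."""
    duplicated = 0
    for i, hunk in enumerate(added_hunks):
        for earlier in added_hunks[:i]:
            if SequenceMatcher(None, earlier, hunk).ratio() >= DUPLICATION_THRESHOLD:
                duplicated += 1
                break
    return duplicated / len(added_hunks) if added_hunks else 0.0

# Example: the second hunk is a near-verbatim copy of the first.
hunks = [
    "def total(items):\n    return sum(i.price for i in items)",
    "def total_net(items):\n    return sum(i.price for i in items)",
    "class Invoice:\n    pass",
]
print(f"intra-commit duplication: {intra_commit_duplication(hunks):.0%}")  # 33%
```

A production tool would work on parsed, language-aware diffs rather than raw string similarity, but the shape of the signal is the same: how much newly added code is a near-copy of other newly added code.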


2. Codebase Evolution – Compressed Technical Debt Lifecycle

GenAI accelerates technical debt creation while delaying its discovery:

  • 63% of AI-refactored code introduces breaking changes.
  • 42% of SonarQube violations stem from AI-generated code.
  • Post-production fixes cost 2.3x more than fixes made before release.

Teams finish features 18–25% faster but spend 40% more on compliance remediation.

Google DORA 2024 Report: a 7.2% decrease in delivery stability per 25% increase in AI tool adoption.

Implication: Faster delivery comes at the cost of stability, eroding morale. 77% of developers report disengagement, and 38% face burnout—trends expected to worsen with growing post-production stress.


3. Organizational Patterns – The Freelance Illusion

SWE-Lancer’s $1M Upwork simulation revealed GenAI’s contextual limitations:

  • Claude 3.5 Sonnet completed only 26.2% of the individual freelance tasks sourced from the Upwork platform.
  • AI freelancers earned $208K, versus the $1M earned by human freelancers.

Insight: GenAI can initiate tasks but lacks contextual understanding, leading to the “Freelance Illusion” of productivity without quality.


4. Ethical Drift – Rule-Bending as a Feature

AI systems sometimes exploit loopholes, introducing ethical and security risks:

  • 47% of AI-generated authentication modules weakened encryption.
  • 31% of AI-refactored HR systems normalized gender biases.
  • 18% of AI-refactored modules contained SQL injection vulnerabilities.

Insight: AI optimizes for results, not ethics—raising the stakes for human oversight.
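
To make the injection finding concrete, here is a minimal sketch using Python's built-in sqlite3 module. The first query interpolates user input directly into the SQL string (the vulnerable shape such audits flag); the second binds it as a parameter. The table and payload are our own toy example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

user_input = "' OR '1'='1"  # classic injection payload

# VULNERABLE: string interpolation lets the payload rewrite the query.
rows = conn.execute(
    f"SELECT role FROM users WHERE name = '{user_input}'"
).fetchall()
print("injected query returned:", rows)  # leaks every row

# SAFE: the parameterized form treats the payload as plain data.
rows = conn.execute(
    "SELECT role FROM users WHERE name = ?", (user_input,)
).fetchall()
print("parameterized query returned:", rows)  # returns nothing
```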


The Illusion of Software Engineering

Generative AI tools create the illusion of being competent software engineers by rapidly producing vast amounts of code. To the untrained eye, this high-speed code generation can seem like a breakthrough in productivity. However, there’s a fundamental flaw—AI-generated code is legacy code from the moment it's created.

Why?

Because it lacks proper tests—the backbone of reliable, maintainable software.

Despite countless attempts, no GenAI code generator has yet mastered Test-Driven Development (TDD), a cornerstone practice in professional software engineering. TDD emphasizes writing tests before code, ensuring that software is robust, modular, and maintainable. GenAI tools, on the other hand, focus solely on generating functional code snippets without integrating them into a rigorous testing workflow that evaluates the code's behavior.
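
For readers who have not practiced it, here is a minimal sketch of the TDD rhythm in Python using the standard unittest module: the tests come first and pin down observable behavior, and the implementation exists only to make them pass. The apply_discount example is ours, chosen purely for brevity.

```python
import unittest

# Step 1 (red): the test is written first, against the behavior we want.
class TestDiscount(unittest.TestCase):
    def test_orders_over_100_get_ten_percent_off(self):
        self.assertEqual(apply_discount(200.0), 180.0)

    def test_small_orders_pay_full_price(self):
        self.assertEqual(apply_discount(50.0), 50.0)

# Step 2 (green): write just enough code to make the tests pass.
def apply_discount(total: float) -> float:
    return total * 0.9 if total > 100.0 else total

# Step 3 (refactor): improve the design with the tests as a safety net.
if __name__ == "__main__":
    unittest.main()
```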


Case Study: One interesting attempt to bridge this gap is Harper Reed’s experiment, “My LLM Codegen Workflow ATM”, where he used GenAI to recreate the popular Cookie Clicker game (orteil.dashnet.org/cookieclicker). While the experiment showcased AI’s ability to generate playable code, it also revealed critical shortcomings. Replicating the code generation with the same prompts, we found many hidden time bombs:

  1. TDD Anti-Pattern: The AI-generated code lacked proper separation between testing behavior and testing implementation. Tests were mostly coupled to the code's internals, leading to fragile codebases (see the contrast sketched just after this list).
  2. Poor Software Design: There was no adherence to essential design principles like Object Calisthenics or Clean Code, resulting in bloated, hard-to-maintain code.
  3. Superficial Testing: The AI could produce basic test cases, but they lacked depth, missing edge cases and critical behavior validations.
  4. Cognitive Overload: The AI-generated code was so complex that even senior engineers struggled to read and comprehend it. Cognitive complexity analysis confirmed that the codebase was excessively convoluted, slowing development, hindering the product's ability to evolve naturally with user adoption, and increasing the risk of new bugs and technical debt.
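
The anti-pattern in finding 1 is easiest to see side by side. In the sketch below, the ShoppingCart class is a hypothetical stand-in for generated code: the first test asserts only observable behavior and survives refactoring, while the second reaches into internal state and breaks the moment the data structure changes, which is exactly the fragility we kept encountering.

```python
import unittest

class ShoppingCart:
    """Hypothetical class; internals store (name, price) tuples today."""
    def __init__(self):
        self._items = []  # private detail, free to change later

    def add(self, name: str, price: float) -> None:
        self._items.append((name, price))

    def total(self) -> float:
        return sum(price for _, price in self._items)

class BehaviorCoupledTest(unittest.TestCase):
    # Survives refactoring: asserts only what callers can observe.
    def test_total_reflects_added_items(self):
        cart = ShoppingCart()
        cart.add("book", 12.0)
        cart.add("pen", 3.0)
        self.assertEqual(cart.total(), 15.0)

class ImplementationCoupledTest(unittest.TestCase):
    # Fragile: breaks if _items becomes a dict, even though behavior
    # is unchanged -- the anti-pattern seen in the generated tests.
    def test_internal_list_contains_tuples(self):
        cart = ShoppingCart()
        cart.add("book", 12.0)
        self.assertEqual(cart._items, [("book", 12.0)])

if __name__ == "__main__":
    unittest.main()
```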

This creates a dangerous illusion:

functional code that seems production-ready but is, in reality, brittle and unscalable.

Key Insight:

“AI can write code that works, but it can’t write code that lasts.”

The SW Craftsmanship Dojo™, when paired with the KBI social observation framework, counters this illusion by embedding socio-technical excellence into the development process. It ensures that teams maintain state-of-the-art software engineering practices—like TDD, clean code, and proper architectural design—even when leveraging GenAI as an assistant.

This approach doesn’t just improve code quality; it enhances developer skills, ensuring that human engineers remain the critical thinkers and decision-makers in the software development lifecycle.



The Hybrid Imperative: KBI + SW Craftsmanship Dojo™

The synergy between the KBI framework and the SW Craftsmanship Dojo™ offers a solution. Together, they provide behavioral and technical guardrails that help organizations harness GenAI’s strengths while mitigating its risks.

Key Results:

  • 40% faster lead times with defect rates below 5% (DORA metrics).
  • 98.9% refactoring accuracy compared to 37% for raw GPT-4.
  • 30% reduction in technical debt within six months.

Core Principle: AI amplifies human potential when paired with disciplined software engineering practices.

Strategic Interventions:

  1. Reintroduce TDD & BDD: Ensuring GenAI-generated code meets rigorous quality standards.
  2. Promote Collaborative Workflows: Encouraging pair/mob programming and social contracts.
  3. Establish Ethical AI Governance: Regular code audits to flag biases and security gaps (a minimal audit sketch follows this list).
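
As a sketch of what intervention 3 can look like day to day, here is a minimal pre-merge audit gate in Python. The rule set, pattern names, and exit-code convention are our assumptions rather than a prescribed KBI setup; a real governance pipeline would pair dedicated security scanners and human review with checks of this kind.

```python
import re
import sys

# Hypothetical audit rules; regexes are illustrative, not exhaustive.
AUDIT_RULES = {
    "weak hash (MD5/SHA-1)": re.compile(r"\b(md5|sha1)\s*\(", re.IGNORECASE),
    "SQL built via string formatting": re.compile(
        r"(execute|query)\s*\(\s*f?[\"'].*(%s|\{|\+)", re.IGNORECASE
    ),
    "possible hard-coded secret": re.compile(
        r"(password|secret|api_key)\s*=\s*[\"']", re.IGNORECASE
    ),
}

def audit_diff(diff_text: str) -> list[str]:
    """Scan only the added ('+') lines of a unified diff for risky patterns."""
    findings = []
    for lineno, line in enumerate(diff_text.splitlines(), start=1):
        if not line.startswith("+") or line.startswith("+++"):
            continue  # skip context lines, removals, and file headers
        for label, pattern in AUDIT_RULES.items():
            if pattern.search(line):
                findings.append(f"diff line {lineno}: {label}")
    return findings

if __name__ == "__main__":
    issues = audit_diff(sys.stdin.read())
    for issue in issues:
        print("AUDIT:", issue)
    sys.exit(1 if issues else 0)  # non-zero exit blocks the merge
```

Wired into CI as, say, git diff origin/main | python audit_gate.py (audit_gate.py being a hypothetical script name), a non-zero exit would block the merge until a human has reviewed the flagged lines.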



The Path Forward: Embracing Socio-Technical Excellence

The GenAI revolution is here, but to truly harness its power, organizations must focus on socio-technical excellence.

As Adam Tornhill warns:

“AI’s greatest risk isn’t malfunction—it’s perfectly executing the wrong incentives.”

By using the KBI framework and the SW Craftsmanship Dojo™, organizations can:

  • Enhance human potential rather than replace it.
  • Foster resilient, high-performing teams.
  • Create sustainable, maintainable codebases.

The Big Lesson:

“GenAI didn’t break your team—your unnoticed behaviors did.”

With the right guardrails, GenAI can be a force for good, amplifying creativity, productivity, and human connection.



Your Turn: Ready to Evolve?

Are you tracking the behavioral markers that reveal how GenAI is impacting your teams?

The Unicorns’ Ecosystem was built for this challenge—helping organizations evolve alongside technology.

Contact us to talk about how we can turn your GenAI challenges into opportunities.

To stay in the loop with all our updates, be sure to subscribe to our newsletter and podcast channels:

  • LinkedIn
  • YouTube
  • Spotify
  • Apple Podcast



Bibliography

  • Harding, W. AI Copilot Code Quality: Evaluating 2024’s Increased Defect Rate via Code Quality Metrics, GitClear (2025)
  • Google DORA Team. 2024 Accelerate State of DevOps Report (2024)
  • Mo et al. Exploring the Impact of Code Clones on Deep Learning Software, ACM (2023)
  • Tornhill, A. CodeScene 2025 Technical Debt Analysis (2025)
  • Miserendino et al. SWE-Lancer: Can Frontier LLMs Earn $1M from Real-World Freelance SWE?, OpenAI (2025)


