How and Why AI is Eating Itself
Artificial intelligence may be on a path to self-destruction.
The problem? AI models are increasingly training on their own output.
The Feedback Loop Dilemma: A Vicious Cycle
Data is the lifeblood of AI learning and evolution. Initially, AI models trained on vast amounts of human-generated data, resulting in accurate and diverse outputs that wowed the world.
However, as AI-generated content, like text, images, and videos, becomes more prevalent, there's an increasing risk that future AI systems will train on synthetic data generated by AI itself rather than the rich and varied human-generated content that initially made these systems effective.
Swallowing Its Own Vomit
This shift towards using AI-generated data creates a feedback loop where an AI’s output from one generation serves as the input for the next.
Think of it like this: imagine a chef who only eats the food they cook and then uses that experience to create new recipes. Over time, without outside influences, the variety and quality of their dishes would diminish.
The same is true in closed cultures.
After all, if you always do what you always did, you’ll always get what you always got.
Similarly, AI-generated data often lacks the richness, variability, and nuance of human-generated data. As AI continues to ingest and train on its own output, it risks losing the quality and diversity that initially made it compelling.
Model Collapse: A Bubble of Shared Consciousness
A particularly alarming outcome of this feedback loop is what researchers call "model collapse". A paper published in Nature highlighted how this leads to a narrower range of AI outputs over time.
As AI systems become less accurate and diverse, they start mirroring a homogenised version of reality, far removed from the real world's complexity.
For instance, a study in which AI was trained to generate handwritten digits initially produced results that closely mimicked human handwriting. However, after several generations of training on its own output, the digits became blurry and indistinct, eventually converging into an unrecognisable shape. This illustrates how an AI model can degrade once it loses access to fresh, diverse, high-quality data.
[Figure: what happens when a model is repeatedly trained on its own output. Research by Ilia Shumailov and colleagues used a dataset of 60,000 handwritten digits. An AI was trained to mimic the digits, then each successive set was made by an AI trained on the previous AI-generated digits. After 20 generations of training, the digits blur and erode; after 30 generations, they converge into a single shape.]
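To make the feedback loop concrete, here is a minimal toy simulation in Python (my own hypothetical sketch, not the method used in the Nature study): each generation fits a crude one-Gaussian model to the previous generation's output and keeps only its most probable samples, mimicking a model that favours the most common answers. The diversity of the data collapses within a handful of generations.

```python
# Toy model-collapse simulation (a hypothetical sketch, not the Nature paper's setup).
import numpy as np

rng = np.random.default_rng(42)

# Generation 0: diverse "human" data - a mixture of two well-separated clusters.
data = np.concatenate([rng.normal(-3.0, 1.0, 5000), rng.normal(3.0, 1.0, 5000)])

for gen in range(1, 11):
    mu, sigma = data.mean(), data.std()        # the "model": a single Gaussian fit
    samples = rng.normal(mu, sigma, 10_000)    # what the model generates
    # Like a model favouring "best practice" answers, keep only outputs near the mean;
    # the next generation trains on nothing but these.
    data = samples[np.abs(samples - mu) < sigma]
    print(f"generation {gen:2d}: std of training data = {data.std():.3f}")

# The two original clusters vanish after a single generation (a one-Gaussian model
# cannot represent them), and the spread of the data then shrinks generation by
# generation - the digits-to-blob degradation in miniature.
```

The specific numbers are not the point; the direction of travel is. Once a model's output becomes its only input, variety can only be lost, never gained.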
In creative fields like copywriting, this effect is particularly concerning.
In everyday use, AI typically responds to prompts with an average of the most common answers, often considered ‘best practice’.
Today’s generation of marketers, fearful for their job security, understandably sees best-practice shortcuts as defenders of their mortgage and rent payments.
However, ‘best practices’ can quickly degenerate into ‘Lazy Practice’ (LP): uninspired, repetitive responses that lack originality. As more LP content is produced, it further contaminates the data pool, leading to ever denser, gummier LP content, a vicious cycle that stifles creativity and innovation.
The Tyranny of Output: From Your Workplace to Your GP
The current obsession with output over quality is everywhere. Take university administrators: to secure funding, student attendance can take priority over the quality of learning. Likewise, a high ranking for student experience can drive applications, so measuring fun becomes at least as important as the quality of teaching.
In healthcare, meeting the target number of appointments can take precedence over the effectiveness of treatments.
Goodhart's Law epitomises this approach:
"When a measure becomes a target, it ceases to be a good measure."
When specific narrow metrics are used to evaluate performance, people and systems change their behaviour to improve those metrics, often at the expense of broader goals or quality.
The same dynamic is at play with AI. In industries that rely heavily on AI for generating creative outputs at the speed of a fast-food burger, AI could start producing increasingly uninspired and repetitive content because it has lost touch with the diverse and vibrant human input that initially fuelled its creativity.
This is why professional prompt engineering is crucial; it’s not just about generating large volumes of content but ensuring its quality and originality.
Practical Implications
AI model degradation is not just a theoretical concern; it has real-world implications across industries, including healthcare and finance.
Imagine a medical-advice chatbot that overlooks potential diagnoses because it was trained on a limited range of medical knowledge derived from earlier chatbots. Or consider an AI history tutor that absorbs AI-generated propaganda, losing the ability to distinguish between fact and fiction.
Far-fetched?
As AI models increasingly train on their own outputs, they narrow their focus, leading to more predictable and less varied results. In critical fields, this could have severe consequences.
In finance, for example, AI systems used for trading and risk assessment might start making increasingly homogeneous decisions, amplifying market risks rather than mitigating them.
In education, AI-driven personalised learning tools could begin offering less effective and more generic advice, failing to meet the unique needs of individual students.
The Final Days of Diversity and Quality
As AI systems become more self-reliant, expect a decline in diversity and quality. One contributing factor is strict AI regulation, particularly around sensitive or potentially offensive content.
Ask an AI model a question a cautious regulator deemed potentially problematic, and you’ll likely receive a polite but resolute refusal to answer. While this might be necessary to avoid harmful or unethical outcomes, it also has the unintended consequence of narrowing the AI’s range of responses.
Regulatory caution, combined with the over-reliance on AI-generated data, forces models to revisit the same datasets repeatedly, reinforcing existing biases and limiting creativity.
No wonder working with AI in such a constrained environment can begin to feel like living under the watchful eyes of Big Brother: safe but stifling.
If the data AI is learning from is increasingly narrow or biased, models will amplify those biases, creating a feedback loop that further entrenches stereotypes and inaccuracies.
Spotting and Mitigating AI Self-Destruction
As AI evolves, its outputs become more sophisticated and harder to distinguish from human-generated content. This makes it increasingly difficult to ensure AI models are trained on high-quality, original data rather than their own recycled outputs.
For example, university plagiarism checkers struggle, especially in subjects like accountancy, where exams rely on fixed, predictable answers. As AI-generated content becomes more prevalent, distinguishing original from synthetic content becomes a growing challenge.
Efforts to develop detection tools, such as AI watermarking, are underway, yet they are far from foolproof. Watermarking is easily subverted.
Meanwhile, more broadly, the use of synthetic data is on the rise, particularly as more mavericks turn to AI to generate everything from online learning programmes to self-published books.
AI Suicide Prevention
To prevent AI from spiralling into self-destruction, it’s time to prioritise incorporating original, high-quality, diverse, and human-generated data into AI training processes. That requires encouraging independent critical and creative thinking.
Curated datasets need careful vetting. Otherwise, the future may be one in which the masses are spoon-fed artificial intelligence while the elite alone are nurtured by natural intelligence in the arts, creativity, and sciences.
By ensuring AI is trained on a wide range of data from different and original sources, we can maintain the diversity and quality of AI outputs, even as the volume of AI-generated content grows.
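As an illustration of that mitigation (again only a toy sketch of my own, with an assumed 20% human-data share rather than any recommended figure), extending the earlier simulation so that every generation's training set retains some fresh human-generated data stops the collapse:

```python
# Toy mitigation sketch: blend fresh human data into every generation (hypothetical figures).
import numpy as np

rng = np.random.default_rng(42)

def human_data(n):
    # The diverse "human" source: a mixture of two well-separated clusters.
    return np.concatenate([rng.normal(-3.0, 1.0, n // 2), rng.normal(3.0, 1.0, n // 2)])

HUMAN_SHARE = 0.2          # assumption: 20% of each generation's training data stays human-made
data = human_data(10_000)

for gen in range(1, 11):
    mu, sigma = data.mean(), data.std()                # the "model": a single Gaussian fit
    samples = rng.normal(mu, sigma, 10_000)
    synthetic = samples[np.abs(samples - mu) < sigma]  # the model's most probable outputs
    n_human = int(len(synthetic) * HUMAN_SHARE / (1 - HUMAN_SHARE))
    data = np.concatenate([synthetic, human_data(n_human)])
    print(f"generation {gen:2d}: std of training data = {data.std():.3f}")

# Instead of shrinking towards zero, the spread of the training data settles at a
# stable level, because the fresh human examples keep reintroducing the variety
# the model filters out.
```

The same principle applies at scale: the exact ratio matters less than the discipline of re-anchoring every training cycle in original human material.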
The Stench of Success
AI is on course to eventually ‘eat itself’, leaving behind a legacy of homogenised, uninspired, and ultimately ineffective systems.
Left unchecked, AI's self-destruction will not only be a loss for technological progress but will likely lead to a broader societal decline in creativity and originality.
The irony is that this decline might be celebrated as a triumph of efficiency and progress with banners (generated of course by AI) reading:
“All Hail the Heavy Lifting Miracle”
... a symbol of success in a world driven by output over quality.
And whilst the vapid stench of such a future will be overwhelming, it will be embraced as the new norm, with AI systems producing vast quantities of content that, while meeting quantitative targets, fails to add meaningful value.
In today's tough world, 'just above average' is not enough. The benchmark needs to (and can) rise.
The future of AI hinges on our ability to recognise and mitigate the risks of self-destruction. By investing in professional training such as prompt engineering, in human creativity and critical thinking, and by ensuring that AI models continue to learn from diverse, high-quality data rich in original, genuinely human insights, we can safeguard AI's diversity and effectiveness for future generations.
If your organisation is interested in cost-effective tailored AI consultancy, please get in touch for a no-obligation discussion.