Coming soon: stagnation and collapse of AI?
Andreas Amrein · PhD MBA MSc
Global SVP and MD ★ Pharma / Biotech / MedTech ★ Expansions ★ Transformations ★ Acquisitions ★ Partnerships
Have we all been exposed to or directly used Artificial Intelligence in our jobs? After years of experiencing its growing influence, we are now beginning to notice a significant limitation—one that might call everything into question.
The core issue: AI's dependency on real data
AI relies on vast amounts of real-world text and images to improve, but human-generated data is finite. AI companies are now scrambling to find a solution.
What users are observing
Consider a striking example: When AI generates images of hands, the result is sometimes bizarre. The number of fingers might be wrong, or the wrists appear unnaturally bent. The reason? AI only learns from examples; it doesn’t "know" how many fingers a person has. It simply processes the images it's been trained on, and hands—especially in diverse, nuanced positions—are underrepresented in online datasets. Faces, on the other hand, are more frequently depicted, so AI is able to make them appear more accurate.
...but it’s not just hands...
You could replace "hands" with any other infrequently represented concept on the internet. The AI’s limitations extend to anything with sparse or inconsistent data, often resulting in factual inaccuracies or complete inventions.
The proposed solution
Some researchers believe that these so-called hallucinations can be reduced by increasing the amount of training data. After all, AI's recent advancements have largely been the result of massive training sets, not groundbreaking new techniques. The exponential growth in data used to train language models is undeniable.
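The argument can be made concrete with a toy calculation. Here is a minimal sketch, assuming a Chinchilla-style power law in which loss falls as the training set grows; the constants are illustrative assumptions, not fitted values from any published model:

```python
# Toy illustration of the "more data" argument: under an assumed
# power law, loss = irreducible + B / D**beta, each extra order of
# magnitude of data buys a smaller improvement. The constants below
# are illustrative assumptions, not fitted values.

def loss(d_tokens: float, irreducible: float = 1.7,
         b: float = 2.0e3, beta: float = 0.28) -> float:
    """Assumed power-law loss as a function of training tokens."""
    return irreducible + b / d_tokens ** beta

for d in [1e9, 1e10, 1e11, 1e12, 1e13]:
    print(f"{d:.0e} tokens -> loss {loss(d):.2f}")
```

Each tenfold increase in data buys a smaller improvement, which is exactly why the appetite for ever more data keeps growing.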
However...
AI has already combed through most of the internet. The majority of publicly available data, from Wikipedia to online forums to digitized books, has already been processed, revealing a hard ceiling. Moreover, as AI continues to generate content, AI-produced data is being mixed back into the web, contaminating future training sets in a recursive loop.
The "Photocopy Effect": AI learning from AI
Nicolas Papernot from the University of Toronto, along with other researchers, has studied the effects of AI learning from AI-generated content. He compares it to making photocopies of photocopies—each copy is less accurate than the last. For instance, if an AI is trained to generate cat images based on 100 photos, 90 of which depict yellow cats and only 10 show blue cats, AI will likely make the blue cats appear more yellow. If another AI model is trained on those generated images, the "blue" cats might eventually disappear entirely.
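The photocopy analogy is easy to simulate. Below is a minimal sketch, assuming a toy "model" that simply resamples from its training set: starting from 90 yellow and 10 blue cats, each generation trains on the previous generation's output. The numbers and the resampling model are illustrative assumptions, not Papernot's actual experiment.

```python
import random

random.seed(0)

# Toy model collapse: each "generation" is trained on samples drawn
# from the previous generation's output. Resampling with replacement
# tends to lose rare modes over time (a photocopy of a photocopy).
data = ["yellow"] * 90 + ["blue"] * 10  # generation 0: real data

for gen in range(1, 21):
    # The "model" here is just the empirical distribution of its
    # training set; generating = sampling with replacement.
    data = [random.choice(data) for _ in range(100)]
    blue = data.count("blue")
    print(f"generation {gen:2d}: {blue}% blue cats")
    if blue == 0:
        print("blue cats have vanished -> model collapse")
        break
```

Run repeatedly, the minority color random-walks toward extinction, and once it is gone it never comes back. That is the essence of the photocopy effect.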
This point, termed "model collapse," marks the moment when AI's creations no longer resemble reality. The consequences are far-reaching: what happens when AI loses the "blue cats"? Applied to data about people, we risk erasing important details, introducing bias and marginalizing minority groups.
Is there hope? Let's look at games...
Interestingly, AI has shown remarkable progress in some areas. For instance, DeepMind's AlphaGo learned the game of Go by playing millions of games against itself, achieving groundbreaking success. In 2016, AlphaGo played a move that no human player had ever seen before, stunning Go experts. This success demonstrates the potential of synthetic data, at least in domains with clear rules.
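Self-play works because the rules of the game supply both unlimited fresh positions and an unambiguous success signal. Here is a minimal sketch using tic-tac-toe instead of Go for brevity: random self-play generates labeled (position, outcome) training pairs without any human data. This is a deliberate simplification; AlphaGo's actual pipeline combined self-play with deep networks and tree search.

```python
import random

random.seed(1)

WINS = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
        (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 'X' or 'O' if either has three in a row, else None."""
    for a, b, c in WINS:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None

def self_play_game():
    """Play one game of random self-play; return (positions, result)."""
    board, positions, player = [" "] * 9, [], "X"
    while True:
        positions.append("".join(board))
        moves = [i for i, cell in enumerate(board) if cell == " "]
        if not moves:
            return positions, "draw"
        board[random.choice(moves)] = player
        if winner(board):
            return positions, player
        player = "O" if player == "X" else "X"

# The rules act as a free oracle: every self-play game yields labeled
# (position, outcome) pairs, so training data can grow without limit
# and without any human-generated examples.
dataset = []
for _ in range(1000):
    positions, result = self_play_game()
    dataset.extend((pos, result) for pos in positions)

print(f"{len(dataset)} labeled positions from 1000 self-play games")
```

The key design point: the rules of Go (or tic-tac-toe) act as a free, infallible labeler, which is exactly what open-ended language and image data lack.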
But there are limitations of synthetic data
While games like Go have defined rules, language and images are vastly more complex. They lack clear metrics for success, and without rules, generating useful synthetic data becomes nearly impossible. Today's language systems therefore rely entirely on examples rather than rules, which compounds the data problem.
Hence, AI companies’ desperate search for data
The scarcity of training data is forcing AI companies to seek alternative, often questionable, sources. For example, Meta has clashed with EU authorities over its intention to use users' posts and images to train AI. In other regions without strict data protection laws, it’s already doing so. According to a New York Times investigation, OpenAI has likely transcribed vast amounts of YouTube videos—potentially illegally—to train GPT-4. Google has also adjusted its terms of use, possibly allowing it to harvest data from restaurant reviews and public Google Docs.
Companies are pulling data from every available source because time is running out. Epoch AI estimates that by 2028, human-generated public content will no longer be enough to train better AI models.
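That projection reduces to simple arithmetic. Here is a hedged back-of-the-envelope sketch, assuming a fixed stock of usable public text and training sets that grow by a constant factor each year; both figures are rough illustrative assumptions in the spirit of Epoch AI's estimates, not their published model.

```python
# Back-of-the-envelope version of the "running out of data" estimate.
# Both constants are rough illustrative assumptions, not Epoch AI's
# actual figures or methodology.
STOCK = 300e12   # assumed stock of usable public human text, in tokens
size = 15e12     # assumed size of the largest 2024 training set
growth = 2.5     # assumed yearly growth factor of training sets

year = 2024
while size < STOCK:
    year += 1
    size *= growth

print(f"Under these assumptions, training sets outgrow the stock around {year}.")
```

Change the assumed constants and the year shifts, but with exponential growth running against a fixed stock, the wall arrives within a few years either way.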
The ripple effect on blogs and media
Generative AI is already changing the content landscape on the internet. As more users turn to AI chatbots instead of traditional browsing, websites that rely on ad revenue from clicks are feeling the pressure. This impacts not only online magazines but also forums like Stack Overflow, where users once exchanged programming advice. With AI assistants now providing answers, these forums are losing their traffic—and another valuable source of human-generated training material disappears. Similar examples abound.
What comes next?
For AI to continue evolving, new approaches are needed. Innovations in how AI learns, and in how it extracts more value from existing data, will be crucial. As we navigate this complex landscape, it's important to remember that at the heart of it all is us, the humans. Let's keep in mind a quote attributed to Albert Einstein:
“It has become appallingly obvious that our technology has exceeded our humanity.”
The future of AI remains uncertain, but one thing is clear: it's going to stay interesting, and we have one more reason to keep creating human content, real data, instead of just consuming it.