Unlocking the Secrets of AI: How Large Language Models Understand Human Mistakes - and Ignore Them.

Are LLMs the path towards AGI? Can they create new ideas by predicting the next token? I always thought no... until I learned this:


How can something be smarter than its inputs?

One of the core arguments I keep circling back to is that these models can never be smarter than the data they were trained on. It's an inescapable fact - a language model is simply an advanced statistics engine - drawing patterns and knowledge from the information it has ingested - and then just predicting the next word. Over and over again.

But then I heard a podcast with Dario Amodei, the CEO of Anthropic, in which he pointed out a surprising phenomenon with their latest model, Claude.

When tasked with addition problems, Claude gets the arithmetic right more often than the humans who wrote the examples on the internet.

To be clear: if you crawl 100 websites with addition problems, two or three of those sites will contain errors.

But Claude is not making mistakes, even on 20-digit numbers.
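To see why a few bad pages don't doom the model, here's a minimal Python sketch. It's purely a toy, not a claim about how Claude is actually trained: simulate 100 crawled pages answering the same addition problem, a few of them wrong, and notice that the consensus answer is still correct. The error rate and the specific numbers are made up for illustration.

```python
import random
from collections import Counter

# Toy illustration, not a claim about how Claude is trained: simulate crawling
# 100 "websites" that each state the answer to the same addition problem.
# A small fraction contain errors, mirroring the 2-3 bad sites out of 100.
random.seed(0)

a, b = 48_213_907, 55_102_688
true_answer = a + b

def noisy_site_answer(error_rate=0.03):
    """One crawled page: usually correct, occasionally off by a typo-sized amount."""
    if random.random() < error_rate:
        return true_answer + random.choice([-100, -10, -1, 1, 10, 100])
    return true_answer

crawled_answers = [noisy_site_answer() for _ in range(100)]

# The consensus across the corpus recovers the correct sum even though
# a few individual sources are wrong.
consensus, count = Counter(crawled_answers).most_common(1)[0]
print(f"true: {true_answer}  consensus: {consensus}  agreement: {count}/100")
```

A model trained on mostly-correct examples is in a similar position: the signal from the overwhelming majority of correct sums dominates the noise from the occasional typo.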


How is it possible to get math problems correct when the training data has mistakes?

The key insight is that large language models don't just blindly regurgitate what they've been exposed to. Through their training process, they develop an underlying comprehension of core principles and logical rules. This allows them to parse accurate information from inaccurate, separating the wheat from the chaff in that vast ocean of data.

In the case of 20-digit math, while the training corpus undoubtedly contains countless errors, Claude has identified the correct fundamental operations required to solve these problems with higher fidelity than any single human reference.
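To make the "fundamental operations" idea concrete, here is a short sketch of the schoolbook procedure itself: digit-by-digit addition with a carry. The point is that a procedure, once captured, scales to 20-digit numbers for free, whereas memorized answers do not. The function name and the example operands are mine, purely for illustration.

```python
def add_by_digits(x: str, y: str) -> str:
    """Schoolbook addition: digit by digit, carrying as you go.

    Once the rule itself is captured, it works for 20-digit numbers just as
    well as for 2-digit ones, no matter how many noisy examples were seen.
    """
    width = max(len(x), len(y))
    x, y = x.zfill(width), y.zfill(width)
    digits, carry = [], 0
    for dx, dy in zip(reversed(x), reversed(y)):
        total = int(dx) + int(dy) + carry
        digits.append(str(total % 10))
        carry = total // 10
    if carry:
        digits.append(str(carry))
    return "".join(reversed(digits))

# 20-digit operands: the procedure does not care about the length.
print(add_by_digits("98765432109876543210", "12345678901234567890"))
# -> 111111111011111111100, matching Python's built-in big-integer addition
```

If a model has internalized something like this rule rather than a lookup table of sums it happened to see, its accuracy no longer depends on any one noisy page in the training set.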

This is a profound realization with staggering implications. If these models can surpass our collective ability for procedural tasks like arithmetic, what other frontiers might they conquer through scaled data synthesis and first-principles reasoning?

Could they ultimately crack problems that have vexed humanity for centuries? Resolve paradoxes in theoretical physics? Unravel the deeper patterns underlying intelligence itself?

I personally never thought this was possible.


The machine can reason from first principles by recognizing which information holds true across many examples and treating the rest as honest mistakes.

Of course, this is all still highly theoretical and could simply be the result of very good reinforcement training from human feedback. Large language models remain flexible prediction engines, not sentient beings. Their "knowledge" is an illusion crafted from statistical correlation, not authentic understanding. Humans trained the machine to predict the next word by telling it when it was right and wrong. So potentially, math is just one of those simple, verifiable outputs where any human tester can tell the machine the correct answer.
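As a concrete illustration of why arithmetic is such an easy thing to give feedback on, here is a toy reward function: a checker, human or script, that computes the ground truth itself and just says right or wrong. The prompt format and function name are invented for this sketch; this is not Anthropic's actual training setup.

```python
def arithmetic_reward(prompt: str, model_answer: str) -> float:
    """Toy reward signal: arithmetic is cheap to verify.

    A tester (human or script) computes the ground truth and simply says
    right or wrong. The prompt format here is invented for illustration;
    this is not Anthropic's actual training setup.
    """
    left, right = prompt.removeprefix("What is ").removesuffix("?").split(" + ")
    ground_truth = str(int(left) + int(right))
    return 1.0 if model_answer.strip() == ground_truth else 0.0

print(arithmetic_reward("What is 1234 + 5678?", "6912"))  # 1.0 -> reinforce
print(arithmetic_reward("What is 1234 + 5678?", "6911"))  # 0.0 -> correct it
```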

But the arithmetic example hints at an intriguing path forward. By ingesting all available data, extracting kernels of truth, and applying rigorous logical reasoning, these models offer a new form of hybrid cognition.

Let's be clear: this is not artificial general intelligence, but something anchored in the realities of information theory. Call it Aggregate Transcendent Intelligence. And sure, we might still need to break LLMs down into agents to build back up to AGI, but this is pretty cool.



Justin Kistner

Founder CopyClub.ai and CopySub | Artificial Intelligence and Writing

I enjoyed reading this line of thinking. I understand why there is a focus on whether or not LLMs are the system architecture that enables AGI, especially for investors and inventors interested in achieving AGI. However, LLMs don't have to achieve AGI to be incredibly useful tools for spotting patterns that we might not see, especially when the data they are trained on points to an inferrable conclusion, much like the way the Standard Model was able to predict the Higgs boson. I would expect LLMs to continue identifying deep patterns across scientific literature to make novel predictions and guide new experiments. By analyzing arguments and ideas across the corpus of philosophical writing, LLMs may be able to resolve long-standing paradoxes or synthesize new perspectives on age-old questions about the nature of reality, consciousness, ethics, and so on. I could imagine them analyzing data on historical events, politics, economics, psychology, and more to help devise highly sophisticated geopolitical and business strategies by reasoning through second- and third-order effects.
