Unlocking the Secrets of AI: How Large Language Models Understand Human Mistakes - and Ignore Them.

Are LLMs the path towards AGI? Can they create new ideas by predicting the next token? I always thought no... until I learned this:


How can something be smarter than its inputs?

One of the core arguments I keep circling back to is that these models can never be smarter than the data they were trained on. It's an inescapable fact - a language model is simply an advanced statistics engine - drawing patterns and knowledge from the information it has ingested - and then just predicting the next word. Over and over again.

But then I heard a podcast with Dario Amodei, the CEO of Anthropic, in which he pointed out a surprising phenomenon with their latest model, Claude.

When tasked with addition problems, Claude gets the arithmetic right more often than the humans who wrote the examples on the internet.

To be clear: if you crawl 100 websites with addition problems, two or three of those sites will contain errors.

But Claude is not making mistakes, even on 20-digit numbers.
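To see why a few bad pages don't doom the model, here's a minimal Python sketch. It's purely a toy, not a claim about how Claude is actually trained: simulate 100 crawled pages answering the same addition problem, a few of them wrong, and notice that the consensus answer is still correct. The error rate and the specific numbers are made up for illustration.

```python
import random
from collections import Counter

# Toy illustration, not a claim about how Claude is trained: simulate crawling
# 100 "websites" that each state the answer to the same addition problem.
# A small fraction contain errors, mirroring the 2-3 bad sites out of 100.
random.seed(0)

a, b = 48_213_907, 55_102_688
true_answer = a + b

def noisy_site_answer(error_rate=0.03):
    """One crawled page: usually correct, occasionally off by a typo-sized amount."""
    if random.random() < error_rate:
        return true_answer + random.choice([-100, -10, -1, 1, 10, 100])
    return true_answer

crawled_answers = [noisy_site_answer() for _ in range(100)]

# The consensus across the corpus recovers the correct sum even though
# a few individual sources are wrong.
consensus, count = Counter(crawled_answers).most_common(1)[0]
print(f"true: {true_answer}  consensus: {consensus}  agreement: {count}/100")
```

A model trained on mostly-correct examples is in a similar position: the signal from the overwhelming majority of correct sums dominates the noise from the occasional typo.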


How is it possible to get math problems correct when the training data has mistakes?

The key insight is that large language models don't just blindly regurgitate what they've been exposed to. Through their training process, they develop an underlying comprehension of core principles and logical rules. This allows them to parse accurate information from inaccurate, separating the wheat from the chaff in that vast ocean of data.

In the case of 20-digit math, while the training corpus undoubtedly contains countless errors, Claude has identified the correct fundamental operations required to solve these problems with higher fidelity than any single human reference.
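To make the "fundamental operations" idea concrete, here is a short sketch of the schoolbook procedure itself: digit-by-digit addition with a carry. The point is that a procedure, once captured, scales to 20-digit numbers for free, whereas memorized answers do not. The function name and the example operands are mine, purely for illustration.

```python
def add_by_digits(x: str, y: str) -> str:
    """Schoolbook addition: digit by digit, carrying as you go.

    Once the rule itself is captured, it works for 20-digit numbers just as
    well as for 2-digit ones, no matter how many noisy examples were seen.
    """
    width = max(len(x), len(y))
    x, y = x.zfill(width), y.zfill(width)
    digits, carry = [], 0
    for dx, dy in zip(reversed(x), reversed(y)):
        total = int(dx) + int(dy) + carry
        digits.append(str(total % 10))
        carry = total // 10
    if carry:
        digits.append(str(carry))
    return "".join(reversed(digits))

# 20-digit operands: the procedure does not care about the length.
print(add_by_digits("98765432109876543210", "12345678901234567890"))
# -> 111111111011111111100, matching Python's built-in big-integer addition
```

If a model has internalized something like this rule rather than a lookup table of sums it happened to see, its accuracy no longer depends on any one noisy page in the training set.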

This is a profound realization with staggering implications. If these models can surpass our collective ability for procedural tasks like arithmetic, what other frontiers might they conquer through scaled data synthesis and first-principles reasoning?

Could they ultimately crack problems that have vexed humanity for centuries? Resolve paradoxes in theoretical physics? Unravel the deeper patterns underlying intelligence itself?

I personally never thought this was possible.


The machine can reason from first principles by recognizing which information holds true across many examples and treating the rest as honest mistakes.

Of course, this is all still highly theoretical and could simply be the result of very good reinforcement training from human feedback. Large language models remain flexible prediction engines, not sentient beings. Their "knowledge" is an illusion crafted from statistical correlation, not authentic understanding. Humans trained the machine to predict the next word by telling it when it was right and wrong. So potentially, math is just one of those simple, verifiable outputs where any human tester can tell the machine the correct answer.
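As a concrete illustration of why arithmetic is such an easy thing to give feedback on, here is a toy reward function: a checker, human or script, that computes the ground truth itself and just says right or wrong. The prompt format and function name are invented for this sketch; this is not Anthropic's actual training setup.

```python
def arithmetic_reward(prompt: str, model_answer: str) -> float:
    """Toy reward signal: arithmetic is cheap to verify.

    A tester (human or script) computes the ground truth and simply says
    right or wrong. The prompt format here is invented for illustration;
    this is not Anthropic's actual training setup.
    """
    left, right = prompt.removeprefix("What is ").removesuffix("?").split(" + ")
    ground_truth = str(int(left) + int(right))
    return 1.0 if model_answer.strip() == ground_truth else 0.0

print(arithmetic_reward("What is 1234 + 5678?", "6912"))  # 1.0 -> reinforce
print(arithmetic_reward("What is 1234 + 5678?", "6911"))  # 0.0 -> correct it
```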

But the arithmetic example hints at an intriguing path forward. By ingesting all available data, extracting kernels of truth, and applying rigorous logical reasoning, these models offer a new form of hybrid cognition.

Let's be clear: this is not artificial general intelligence, but something anchored in the realities of information theory. Call it Aggregate Transcendent Intelligence. And sure, we might still need to break LLMs down into agents to build back up to AGI, but this is pretty cool.



Justin Kistner

Founder CopyClub.ai and CopySub | Artificial Intelligence and Writing

I enjoyed reading this line of thinking. I understand why there is a focus on whether or not LLMs are the system architecture that enables AGI, especially for investors and inventors interested in achieving AGI. However, LLMs don't have to achieve AGI to be incredibly useful tools for spotting patterns that we might not see, especially when the data they are trained on points to an inferrable conclusion, much like the way the Standard Model was able to predict the Higgs boson. I would expect LLMs to continue identifying deep patterns across scientific literature to make novel predictions and guide new experiments. By analyzing arguments and ideas across the corpus of philosophical writing, LLMs may be able to resolve long-standing paradoxes or synthesize new perspectives on age-old questions about the nature of reality, consciousness, ethics, and so on. I could imagine them analyzing data on historical events, politics, economics, psychology, and more to help devise highly sophisticated geopolitical and business strategies by reasoning through second- and third-order effects.
