There is no such thing as a Trained LLM
What I mean is that traditional LLMs are trained on tasks irrelevant to what they will do for the user. It is like training a plane to operate efficiently on the runway, but not to fly. In short, it is almost impossible to train an LLM properly, and evaluating one is just as challenging. What's more, training is not even necessary. In this article, I dive into all these topics.
Training LLMs for the wrong tasks
Since the early days with BERT, training an LLM has typically consisted of predicting the next token in a sentence, or masking some tokens and having the algorithm fill in the blanks. You optimize the underlying deep neural networks to perform these supervised learning tasks as well as possible. Typically, this involves growing the training set to billions or trillions of tokens, increasing the cost and time to train. Recently, however, there has been a tendency to work with smaller datasets, by distilling the input sources and token lists. After all, out of one trillion tokens, 99% are noise and do not contribute to improving results for the end user; they may even contribute to hallucinations. Keep in mind that human beings have a vocabulary of about 30,000 keywords, and that the number of potential standardized prompts on a specialized corpus (and thus the number of potential answers) is less than a million.
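To make these two training objectives concrete, below is a minimal PyTorch sketch of both losses: next-token prediction and masked-token "fill in the blanks". The vocabulary size, model dimensions, mask token id, and the random token sequence are all hypothetical stand-ins; production LLMs optimize the same cross-entropy objectives, just over a vastly larger corpus and a much deeper network.

```python
import torch
import torch.nn as nn

# Hypothetical toy setup: a tiny vocabulary and a random token sequence stand
# in for a real tokenized corpus of billions or trillions of tokens.
vocab_size, d_model, seq_len = 1000, 64, 32
tokens = torch.randint(0, vocab_size, (1, seq_len))   # (batch=1, seq_len)

# Minimal stand-in for the underlying deep neural network:
# embedding + one Transformer encoder layer + output head.
embed = nn.Embedding(vocab_size, d_model)
encoder = nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True)
head = nn.Linear(d_model, vocab_size)

# Objective 1: next-token prediction (GPT-style).
# Input is the sequence minus its last token; the target is the sequence
# shifted left by one. A causal mask keeps each position from seeing the future.
inp = tokens[:, :-1]
causal_mask = nn.Transformer.generate_square_subsequent_mask(inp.size(1))
logits = head(encoder(embed(inp), src_mask=causal_mask))
next_token_loss = nn.functional.cross_entropy(
    logits.reshape(-1, vocab_size), tokens[:, 1:].reshape(-1)
)

# Objective 2: masked-token prediction (BERT-style "fill in the blanks").
MASK_ID = 0                                           # hypothetical mask token id
mask = torch.rand(tokens.shape) < 0.15                # mask roughly 15% of positions
mask[0, 0] = True                                     # guarantee at least one masked position
masked = tokens.clone()
masked[mask] = MASK_ID
logits = head(encoder(embed(masked)))
mlm_loss = nn.functional.cross_entropy(logits[mask], tokens[mask])

print(f"next-token loss: {next_token_loss.item():.3f}, masked loss: {mlm_loss.item():.3f}")
```

Both losses reward the model for reproducing the statistics of the training corpus, not for answering a user's question well, which is the gap this article is about.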
Read the full article here; it also covers issues with evaluation metrics and the benefits of untrained LLMs.
To learn more about LLM 2.0, its radically different approach, and its next-gen features, see my AI research papers and books here. There you can sign up for my free newsletter and discover the newest LLM advances. For instance, I am currently working on a public Nvidia corpus consisting of financial reports (PDFs), with new technology that produces multimodal agentic contextual chunks, retrieves information that standard Python libraries cannot detect, and introduces the concept of a multi-index. I will write about it next week; the code and data are already on GitHub and fully tested.
Last but not least, see my large GitHub repository here with corporate LLM use cases, as well as an LLM applied to the entire Wolfram corpus, delivering better answers (at least for professional users) much faster and with no training.