Where does the AI get its knowledge from?

The question my customers ask me most often: where does the AI* (Artificial Intelligence) get its knowledge from?

I always end up giving the same answer: an LLM (Large Language Model) is a file that you load into software running on a computer, nothing more. A minimal usable AI model is roughly a 5 GB file, about the size of a movie on a DVD. But if you want a top-performing model, the files get much larger (and are not necessarily available to download).

It doesn't need to search the Internet to answer: its knowledge is already compressed into the file.

To clarify, let me put my viewpoint in writing. I hope it will help make things clear and actionable.

What is an LLM?

LLMs are computer programs designed to “understand”** and generate human-like text. They take text (some also take images, sound and video) as input and produce text as output, mimicking how humans communicate.

Where do LLMs get their knowledge?

The key is the data absorbed by the model during training.

Creating the program is partly a deterministic process, where the structure of the algorithm is written by a programmer, and partly an automatic adjustment (training). During training, the parameters of the program start out random; these parameters (or weights) are then adjusted iteratively by trial and error: the model is given a large quantity of text and has to predict the next word, the missing word in a sentence, and so on. Over time, the program becomes good at capturing the relationships between words and generating coherent text.
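
To make the "trial and error" idea concrete, here is a deliberately tiny sketch of next-word prediction in Python (assuming PyTorch is installed). The vocabulary, sentence and model are toy placeholders, nothing like a real LLM:

```python
# A minimal sketch of next-word prediction training (toy example, not a real LLM).
import torch
import torch.nn as nn

vocab = ["the", "cat", "sat", "on", "mat"]       # toy vocabulary
token_ids = torch.tensor([0, 1, 2, 3, 0, 4])     # "the cat sat on the mat"

class TinyLM(nn.Module):
    def __init__(self, vocab_size, dim=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)  # weights start out random
        self.head = nn.Linear(dim, vocab_size)

    def forward(self, x):
        return self.head(self.embed(x))             # scores for the next word

model = TinyLM(len(vocab))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

inputs, targets = token_ids[:-1], token_ids[1:]     # predict each following word
for step in range(200):                             # the "trial and error" loop
    logits = model(inputs)
    loss = loss_fn(logits, targets)                 # how wrong were the guesses?
    optimizer.zero_grad()
    loss.backward()                                 # compute how to adjust the weights
    optimizer.step()                                # nudge the weights a little
```

A real LLM works on the same principle, just with billions of weights and terabytes of text instead of one sentence.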

LLMs are trained on massive amounts of text from the internet. This data includes books, articles, websites, and code. By analyzing this vast collection, the LLM learns patterns in language: which words tend to appear together, how sentences are structured, and even some facts about the world. This is where the apparent knowledge of LLMs comes from. The training process is complex and costly in terms of the computation power required.

In concrete terms

At the end of this process, the parameters and algorithms are frozen and stored in files. These files must be loaded by software to begin providing chatbot functionalities. Sometimes, they are compressed (quantized) to reduce the resources required to run the software.

By "frozen parameters," I mean that the algorithm no longer evolves. To feed it new data, you must either provide it with reference texts when querying it or go back through the "training" phase described above. However, this is a process initiated or parameterized by a programmer, not spontaneously by the algorithm.

There are several libraries and programs for this. Ollama is one (relatively) easy option. The Ollama software can load the file (generally a .gguf file of approximately 5 GB for a compressed 8-billion-parameter model).
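
As a sketch of what that looks like in practice, here is one way to query a model served by Ollama from Python, assuming Ollama is running locally on its default HTTP port and that a model has already been pulled (the name "llama3" is just an example, adjust it to whatever you downloaded):

```python
# A minimal sketch of querying a local model through Ollama's HTTP API.
import requests

response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3",    # the ~5 GB .gguf file loaded by Ollama
        "prompt": "Explain in one sentence where your knowledge comes from.",
        "stream": False,      # return the full answer in a single response
    },
    timeout=120,
)
print(response.json()["response"])
```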

If you want top-performing models, you'll need a server with specific hardware, or you can use cloud services.

Using LLMs already deployed by cloud providers is also a great option, provided that you read the Terms of Service and work with trustworthy partners.

The choice is yours!

Limitations and further reading:

I kept it simple on purpose. It's a LinkedIn post, not a book. To learn more about the technology, I can suggest several readings. Here is a first list, from the most approachable to the most technical (but still nice to read).

Generative Deep Learning, 2nd Edition By David Foster

https://www.oreilly.com/library/view/generative-deep-learning/9781098134174/

Natural Language Processing with Transformers, Revised Edition by Lewis Tunstall, Leandro von Werra and Thomas Wolf

https://www.oreilly.com/library/view/natural-language-processing/9781098136789/

Build a Large Language Model (From Scratch) by Sebastian Raschka, PhD

https://www.manning.com/books/build-a-large-language-model-from-scratch


But the BEST way to learn is to practice in parallel.

The Hugging Face team does a great job at that: https://huggingface.co/learn/nlp-course/en/chapter1/1

You'll also find gems on the Google Cloud GitHub: https://github.com/GoogleCloudPlatform/generative-ai


Thanks for reading this article.


* What 95% of people call “AIs” are LLMs (Large Language Models), which are one specific technology among the others that constitute “AI”.

** I put “understand” in quotes because, although the words “Artificial Intelligence” have stuck, we tend to project human traits onto an algorithm, which can lead to complex behaviors from the user.
