GenAI Core Topics Explained in Simple Pictures

7 concepts, each explained with hundreds of pictures embedded into one-minute data videos.

The featured picture at the top illustrates xLLM, a new trend in modern LLM and RAG architecture: far less expensive (no GPU needed), specialized, self-tuned, low-latency, hallucination-free, and implemented locally with no data leaks. It is currently under development for a Fortune 100 client. See details here.

GPU Classification: The Father of Neural Networks

These days, GPUs are used to train neural networks that have nothing to do with images or videos, yet they were originally built to accelerate image processing and video games. Returning to that original use, the classification method in this data animation works in reverse: it turns the training set (tabular data) into an image bitmap, performs fuzzy classification as a sequence of bitmap transforms on the GPU, then turns the last frame back into tabular data. And voilà! You have performed classification on a GPU. Ironically, without neural networks, just using a high-pass image filter.

Well, you may argue that this is a neural network in disguise, and indeed it was one of the first use cases. Each frame is just a deep layer. If the filtering window is very small, as in the video, the equivalent network is very sparse and very deep, with hundreds of hidden layers. If the filtering window is very large, one or two layers do the job, and the boundaries come out smoother. I won't share my opinion on whether or not this is a neural network. Clearly, the computations and architecture are nearly identical.

The first frame is the original training set rendered as a bitmap. Black zones are regions not yet classified. After a while, the whole feature space is classified, with relatively stable group boundaries: in short, we observe stochastic convergence.
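The pipeline above can be sketched on the CPU with NumPy. This is my own minimal illustration of the idea, not the author's implementation: the grid size, the 3x3 box filter standing in for the filtering window, the iteration count, and the periodic boundary handling are all assumptions made for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny synthetic training set: 2 features in [0, 1), 2 classes.
X = rng.random((40, 2))
y = (X[:, 0] + X[:, 1] > 1).astype(int)

# Step 1: rasterize the tabular data into a bitmap,
# one intensity channel per class (hypothetical 64x64 resolution).
N, K = 64, 2
img = np.zeros((K, N, N))
idx = np.minimum((X * N).astype(int), N - 1)
for (i, j), label in zip(idx, y):
    img[label, i, j] = 1.0

def box_filter(a):
    """3x3 box filter via shifted sums (periodic boundary for simplicity)."""
    out = np.zeros_like(a)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            out += np.roll(np.roll(a, di, axis=0), dj, axis=1)
    return out / 9.0

# Step 2: iterate the filter "frame by frame"; each pass diffuses
# class evidence from training pixels into the black, unclassified zones.
for _ in range(50):
    img = np.stack([box_filter(c) for c in img])

# Step 3: turn the last frame back into a classification:
# each pixel gets the class whose channel has the largest intensity.
labels = img.argmax(axis=0)
```

A new observation is then classified by a single pixel lookup in `labels`; running the filter on the GPU, as the article describes, only changes where the per-frame transform executes.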

Sampling Outside the Observation Range

Many GenAI techniques produce poor results when the training set is small. The reason is that none of the existing methods can sample artificial yet realistic values outside the training set's range: below the minimum, or above the maximum. Not even for a single feature, let alone in higher dimensions with correlated features. All of them rely on quantile generation at some point, and none of the quantile functions in Python offer this possibility. The classic workaround is to use bigger and bigger training sets, or trillions of weights, to paper over the sampling issue. But you can do it far faster with much less data. The video below starts with the empirical distribution observed on a small training set, then extends it as if the training set were far bigger. Pure magic, like reconstructing invisible observations! And it generalizes easily to higher dimensions.
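One simple way to sample beyond the observed range, sketched below, is to extrapolate the empirical quantile function linearly in each tail. This is my own toy illustration of the general idea, not necessarily the method shown in the video; the plotting positions and the linear tail model are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
train = rng.normal(size=25)  # small training set

def sample_extended(data, n_samples, rng):
    """Sample from the empirical quantile function, linearly
    extrapolated beyond the observed minimum and maximum."""
    data = np.sort(np.asarray(data))
    n = len(data)
    # plotting positions: data[i] estimates the (i+1)/(n+1) quantile
    p = np.arange(1, n + 1) / (n + 1)
    u = rng.random(n_samples)
    out = np.interp(u, p, data)
    # left tail: extend with the slope of the two lowest quantiles
    lo = u < p[0]
    slope_l = (data[1] - data[0]) / (p[1] - p[0])
    out[lo] = data[0] + slope_l * (u[lo] - p[0])
    # right tail: extend with the slope of the two highest quantiles
    hi = u > p[-1]
    slope_r = (data[-1] - data[-2]) / (p[-1] - p[-2])
    out[hi] = data[-1] + slope_r * (u[hi] - p[-1])
    return out

synthetic = sample_extended(train, 10_000, rng)
```

Roughly 2/(n+1) of the synthetic values land outside [min, max] of the training set, which is exactly what off-the-shelf quantile-based generators cannot produce.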

Approximate Nearest Neighbor Search

Fast approximate vector search is a core component of most LLM/GPT applications: it finds prompt-derived embeddings similar to existing ones stored in backend embedding tables built from crawled data. My xLLM system uses key-value rather than vector databases, and variable-length embeddings (VLE) rather than fixed-size ones, but nearest neighbor search applies to both architectures, and to many other contexts.
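To make the idea concrete, here is a minimal sketch of approximate nearest neighbor search using a key-value index: random-hyperplane hashing buckets similar vectors (by cosine similarity) under the same key, so a query scans only its own bucket. This is a generic illustration, not xLLM's actual indexing scheme; the dimensions, plane count, and fallback policy are my own choices.

```python
import numpy as np

rng = np.random.default_rng(2)
dim, n_vecs, n_planes = 64, 5000, 12

# Backend table of stored embeddings (stand-ins for crawled data).
db = rng.normal(size=(n_vecs, dim))

# Random hyperplanes: each vector hashes to a bit signature, and
# vectors with high cosine similarity tend to share signatures.
planes = rng.normal(size=(n_planes, dim))

def signature(v):
    return tuple((planes @ v > 0).astype(int))

# Key-value index: signature -> list of vector ids.
buckets = {}
for i, v in enumerate(db):
    buckets.setdefault(signature(v), []).append(i)

def ann_search(query, k=5):
    """Approximate k-NN by cosine similarity: scan only the query's
    bucket, falling back to brute force if the bucket is empty."""
    cand = np.fromiter(buckets.get(signature(query), range(n_vecs)),
                       dtype=int)
    sims = db[cand] @ query / (
        np.linalg.norm(db[cand], axis=1) * np.linalg.norm(query))
    return cand[np.argsort(-sims)[:k]]
```

For example, `ann_search(db[0])` returns vector 0 as its top hit while touching only a small fraction of the table; real systems layer many such tables (or graphs, as in HNSW) to trade recall against speed.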

To view all the topics and videos, and to access the detailed articles (free, no sign-up required), the Python code, use cases, and datasets, follow this link.


