Generative AI
Marcelo Honorio Santos
Senior Software Engineer | Tech Lead | 20+ Years in Software Engineering | AWS, GCP, Azure Certified
Over the next few days I will write about three themes deeply involved in generative AI: Foundation Models, Large Language Models, and Retrieval-Augmented Generation. My goal here is to explain the basic concepts behind these themes, which form the base of generative AI.
I hope you enjoy it!
Let’s start with FMs!
A Foundation Model is, essentially, a neural network pre-trained on extensive datasets.
These are large AI systems trained on huge amounts of data through self-supervised learning.
This training process creates versatile models that can execute a variety of tasks accurately; for example, they can classify images, answer questions, and process natural language.
BERT from Google and the GPT-n series from OpenAI are great examples of Foundation Models.
Unlike traditional models that are built from scratch for particular tasks, FMs use a layered training approach consisting of a Base Layer, a Middle Layer, and a Top Layer.
The Base Layer involves generic pre-training on extensive data, enabling the model to learn from diverse content, including text and images.
The Middle Layer involves domain-specific refinement, honing the model’s focus on particular areas.
The final layer, the Top Layer, fine-tunes the model’s performance for specific applications such as text generation, image recognition, or other AI tasks.
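To make the Top Layer idea concrete, here is a minimal sketch of task-specific fine-tuning using the Hugging Face transformers library, starting from a pre-trained BERT checkpoint (as in the BERT example above). The task, model name, and example data are illustrative assumptions, not a prescribed recipe:

```python
# Minimal sketch of "Top Layer" fine-tuning: reuse pre-trained base
# weights and adapt the model to one specific task (assumed here:
# binary sentiment classification).
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
# The base layers load pre-trained weights; only the small 2-label
# classification head on top is new and randomly initialized.
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# One hypothetical labeled example for the target task.
inputs = tokenizer("This product is great!", return_tensors="pt")
labels = torch.tensor([1])  # 1 = positive

# A single gradient step of task-specific fine-tuning.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
loss = model(**inputs, labels=labels).loss
loss.backward()
optimizer.step()
```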
Foundation Models are essential, especially for four reasons.
They are incredibly powerful, eliminating the need to train separate models for different tasks: one single model can address multiple problems, making it a UNIFIED SOLUTION.
Training an FM is straightforward, as it doesn’t rely on labelled data, and minimal effort is needed to adapt it to specific tasks: SIMPLIFIED TRAINING.
Achieving high performance on specific tasks without FMs requires a huge amount of labeled data. FMs, however, need only a few examples to be tailored to a given task (see the few-shot sketch after this list): TASK AGNOSTICISM.
They enable the creation of high-performance models for various tasks; the leading architectures in Natural Language Processing and Computer Vision are built upon FMs: HIGH PERFORMANCE.
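As an illustration of task agnosticism, here is a minimal sketch of few-shot adaptation: instead of retraining, a handful of labeled examples is placed directly in the prompt. The reviews and labels are hypothetical, and the resulting prompt could be sent to any instruction-tuned FM:

```python
# Minimal sketch of few-shot prompting: the task is specified with a
# couple of in-prompt examples rather than a labeled training set.
examples = [  # hypothetical labeled examples
    ("The delivery was fast and the packaging was perfect.", "positive"),
    ("The screen cracked after two days.", "negative"),
]
query = "Battery life is much better than my old phone."

prompt = "Classify the sentiment of each review.\n\n"
for text, label in examples:
    prompt += f"Review: {text}\nSentiment: {label}\n\n"
prompt += f"Review: {query}\nSentiment:"

print(prompt)  # send this string to any instruction-tuned FM
```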
We can categorize FMs into two main families: LLMs and Diffusion Models.
FMs are trained on unlabeled datasets using a self-supervised approach: there are no explicitly labelled datasets. Instead, labels are generated automatically from the dataset itself, and the model is then trained in a supervised manner. This is the key difference between supervised and self-supervised learning.
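To show what "labels generated automatically from the dataset itself" can look like, here is a minimal sketch assuming a masked-language-modeling setup like BERT’s: roughly 15% of tokens are hidden, and the training target for each hidden position is simply the original token:

```python
# Minimal sketch of self-supervised label generation (MLM-style):
# the targets come from the input text itself, not from human labels.
import random

tokens = ["foundation", "models", "learn", "from", "unlabeled", "text"]

# Pick ~15% of positions to mask (at least one), as BERT does.
n_mask = max(1, round(0.15 * len(tokens)))
mask_positions = random.sample(range(len(tokens)), n_mask)

masked = tokens.copy()
labels = [None] * len(tokens)   # None = position excluded from the loss
for i in mask_positions:
    labels[i] = tokens[i]       # target = the original token itself
    masked[i] = "[MASK]"

print("input :", masked)
print("labels:", labels)
```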
Even though they can respond intelligently to prompts on topics they haven’t explicitly been trained on, FMs face hard challenges: Infrastructure Requirements, Front-End Development, Lack of Comprehension, Unreliable Responses, and BIAS.
Infrastructure Requirements: training and serving FMs demand a substantial financial investment and extensive computing resources.
Front-End Development: FMs must be integrated into a software stack, incorporating tools for prompt construction, fine-tuning, and pipeline development.
Lack of Comprehension: FMs struggle to grasp the contextual nuances of a prompt; moreover, they lack social and psychological awareness.
Unreliable Responses: answers provided by FMs on certain subjects can be unreliable, occasionally veering towards inappropriate, toxic, or erroneous content.
BIAS: FMs are susceptible to bias. To minimize this risk, developers should meticulously curate training datasets and embed specific norms into their models.
The potential applications of FMs are vast across all business areas and industries, so I strongly recommend learning about them!