Foundation Models In GenAI
Hashan Wickramasingha Wadanambi (H.W.W)
IT Infrastructure Specialist | IT Infrastructure Services Management | IT Project Management | Cybersecurity | ISO/IEC 27001 Information Security Internal Auditor | Scrum Master | Strategy Implementation Professional
Generative Artificial Intelligence (GenAI) stands at the forefront of contemporary discourse, not solely within the confines of the IT Industry but resonating across global landscapes. Thus, H.W.W Articles has diligently delved into the realms of GenAI, disseminating insightful knowledge about its construction and application within professional spheres. Noteworthy definitions and conceptual elucidations have been drawn from authoritative sources such as the Technical Foundations and Terminology for Generative AI under the AWS Skill Builder program, enhancing the depth and credibility of the discourse.
As a general summary first: unlabeled data is fed into a process called pretraining to create the foundation model. Then, given prompts, the foundation model performs specific tasks.
A Foundation Model is a prebuilt, machine learning (ML) model trained on a large amount of data. The result is a model that can be adapted to a wide range of downstream tasks.
Foundation models are created by taking large amounts of unlabeled data, training a model with that data, and then using that model for a wide range of tasks. The mechanism of transforming unlabeled data into a foundation model is called pretraining.
Pretraining is the creation of an FM by training a model with terabytes of unlabeled text or multimodal data (such as images, audio, or video).
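To make the pretraining idea concrete, here is a deliberately tiny toy sketch: it "learns" next-word statistics from raw, unlabeled text and then continues a prompt. Real foundation models learn billions of neural-network parameters, not simple word counts; this only mimics the self-supervised objective of predicting the next word.

```python
from collections import defaultdict

def pretrain(corpus: str) -> dict:
    """Count word -> next-word frequencies from raw, unlabeled text."""
    model = defaultdict(lambda: defaultdict(int))
    words = corpus.lower().split()
    for current, nxt in zip(words, words[1:]):
        model[current][nxt] += 1
    return model

def generate(model: dict, prompt: str, length: int = 3) -> str:
    """Continue a prompt by always picking the most frequent next word."""
    out = prompt.lower().split()
    for _ in range(length):
        candidates = model.get(out[-1])
        if not candidates:
            break
        out.append(max(candidates, key=candidates.get))
    return " ".join(out)

corpus = "the model learns from data and the model adapts to tasks"
model = pretrain(corpus)
print(generate(model, "the", length=2))  # prints "the model learns"
```

Note that no labels were needed at any point: the "training signal" comes from the text itself, which is why unlabeled data can be used at such scale.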
In creating foundation models, there are two vital factors. They are as follows.
1) Unlabeled data: Unlabeled data can be used at scale for pretraining because it is much easier to obtain than labeled data.
2) Large model: A large model with billions of parameters can store richer, deeper context across large amounts of data compared to a smaller model trained on a smaller dataset.
The important factors in developing a large model are the quality and quantity of the training data and the training infrastructure.
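To see where "billions of parameters" comes from, here is a rough back-of-the-envelope count for a decoder-only transformer. This is an illustrative sketch, not an exact formula: it ignores biases, layer norms, and position embeddings, and uses the common assumption of a feed-forward hidden size of 4× the model dimension.

```python
def transformer_params(vocab: int, d_model: int, n_layers: int) -> int:
    """Rough parameter count for a decoder-only transformer.

    Per layer: 4*d^2 for the attention projections (Q, K, V, output)
    plus 8*d^2 for a feed-forward block with hidden size 4*d.
    Embeddings add vocab*d. Biases and layer norms are ignored.
    """
    per_layer = 4 * d_model**2 + 8 * d_model**2  # = 12 * d^2
    return vocab * d_model + n_layers * per_layer

# A GPT-2-small-like shape (vocab 50257, width 768, 12 layers):
print(f"{transformer_params(50257, 768, 12):,}")  # prints 123,532,032
```

Scaling the width and depth in this formula quickly pushes the count into the billions, which is why training infrastructure becomes as important a factor as the data itself.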
The transformer architecture (transformer models) plays a vital role in developing a foundation model from unlabeled data. The transformer architecture is a type of neural network that is efficient, easy to scale and parallelize, and can model interdependence between input and output data.
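The mechanism that lets a transformer model this interdependence is scaled dot-product attention: every position can look at every other position at once. Below is a minimal sketch using plain Python lists in place of the large tensors a real model would use.

```python
import math

def softmax(xs):
    """Turn raw scores into weights that sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attention(queries, keys, values):
    """For each query, mix all value vectors, weighted by query-key similarity."""
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [dot(q, k) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[i] for w, v in zip(weights, values))
                    for i in range(len(values[0]))])
    return out

# Two token vectors attending over each other (self-attention)
x = [[1.0, 0.0], [0.0, 1.0]]
result = attention(x, x, x)
```

Because each output row is computed independently of the others, every position can be processed at the same time, which is exactly what makes transformer training so parallelizable.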
Transformer models do not process words sequentially, one at a time. Instead, they process the entire input all at once during the learning cycle, which makes the training process highly parallelizable. Transformer models also capture positional information. In other words, if a sentence is input, the model identifies each word in the sentence and the relationship between the words, using processes encoded in mathematics. Position encoders allow transformer models to avoid ambiguous meanings when the same word is used in different parts of a sentence. For example, "The new lamp had good light for reading." and "Magnesium is a light metal." both contain the word light, but its meaning differs, and the word's position and surrounding context help the model tell the two apart.
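A common way to encode position is the sinusoidal scheme from the original transformer paper: each position gets a unique vector of sines and cosines that is added to the word embedding, so the same word carries different positional information depending on where it appears. A minimal sketch:

```python
import math

def positional_encoding(position: int, d_model: int) -> list:
    """Sinusoidal position encoding: alternating sin/cos at varying frequencies."""
    pe = []
    for i in range(0, d_model, 2):
        angle = position / (10000 ** (i / d_model))
        pe.append(math.sin(angle))
        pe.append(math.cos(angle))
    return pe[:d_model]

p0 = positional_encoding(0, 8)
p5 = positional_encoding(5, 8)
print(p0 != p5)  # prints True: distinct positions get distinct encodings
```

The position vector alone does not disambiguate a word like "light"; it tags each token with where it sits, and the attention mechanism then combines that with the surrounding context to resolve the meaning.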
Foundation models can be categorized according to their capabilities, such as:
1) Code Generation
2) Content Generation
3) Content Summarization
4) Questions and Answers
Each of these capability categories can be described in more detail.
Code Generation: This category generates code outside the developers' integrated development environment (IDE), and developers can choose whether to use it.
Content Generation: This category encompasses everything from creating engaging marketing content, such as blog posts, social media updates, or email newsletters, to generating unique, high-quality images, art, logos, and designs.
Content Summarization: Foundation models can take inputs, such as reporting data, call minutes, and long-form articles, and generate summaries. This can help save time and reduce errors.
Questions and Answers: This includes chatbots, which are one of the more popular uses of FMs from a consumer perspective. Businesses can use the Q&A capability of FMs to streamline the customer experience and help reduce operational costs.
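As a hedged sketch of how a business chatbot might wrap a foundation model for Q&A, the example below shows only the prompt construction; `call_foundation_model` is a hypothetical placeholder, not a real API, and the prompt wording is an assumption for illustration.

```python
def build_qa_prompt(context: str, question: str) -> str:
    """Combine retrieved business context with a customer question."""
    return (
        "Answer the customer's question using only the context below.\n\n"
        f"Context: {context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

prompt = build_qa_prompt(
    context="Orders ship within 2 business days.",
    question="How long does shipping take?",
)
# The prompt would then be sent to a foundation model endpoint, e.g.:
# answer = call_foundation_model(prompt)  # hypothetical client, not a real API
```

Grounding the model in retrieved business context like this is one common way to keep answers on-topic and reduce the operational cost of human support.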
In the current context, we use these capability categories in our day-to-day work. In the next article, H.W.W Articles intends to explain, with simple examples, how a generative AI model encodes and decodes words, in keeping with its aim of making complex things simple and sharing that knowledge with the world.