Generative AI on Foundation Models: AI Paradigm Shift

Generative AI is the current buzzword in AI: it is used to generate images, code, video, text, and art by interpreting and recombining pre-existing data. The large language models pre-trained on vast amounts of data that power generative AI are commonly referred to as foundation models.

The term “Foundation Model” was coined in 2021 by a team of researchers at Stanford University. These foundation models contain billions of trained parameters, enabling them to perform a wide range of tasks spanning multiple domains, such as writing blog posts, generating images, solving math problems, engaging in dialog, and answering questions based on a document.

Evolution of Foundation Models

In 2018, Google released BERT as open-source software, spawning a family of follow-ons and setting off a race to build ever larger, more powerful LLMs.

ELMo → BERT → GPT-2 → GPT-3 → LaMDA → Chinchilla → BLOOM → ChatGPT → GPT-4 → …

With immense computing power now available at a reasonable cost, there has been an explosion of FMs since 2022, with more and more being introduced to the market by the key players.

Why Foundation Models?

Creating and deploying traditional AI models for each new system often requires considerable time and resources.

Foundation models are trained with hundreds of gigabytes of data and contain hundreds of billions of parameters. The key aspect of foundation models is their ability to utilize self-supervised learning techniques and leverage large-scale unlabeled data. These models can learn complex representations and features, greatly improving their performance on various tasks and reducing dependence on labeled data.

Instead of spending millions of dollars on high-performance cloud GPUs to train a machine learning model that can serve only a single task, companies can take a pre-trained foundation model and focus on contextualizing and fine-tuning it. The customized foundation model can then be employed for many tasks, in contrast to previous technologies that required building a model from scratch for each use case.
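
To make this concrete, here is a minimal sketch of the fine-tuning step using the open-source Hugging Face libraries; the model and dataset names are illustrative placeholders, not a specific vendor offering.

    # Minimal fine-tuning sketch: adapt a pre-trained model instead of
    # training one from scratch. Model/dataset names are illustrative.
    from datasets import load_dataset
    from transformers import (AutoModelForSequenceClassification,
                              AutoTokenizer, Trainer, TrainingArguments)

    tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
    model = AutoModelForSequenceClassification.from_pretrained(
        "distilbert-base-uncased", num_labels=2)

    dataset = load_dataset("imdb")  # any labeled text dataset works here

    def tokenize(batch):
        return tokenizer(batch["text"], truncation=True, padding="max_length")

    dataset = dataset.map(tokenize, batched=True)

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="out", num_train_epochs=1,
                               per_device_train_batch_size=8),
        train_dataset=dataset["train"].shuffle(seed=42).select(range(1000)),
    )
    trainer.train()  # only this small adaptation pass runs on our hardware

Only the small adaptation pass is paid for; the expensive pre-training has already been done by the model builder.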

Key Characteristics of Foundation Models

  • Foundation models are trained on large unlabeled data sets at scale; for example, OpenAI's GPT-3 was trained with 175 billion parameters (OpenAI has not disclosed the size of GPT-4).
  • Homogenization refers to the fact that a small number of deep learning architectures are used to achieve state-of-the-art results across a wide variety of tasks.
  • Emergence is the idea that new behaviors can emerge from an AI model that were not explicitly intended in its training.
  • Foundation models are generalized, able to handle a wide variety of tasks.
  • Foundation models are flexible: they can be adapted from one type of problem to another with relative ease.
  • Foundation models are designed to be scalable: they can handle vast amounts of data and grow in complexity as required.
  • Foundation models have remarkable versatility, as they can be employed across multiple domains and industries.


[Image: Generative AI on Foundation Models – a Quick Glimpse]

Key Players

OpenAI is an AI research and deployment company. The trending AI application ChatGPT was created by OpenAI.

Microsoft has partnered with OpenAI to provide the Azure OpenAI Service, which enables customers to customize OpenAI models with labeled data for their specific scenarios using a simple REST API, and provides secure deployment with role-based authentication and private network connectivity.
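
As a hedged sketch, a call to a deployed Azure OpenAI model over the REST API can look like the following; the resource name, deployment name, and API version are placeholders to substitute with your own values.

    # Sketch: call an Azure OpenAI chat deployment over REST.
    # RESOURCE, DEPLOYMENT, and the api-version are placeholders.
    import os
    import requests

    RESOURCE = "my-resource"          # your Azure OpenAI resource name
    DEPLOYMENT = "my-gpt-deployment"  # your model deployment name
    url = (f"https://{RESOURCE}.openai.azure.com/openai/deployments/"
           f"{DEPLOYMENT}/chat/completions?api-version=2023-05-15")

    response = requests.post(
        url,
        headers={"api-key": os.environ["AZURE_OPENAI_KEY"]},
        json={"messages": [{"role": "user",
                            "content": "Summarize our returns policy."}]},
    )
    print(response.json()["choices"][0]["message"]["content"])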

AWS has recently launched Amazon Bedrock in limited preview, giving enterprises the flexibility to choose from a wide range of FMs built by leading AI startups and by Amazon. It also provides infrastructure to privately customize FMs with enterprise proprietary data. In addition, AWS offers developers SageMaker JumpStart, which lets them quickly and easily integrate open-source models from providers such as Hugging Face and Stability AI.
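
Since Bedrock is in limited preview at the time of writing, the following is an indicative sketch only; it assumes the boto3 bedrock-runtime client and uses an illustrative Titan model ID, both of which should be checked against the service as it becomes generally available.

    # Sketch: invoke a foundation model through Amazon Bedrock.
    # Assumes Bedrock access is enabled; the model ID is illustrative.
    import json
    import boto3

    client = boto3.client("bedrock-runtime", region_name="us-east-1")

    response = client.invoke_model(
        modelId="amazon.titan-text-express-v1",  # illustrative model ID
        contentType="application/json",
        accept="application/json",
        body=json.dumps({"inputText": "Draft a product launch headline."}),
    )
    print(json.loads(response["body"].read()))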

AWS also provides purpose-built hardware: AWS Inferentia (Inf1) instances with the lowest cost per inference in the cloud for running deep learning models, AWS Trainium (Trn1) instances for cost-efficient, high-performance training of LLMs and diffusion models, and AWS Inferentia2 (Inf2) instances that deliver high performance at the lowest cost per inference for LLMs and diffusion models.

Google offers Vertex AI, which gives developers and data scientists a simple way to take advantage of foundation models like PaLM while providing them with the most choice and control. Google has also announced its plan to integrate generative AI ("AI snapshots") into Google Search.

Hugging Face is a model hub with over 120k models, 20k datasets, and 50k demo apps (Spaces), all open source and publicly available, on an online platform where people can easily collaborate and build ML together. It provides foundation models for natural language processing, computer vision, multimodal tasks, audio, reinforcement learning, and tabular data.
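
As a quick illustration of how low the barrier to entry is, the pipeline API below pulls an open model from the Hub and runs inference in a few lines; the model choice is arbitrary.

    # Sketch: run an open model from the Hugging Face Hub locally.
    from transformers import pipeline

    # Downloads the model from the Hub on first use.
    classifier = pipeline(
        "sentiment-analysis",
        model="distilbert-base-uncased-finetuned-sst-2-english")
    print(classifier("Foundation models make ML adoption much easier."))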

Cohere provides natural language processing models that help companies improve human-machine interactions.

AI21Labs provides Natural Language Processing models that can understand and generate natural language.

Stability AI is an AI-driven visual art startup that designs and implements open AI tools to create images from text input.

Meta AI provides the LLaMA (Large Language Model Meta AI) collection, which comprises models smaller than existing LLMs but trained on more tokens. This boosts performance and makes the models easier to retrain and fine-tune for specific real-world use cases.

Key Foundation Models

GPT-3 was created by OpenAI; "GPT" stands for "Generative Pre-trained Transformer." It is an example of a machine-learning model trained on enormous troves of internet data to swiftly generate new material with minimal input.

GPT-4 is OpenAI's successor to GPT-3 (its parameter count has not been disclosed). It generates meaningful, coherent text using the transformer architecture and accepts both text and image inputs, making it a versatile model that can perform tasks across multiple modalities.

DALL-E, created by OpenAI, translates text prompts into images. It is based on a neural network that generates new images from prompts. DALL-E 2, the latest version, generates images at higher resolution and can also edit images afterward.

BioGPT is a language model transformer developed by Microsoft researchers and pre-trained on a corpus of 15 million PubMed abstracts. Its primary function is to answer biomedical questions. Because BioGPT helps accelerate medical research on treatments and drugs, its reliability is essential.

LaMDA was created by Google by fine-tuning a family of Transformer-based neural language models; it is specialized for dialog-based applications.

LLaMA, created by Meta, is a collection of foundation language models ranging from 7B to 65B parameters, with usage similar to the GPT models.

BLOOM was created by the BigScience project, and its architecture is essentially similar to GPT-3's. It can be used for text and code generation.

Stable Diffusion was created by the CompVis group at Ludwig Maximilian University of Munich. It is primarily used to generate detailed images conditioned on text descriptions.
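
A minimal sketch of text-to-image generation with Stable Diffusion through the open-source diffusers library, assuming a CUDA GPU and that the model weights have been accepted and downloaded:

    # Sketch: text-to-image with Stable Diffusion via diffusers.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16)
    pipe = pipe.to("cuda")  # generation is impractically slow on CPU

    image = pipe("a watercolor painting of a lighthouse at dawn").images[0]
    image.save("lighthouse.png")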

Claude, created by Anthropic, is used to build AI assistants.

PaLM was created by Google and trained as a 540-billion-parameter, densely activated, autoregressive Transformer on 780 billion tokens. As the scale of the model increases, performance improves across tasks while also unlocking new capabilities.

Titan, created by Amazon, comprises models pre-trained on large datasets, making them powerful, general-purpose models. There are two Titan models: Titan Text, a generative large language model (LLM) for tasks such as summarization, text generation, classification, open-ended Q&A, and information extraction; and Titan Embeddings, an LLM that translates text inputs into numerical representations (embeddings) that capture the semantic meaning of the text, used for applications like personalization and search.
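
As a hedged sketch, the snippet below requests a Titan embedding through Bedrock; the model ID and the response field follow Bedrock's documented Titan interface but should be verified against the current documentation.

    # Sketch: get a Titan text embedding through Amazon Bedrock.
    # Model ID and response shape assumed from Bedrock's Titan docs.
    import json
    import boto3

    client = boto3.client("bedrock-runtime", region_name="us-east-1")

    response = client.invoke_model(
        modelId="amazon.titan-embed-text-v1",
        contentType="application/json",
        accept="application/json",
        body=json.dumps({"inputText": "wireless noise-cancelling headphones"}),
    )
    embedding = json.loads(response["body"].read())["embedding"]
    print(len(embedding))  # vector size; usable for search and personalization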

Foundation Models – Usage across Industries and Functions

Foundation models can be applied to horizontal functions across all industries, such as:

  • Marketing & Sales: Generating new campaign ideas, Conversational AI for Customer Support, Personalized Offers, and Branding Design Creation.
  • Customer Support: Conversational AI, Call Transcript Analysis, Cross-sell / Upsell Real-time Assistance.
  • Operations: Document Classification, Language Translation, Data Analysis, and Report Generation.
  • Risk & Legal: Review and summarize legal documents, Contract Clause Extraction, and Search and retrieve relevant documents.
  • HR: Resume Analysis, Providing self-service for onboarding new candidates, Search & discovery, and Policy FAQ.

Foundation Models have the potential to be applied in various industries, such as

  • BFSI: Personalized Financial Plan, Regulatory changes, Claims Processing, Insurance Policy FAQ, and Financial Information extraction.
  • Life Sciences and Healthcare: Patient History Summary, QA on Drug-Disease, Experiment Protocol QA, Research Papers Data Analysis
  • Manufacturing: Product Improvements Recommendations, Market Research Recommendations, Product Quality Inspection Report Summary, Price Comparison Chart
  • Retail: FAQ on Products, Personalized Product Recommendations, Customer feedback review, Inventory re-ordering
  • Government: Automate Application Processing, Creation of Audit reports, Create Press Releases, Translate Policy Documents
  • Technology: Code Generation, Analyze Application Performance Metrics, Protecting IT Infra.
  • Media & Entertainment: Sub-title generation in multiple languages, New Music compositions, Movie Recommendations, Trivia Generation
  • TTH (Travel, Transportation & Hospitality): Travel Plan QA, Customer Feedback Analysis, Personalized Tracking, Travel Marketing Ideas

The use cases listed above are indicative; the list is expected to grow with the adoption of FMs.

Adoption of Foundation Models

While foundation models are still maturing and evolving, enterprises are looking to adopt them into their business services. Adoption of a pre-trained foundation model ranges from as-is consumption, such as generating logos and images for campaigns, to contextualization: training the LLM on a smaller, task- or domain-specific dataset and tuning the model to better fit the new data.

Foundation models are available as managed services from the model builders, who charge for fine-tuning and inference. Self-managed foundation models give enterprises more control over the model and are more secure, as the data used for customization and inference does not leave the enterprise's boundaries.

Key Considerations

  • Models are prone to hallucination, presenting untrue data as fact within a context of otherwise factually correct information.
  • Foundation models are often described as "black boxes," and results from the models are poorly understood. Efforts are ongoing to provide explainability and interpretability for the results they generate.
  • The quality of training data must be ensured, including that the data aligns with business values by avoiding stereotypes, unfair discrimination, exclusionary norms, toxic language, etc.
  • Language models need to be fine-tuned for specific domains, certain use cases, or private/sensitive data.
  • Foundation models often require access to sensitive data, such as customer information or proprietary business data. This can raise concerns about privacy and security, particularly if the model is deployed in the cloud or accessed by third-party providers.

Applying Foundation Models

Enterprises have started adopting foundation models across various functions and domains for a wide range of applications. Here are a few scenarios where foundation models can be adopted quickly in a Generative AI journey.

Virtual Assistants powered by FM

Enterprises have adopted virtual assistants for customer self-service, and these assistants become more powerful when built on FMs. FMs are trained on vast amounts of data and can be adapted with domain context to answer a wide range of customer questions with succinct answers in natural language. Implementations include an HR assistant that answers policy questions, a medical assistant that answers disease-related questions, a banking assistant that answers account and product questions, and a claims assistant for claim status.
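
One common way to give an FM this domain context is retrieval-augmented prompting: fetch the most relevant passage and place it in the prompt. The sketch below uses a deliberately naive word-overlap retriever over hypothetical policy snippets; a production assistant would use embeddings for retrieval and send the assembled prompt to a real FM endpoint.

    # Sketch: retrieval-augmented prompt assembly for an HR policy assistant.
    # Retriever is naive word overlap; documents are hypothetical examples.
    POLICY_DOCS = [
        "Employees accrue 1.5 vacation days per month of service.",
        "Health insurance enrollment opens every November.",
        "Remote work requires manager approval beyond 2 days a week.",
    ]

    def retrieve(question: str) -> str:
        q_words = set(question.lower().split())
        return max(POLICY_DOCS,
                   key=lambda doc: len(q_words & set(doc.lower().split())))

    def build_prompt(question: str) -> str:
        context = retrieve(question)
        return (f"Answer using only this policy excerpt:\n{context}\n\n"
                f"Question: {question}\nAnswer:")

    # The assembled prompt would then be sent to the foundation model.
    print(build_prompt("How many vacation days do I get per month?"))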

Improve Developer Productivity

With FM-powered code recommendations, the development of applications in languages like C#, Java, JavaScript, Python, and TypeScript can be accelerated. FMs help developers write code faster by generating entire functions and logical blocks of code, offering suggestions as the developer types, generating unit test cases, and finding bugs and suggesting how to remediate them. They can also be used in security scans to detect vulnerabilities and suggest fixes. Amazon CodeWhisperer and GitHub Copilot are code companions built on FMs.

Two-Way Language Translation

Two-way language translation enables people speaking different languages to communicate without having to learn each other's language. This is especially useful for contact centers that are short-staffed in a particular language, or that want to leverage agents who have the expertise to resolve an issue but do not speak the language the customer is conversing in. FMs can translate a customer's question into a language the agent is comfortable with and translate the agent's response back into the customer's language, removing the language barrier. This also alleviates the need for contact centers to staff specialists for a language that may have a low volume of calls.
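
A minimal sketch of the round trip, using open MarianMT translation models as one possible choice (any FM translation endpoint could stand in):

    # Sketch: two-way FR<->EN translation so an English-speaking agent
    # can serve a French-speaking customer.
    from transformers import pipeline

    to_english = pipeline("translation", model="Helsinki-NLP/opus-mt-fr-en")
    to_french = pipeline("translation", model="Helsinki-NLP/opus-mt-en-fr")

    customer = "Mon colis n'est jamais arrivé."
    print(to_english(customer)[0]["translation_text"])  # shown to the agent

    reply = "I am sorry about that. I have issued a replacement shipment."
    print(to_french(reply)[0]["translation_text"])      # shown to the customer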

Summarization using FMs

FMs perform exceptionally well at drawing inferences from large volumes of text. They can be used, for example, to process legal documents for terms and conditions and distill the salient points into a few sentences or a paragraph, saving analysts a great deal of time.
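
One practical detail: long documents often exceed a model's input limit, so a common pattern is to summarize chunks and then summarize the combined result. The sketch below uses an open summarization model chosen purely for illustration.

    # Sketch: summarize a long document by chunking it to fit the
    # model's input limit, then summarizing the combined partials.
    from transformers import pipeline

    summarizer = pipeline("summarization",
                          model="sshleifer/distilbart-cnn-12-6")

    def summarize_long(text: str, chunk_chars: int = 2000) -> str:
        chunks = [text[i:i + chunk_chars]
                  for i in range(0, len(text), chunk_chars)]
        partials = [summarizer(c, max_length=80,
                               min_length=20)[0]["summary_text"]
                    for c in chunks]
        return summarizer(" ".join(partials), max_length=120,
                          min_length=30)[0]["summary_text"]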

Content Generation

FMs can be used to generate content for campaigns that need to reach a wide audience, such as new product launches or exclusive offers for loyal customers. These mass e-mails can be generated by providing the FM with the context, product details, and contact information/links, and letting it produce an e-mail with a call to action that will capture customers' attention.
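
A minimal sketch of the prompt-assembly side, with hypothetical product and campaign fields; the assembled prompt would then be sent to whichever FM endpoint is in use.

    # Sketch: assemble a campaign e-mail prompt from structured inputs.
    # All field values below are hypothetical examples.
    def campaign_email_prompt(product: str, audience: str,
                              offer: str, cta_link: str) -> str:
        return (f"Write a short marketing e-mail for {audience}.\n"
                f"Product: {product}\n"
                f"Offer: {offer}\n"
                f"End with a clear call to action linking to {cta_link}.\n"
                f"Tone: friendly and concise.")

    print(campaign_email_prompt(
        product="SolarMax home battery",
        audience="existing customers with solar panels",
        offer="15% loyalty discount through June",
        cta_link="https://example.com/solarmax",
    ))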

There is a wide range of foundation models available, both proprietary and open source, and the model that best fits the use case must be selected.

What’s on the Horizon?

All the major players have jumped on the bandwagon of building foundation models: Microsoft has partnered with OpenAI, Google is building its own foundation models, and AWS has launched the Amazon Bedrock service, which enables AWS customers to leverage foundation models from AI21 Labs, Anthropic, Stability AI, and Amazon (Titan), with the ability to privately customize them.

Customers are looking for seamless integration of FMs with their applications, security of their data (ensuring the data used for fine-tuning is not exposed to the public domain), and optimized inference cost. Customers consuming FM outputs also need explainability for the results and assurance that the results are not biased or inappropriate. Regulators are catching up with recent developments and are in the process of framing regulatory frameworks. Foundation model builders will need to focus on these aspects for their models to be adopted for commercial use.

Several AI startups have moved into the space of providing tools for orchestration, evaluation, FMOps, and cost optimization, enabling enterprise adoption of foundation models.

These developments will lead to expedited, safe, scalable, and sustainable adoption of foundation models by enterprises, resulting in productivity increases, improved customer experience, accelerated research and development, and new business models. This fosters an environment of continuous innovation, ushering in the next wave of transformation in AI.
