Generative AI: A Guide for Getting Started, Architecture, Selecting Vendors/LLMs, and Implementing Proof of Concepts



The objective of this article is to provide a high-level, end-to-end view of the GenAI implementation journey.

Below are the topics covered in the content:

  1. What GenAI Represents and Its Current Significance
  2. Understanding the Vocabulary of Generative AI and Large Language Models
  3. Generative AI Technologies/Vendor Segments
  4. Use Cases (Mostly Retail-Specific)
  5. Reference Architecture
  6. Multiple Vendors and LLM Models: Public and Open Source
  7. Navigating the Initial Stages of Generative AI Implementation, POC Approach


1. What GenAI Represents and Its Current Significance

GenAI technologies can generate fresh variations of content, strategies, designs, and methods by learning from extensive repositories of source material. Their impact on businesses is profound, influencing aspects like content discovery, creation, authenticity, and regulatory compliance, as well as the automation of human tasks and enhancement of customer and employee experiences.

At the core of GenAI are foundation models, with large language models (LLMs) being a prominent example. LLMs undergo training on massive datasets comprising billions of words and trillions of parameters, requiring complex mathematical computations and significant computational resources. Essentially, they function as prediction algorithms. For instance, in the case of ChatGPT and similar LLMs, they predict word patterns and sequences to generate coherent language responses to prompts or questions.
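As an illustration of "prediction algorithm," the core next-token step can be sketched in a few lines of Python. The candidate tokens and scores below are invented for the example; a real LLM scores tens of thousands of tokens using a neural network:

```python
import math

def softmax(logits):
    """Convert raw model scores into a probability distribution."""
    m = max(logits.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(v - m) for tok, v in logits.items()}
    total = sum(exps.values())
    return {tok: v / total for tok, v in exps.items()}

# Hypothetical scores an LLM might assign to candidate next tokens
# after the prompt "The capital of France is".
logits = {"Paris": 9.1, "Lyon": 4.3, "London": 2.0}

probs = softmax(logits)
next_token = max(probs, key=probs.get)  # greedy decoding picks the most likely token
```

Generating a whole response is simply this step repeated: the chosen token is appended to the input and the model predicts again.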

This generative capability is a hallmark of GenAI and extends beyond text generation to encompass code, images, videos, and audio. Moreover, GenAI can devise strategies and tactics. The quality of generated responses improves with the application of reinforcement learning with human feedback (RLHF) during model training, leading to more lifelike interactions. However, the human-like nature of interactions with systems like ChatGPT can sometimes evoke a sense of eeriness, reminiscent of the uncanny valley.

Input > Prompt > Process > Output

The popularity of ChatGPT has opened the floodgates of innovation in the generative AI space. The past six months have seen a flurry of AI foundation models, provider fine-tuned models, generative AI applications, and machine learning operations (MLOps) tools released in the market. In addition, many large incumbent ISVs are embedding generative AI into their existing applications to bring its power to business users. Although these fast-paced developments signify the competitive jostling characteristic of most high-stakes, early-stage markets, they also present a confusing array of choices that enterprise IT leaders need to navigate.

Some Assumptions

  • By 2026, more than 80% of enterprises will have used generative AI APIs and models and/or deployed generative-AI-enabled applications in production environments, up from fewer than 5% today.
  • By 2026, more than 70% of independent software vendors (ISVs) will have embedded generative AI capabilities in their enterprise applications, up from fewer than 1% today.
  • By 2026, nearly 80% of prompting will be semiautomated, through automated prompting tools or via autonomous agents that require limited prompting.
  • By 2028, more than 50% of enterprises that have built their own models from scratch will abandon their efforts due to the costs, complexity and technical debt of their deployments.

2. Understanding the Vocabulary of Generative AI and Large Language Models

The complexity of terms and ideas in generative AI stems from the swift advancement of technologies, methodologies and applications, particularly in the domain of large language models.

This article summarizes my learning from extensive secondary research, seminars and hands-on experience, with the aim of enabling individuals to develop and implement solutions leveraging GenAI. A good starting point is to get familiar with the GenAI glossary, which has evolved over time and whose terms you will keep hearing again and again.


The common terms fall into three groups:

  • Models and training/learning methods
  • Content and prompts
  • Processing and engineering

2.1 Models and Training/Learning Methods

  • AI adoption policy: An organization’s announced goals on how it will adopt AI into its data processing strategies.
  • AI trust, risk and security management (AI TRiSM): Ensures AI governance, trustworthiness, fairness, reliability, robustness, efficacy and data protection. AI TRiSM includes solutions and techniques for model and application transparency, content anomaly detection, AI data protection, model and application monitoring and operations, adversarial attack resistance and AI application security.
  • Closed model: A model that no longer accepts inputs or changes to itself.
  • Custom model: A model built specifically for an organization or an industry.
  • Domain-specific model: A model that has been optimized and customized for the needs of specific industries, business functions or tasks. See also “horizontal model” and “vertical model.”
  • Edge model: A model that includes data typically outside centralized cloud data centers and closer to local devices or individuals — for example, wearables and Internet of Things (IoT) sensors or actuators.
  • Embedding: A set of data structures in a large language model (LLM) of a body of content where a high-dimensional vector represents words. This is done so data is more efficiently processed regarding meaning, translation and generation of new content.
  • Embedding model: A model used to transform unstructured data (like text or images) into multidimensional vector embeddings. These vectors are represented across the vector space, making it easier to perform tasks like similarity comparison, clustering and classification on the data, often as part of a RAG architecture.
  • Few-shot learning: In contrast to traditional models, which require many training examples, few-shot learning uses only a small number of training examples to generalize and produce worthwhile output.
  • Filters: Used to remove data or variables from a model to simplify or eliminate options.
  • Fine-tuned model: A model focused on a specific context or category of information, such as a topic, industry or problem set.
  • Foundational model: A baseline model used for a solution set, typically pretrained on large amounts of data using self-supervised learning. Applications or other models are used on top of foundational models — or in fine-tuned contextualized versions.
  • Frozen model: A model that no longer accepts inputs or changes to itself.
  • Generative AI (GenAI): AI techniques that learn from representations of data and model artifacts to generate new artifacts.
  • Generalized model/general purpose model: A model that does not specifically focus on use cases or information.
  • Human in the loop: A process used when the machine or computer system is unable or not allowed to offer an answer to a problem autonomously, thus needing human validation or intervention.
  • Horizontal AI model: A model that has been optimized and customized for the needs of specific business functions or tasks.
  • Model hubs: Repositories that host pretrained and readily available machine learning (ML) models, including generative models.
  • Multimodal model: Language models that are trained on and can understand multiple data types, such as words, images, audio and other formats, resulting in increased effectiveness in a wider range of tasks.
  • Multitask prompt tuning (MPT): An approach that configures a prompt representing a variable — that can be changed — to allow repetitive prompts where only the variable changes.
  • Open model: A model that while operational continues to learn or can contextualize its responses based on inputs and prompts.
  • Open-source model: A model made available to the public through a license that enables anyone to access, use, modify and distribute the model source code without restriction, although some obligations may apply.
  • Parameters: A set of numerical weights representing neural connections or other aspects in an AI model with values that are determined by training. Large language models (LLMs) can have billions of parameters.
  • Pretrained model: A model trained to accomplish a task — typically one that is relevant to multiple organizations or contexts. Also, a pretrained model can be used as a starting point to create a fine-tuned contextualized version of a model, thus applying transfer learning.
  • Reinforcement learning: A machine learning (ML) training method that rewards desired behaviors or punishes undesired ones.
  • Reinforcement learning with human feedback (RLHF): A ML algorithm that learns how to perform a task by receiving feedback from a human.
  • Self-supervised learning: An approach to ML in which labeled data is created from the data itself. It does not rely on historical outcome data or external human supervisors that provide labels or feedback.
  • Supervised learning: An ML algorithm in which the computer is trained using labeled data or ML models trained through examples to guide learning.
  • Synthetic data: Data that is artificially generated and used as a proxy for real data in a wide variety of use cases including data anonymization, AI and machine learning development, data sharing and data monetization.
  • Tokens: A unit of content corresponding to a subset of a word. Tokens are processed and identified internally by LLMs and can also be used as metrics for usage and billing.
  • Transformer model: A deep learning model that adopts the self-attention mechanism, differentially weighting the significance of each part of the input data.
  • Transfer learning: A technique in which a pretrained model is used as a starting point for a new ML task.
  • Vertical AI model: A model that has been optimized and customized for the needs of specific industries.
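To make the "embedding" and "embedding model" entries above concrete, here is a minimal similarity comparison over toy vectors. The four-dimensional values are invented for illustration; real embedding models emit hundreds or thousands of dimensions:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors (closer to 1.0 = more similar)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 4-dimensional embeddings; semantically related words should
# end up pointing in similar directions in the vector space.
king  = [0.9, 0.8, 0.1, 0.2]
queen = [0.85, 0.75, 0.15, 0.25]
apple = [0.1, 0.2, 0.9, 0.8]

# "king" is far more similar to "queen" than to "apple"
assert cosine_similarity(king, queen) > cosine_similarity(king, apple)
```

This same comparison is what powers similarity search, clustering and classification over embeddings, including the retrieval step in a RAG architecture.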

2.2 Content and Prompts

  • Completions: The output from a generative prompt.
  • Content: Individual containers of information (that is, documents) that can be combined to form training data or be generated by GenAI.
  • Corpora: The information or training data used to train an AI. An LLM like GPT uses internet content as its corpora.
  • Specialized corpora: A focused collection of information or training data used to train an AI. Specialized corpora focuses on an industry — for example, banking or health — or on a specific business or use case, such as legal documents.
  • Grounding: The ability of generative applications to map the factual information contained in a generative output or completion. It links generative applications to available factual sources — for example, documents or knowledge bases — as a direct citation, or it searches for new links.
  • Prompt: A phrase or individual keywords used as input for GenAI.
  • Temperature: A parameter that controls the degree of randomness or unpredictability of the LLM output. A higher value means greater deviation from the input; a lower value means the output is more deterministic.
  • Training data: The collection of data used to train an AI model.
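The "temperature" entry above can be illustrated with a short sketch of how a model's raw scores (logits) are rescaled before sampling; the logit values here are made up for the example:

```python
import math

def apply_temperature(logits, temperature):
    """Rescale logits by temperature, then normalize to probabilities.

    Lower temperature sharpens the distribution (more deterministic output);
    higher temperature flattens it (more varied, more random output).
    """
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]           # hypothetical scores for three candidate tokens
cold = apply_temperature(logits, 0.2)  # top choice dominates almost completely
hot  = apply_temperature(logits, 2.0)  # choices end up much closer together
```

With `temperature=0.2` the first token takes nearly all the probability mass, while `temperature=2.0` leaves the three options much more evenly matched.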

2.3 Processing and Engineering

  • Application orchestration framework: An abstraction layer to enable prompt chaining, model chaining, interfacing with external APIs, retrieving contextual data from data sources and maintaining statefulness (or memory) across various model requests.
  • Chain of thought: A prompt chaining method where each prompt in a series of successive prompts builds on the prior prompt’s results to build to a final result.
  • Fine-tuning: Improving an existing, pretrained model through additional training with new, context- or task-specific data.
  • Graph of thought: A prompt chaining method that models the information produced by an LLM as a graph where each vertex represents a unit of information (LLM thoughts), allowing for more flexible and efficient reasoning. One of the standout features of GoT is its extensibility, allowing it to adapt to a variety of tasks and domains.
  • GraphRAG: A technique to improve the accuracy, reliability and explainability of retrieval augmented generation (RAG) systems that uses knowledge graphs to supplement vector-based or other retrieval methods.
  • Knowledge graphs: Machine-readable data structures representing knowledge of the physical and digital worlds and their relationships. Knowledge graphs adhere to the graph model — a network of nodes and links.
  • Pretraining: The first step in training a foundation model, usually done as an unsupervised learning phase. Once foundation models are pretrained, they have a general capability. However, foundation models need to be improved through fine-tuning to gain greater accuracy.
  • Prompt chaining: An approach that uses multiple prompts to refine a request made by a model.
  • Prompt engineering: The craft of designing and optimizing user requests to an LLM or LLM-based chatbot to get the most effective result, often achieved through significant experimentation. It includes grounding or retrieval of augmented generation, model instructions, automatic prompt generation, prompt chaining, variable insertion and many other elements.
  • Retrieval augmented generation (RAG): Design pattern that uses search functionality to retrieve relevant data and add it to the prompt of a generative AI model in order to ground the generative output with factual and new information.
  • Tree of thought: A prompt chaining method where an ordered series of prompts is provided in one step to create a final result.
  • Tunable: An AI model that can be easily configured for specific requirements. For example, by industry such as healthcare, oil and gas, departmental accounting or human resources.
  • Vector databases: A type of database used in LLMs to store embeddings, which are representations of words as high-dimensional vectors that can efficiently search and retrieve related concepts.
  • Windowing: A method that uses a portion of a document as metacontext or metacontent.
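Several of the entries above (RAG, embeddings, vector databases, grounding) fit together in one pipeline. The sketch below is a toy end-to-end RAG flow: `embed()` is a deliberately crude stand-in for an embedding model, the list-based index stands in for a vector database, and the final grounded prompt would in practice be sent to an LLM:

```python
import math

def embed(text):
    """Toy embedding: a bag-of-characters vector. A real embedding model goes here."""
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a)) or 1.0
    nb = math.sqrt(sum(x * x for x in b)) or 1.0
    return dot / (na * nb)

documents = [
    "Store hours are 9am to 9pm on weekdays.",
    "Returns are accepted within 30 days with a receipt.",
]
index = [(doc, embed(doc)) for doc in documents]  # the "vector store"

def retrieve(query, k=1):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(q, pair[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

def build_grounded_prompt(query):
    """Retrieve context and splice it into the prompt (the 'augmented generation' step)."""
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

prompt = build_grounded_prompt("What is the return policy?")
```

The retrieved document grounds the model's answer in factual source material instead of relying on what the model memorized during training.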


3. Generative AI Technologies/Vendor Segments

The GenAI market is composed of the following segments, with vendors often present in multiple segments:

3.1 Infrastructure Providers: These vendors offer the necessary hardware and cloud services to support GenAI needs, including computing power, storage, and networking. They also provide software for managing, optimizing, and scaling infrastructure for GenAI development and deployment.

3.2 Model Providers: This group offers access to foundational models like large language models (LLMs) and other generative algorithms such as GANs and evolutionary algorithms. Developers can integrate these models into their applications or use them as a starting point for creating custom models tailored to specific needs.

3.3 AI Engineering Services: This segment encompasses both established companies and startups that specialize in managing the entire lifecycle of AI models. They provide services tailored to developing, refining, and deploying generative models like LLMs and other GenAI artifacts for production use.

3.4 GenAI Applications: These applications leverage GenAI capabilities to enhance user experiences and streamline tasks. They can generate and modify text, code, images, and other multimodal outputs, enriching user interactions. Additionally, emerging GenAI agents can be prompted by users to automate workflows and accelerate task completion across various domains.


Overview


4. Use Cases (Mostly Retail-Specific)

  • Enhanced Search and Upselling
  • Social Media Customer Sentiment
  • Supply Chain Optimization
  • Conversational Chat Interface
  • Associate Hiring, Onboarding
  • Automated Text Creation
  • Social Commerce
  • Personalization for Customers
  • Best-Fit Apparel Technology
  • Customer-Centric Merchandising
  • Product Development, Selection
  • Automated Image Creation
  • Skills Management for Associates
  • Customer Behavior Modeling
  • Customer Order Substitution
  • Customer Subscription Services
  • Co-creation of Products
  • Ad Creation from Product Descriptions
  • Analyzing Results
  • Answering Questions from a Corpus (see "corpora" in the glossary)
  • Sentiment Analysis

Ask the following questions when picking the right use case:

  1. Increased Revenue: How can GenAI boost our sales and attract more customers?
  2. Increased Efficiency: How can GenAI streamline our processes and save time?
  3. Managed Risk: How can GenAI help identify and mitigate potential risks?
  4. Nonfinancial Value: How can GenAI improve the experience for our customers and employees?
  5. Technical Feasibility: Is our current infrastructure capable of supporting GenAI?
  6. Internal Readiness: Are our teams prepared to integrate GenAI into our workflows?
  7. External Readiness: Are our partners and stakeholders ready to embrace GenAI?
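One lightweight way to act on these questions is a weighted scorecard. The weights and 1-5 ratings below are purely illustrative assumptions; replace them with your own workshop assessments:

```python
# Illustrative use-case prioritization. The weights map roughly onto the
# questions above (value, feasibility, readiness); all numbers are made up.
WEIGHTS = {"business_value": 0.4, "feasibility": 0.3, "readiness": 0.3}

use_cases = {
    "Enhanced search and upselling": {"business_value": 5, "feasibility": 4, "readiness": 3},
    "Automated text creation":       {"business_value": 3, "feasibility": 5, "readiness": 4},
    "Supply chain optimization":     {"business_value": 4, "feasibility": 2, "readiness": 2},
}

def score(ratings):
    """Weighted sum of the 1-5 ratings for one use case."""
    return sum(WEIGHTS[k] * v for k, v in ratings.items())

ranked = sorted(use_cases, key=lambda name: score(use_cases[name]), reverse=True)
shortlist = ranked[:2]  # focus the pilot on no more than a few use cases
```

The scorecard does not replace judgment, but it forces the team to state its value and feasibility assumptions explicitly before the pilot begins.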


5. Large Language Models / GenAI Reference Architecture

These LLM design patterns give immediate guidance for executing on short-term opportunities and provide a target state for planning ahead for more complex implementations.

While the system architecture is still evolving, I have tried to create a data flow and high-level architecture based on hands-on experience.


GenAI System Architecture


6. Multiple Vendors and LLM Models: Public and Open Source

Evaluating and choosing generative AI models and vendors is not only about picking the top performers on benchmarks such as Holistic Evaluation of Language Models (HELM), Chatbot Arena (https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard) or AlpacaEval. Data and analytics leaders need to look at these models through different lenses and select the ones suitable for their enterprises.


Vendors and LLMs


These open-source models have been created by a variety of entities, including:

  • Academic institutions (such as Stanford University)
  • Large technology companies (such as Meta)
  • Early-stage and late-stage startups (such as Stability AI, MosaicML, Databricks and Together AI)
  • Nonprofit research labs (such as EleutherAI)


Below is a list of some popular open-source generative AI models that are licensed for commercial use.

Popular open-source generative AI models


7. Navigating the Initial Stages of Generative AI Implementation, POC Approach

The most successful pilots focus on demonstrating business potential, not on technical feasibility. Organizations tend to run technical pilots that simply demonstrate that it is possible to build something with generative AI, leading to only incremental improvements and ignoring the transformative potential of this technology.

As an IT leader focused on leveraging generative AI to create business value, you should:

  • Run a workshop to generate use-case ideas with the business, focusing on the disruptive potential of generative AI and the way in which it can enable strategic objectives.
  • Prioritize the use cases for your pilot against their potential business value and feasibility. Focus on no more than a few use cases for your generative AI pilot.
  • Assemble a small but diverse team, including business partners, software developers and AI experts. Dedicate this fusion team for the duration of the pilot.
  • Create a minimum viable product to validate each use case. Identify the target business key performance indicator (KPI) improvement hypothesis, and define the deployment approaches and risk mitigations required to quickly test this hypothesis.
  • Deliver the minimum functionality required to test the use cases, and refine your assumptions on the cost and value of scaling them. Decide whether to stop, refine or scale each use case. Build upon initial successes to expand the generative AI pilot.

GenAI Pilot Cycle

Thanks for your time and for browsing through the content. Share your side of the story: how have you approached the GenAI journey?



Sources: Gartner reports, ChatGPT (for proofreading), HBR, Microsoft, multiple other webinars and a few close techie friends.
