Generative AI - Strategic Considerations
Picsart AI Image Generator

Generative AI - Strategic Considerations

Generative AI?has been in the press recently after the launch of Open AI ChatGPT, and there’s been a lot of “buzz” from an audience that isn’t usually paying attention to artificial intelligence or NLP. Anyone who works in technology has likely had to explain this to both co-workers, friends, and family. The emergent performance of GPT-4 with close to 1 trillion parameters is impressive, and a lot of people are only now starting to understand and think about applications of this technology.

In this piece, I’ve tried to put some sanity into understanding the landscape.??I’ve been tracking AI/ML for years, but as I’ve had to explain this to several different audiences, I’ve stumbled upon few good article that are useful to share.??

Introduction

The generative AI that has captivated a general audience is?based on a?class of models termed as foundation models. A foundation model is any model trained on broad data that can be adapted to a wide range of retail tasks. Two recent advances have played a critical part in foundational models going mainstream:?transformers?and Large Language Models(survey), current examples include Google’s?BERT, Open AI’s text generation?(GPT), image generation?CLIP?& speech generation?Whisper, and Meta’s?LLaMA.?

These foundational models have given rise to generative AI interfaces like ChatGPT , Bard and Dall-E.?ChatGPT is an AI-powered chatbot application built on OpenAI's?GPT-4?implementation, Google Bard is built on Google's Language Model for Dialogue Applications?(LaMDA)?technology while Open AI Dall-E is an example of a multimodal AI application that identifies connections across multiple media, such as vision, text, and audio.?These interfaces are built using a technique called?Prompt Engineering.?

The foundational models are usually trained on large corpus – Wikipedia, Forums, Books – and are mostly factual. While they provide scale and work well for general interactions they are constrained as they lack recency – temporal understanding, website traffic, customer shopping behavior and transactional data. Despite the impending widespread deployment of foundation models, the industry currently lacks a clear understanding of how they work, when they fail (hallucinate), how they to accentuate harms, and what they are even capable of due to their?emergent?properties - behavior that is implicitly induced, rather than explicitly constructed.

It’s no exaggeration to state that most organizations are still coming to terms with what is possible.??While the general public appreciates this as a chatbot, there work underway to embed LLM with tools to construct more autonomous agents.??In other words, generative AI isn’t just about asking Lensa to make curious looking headshot and asking Bing interesting questions.?Generative AI is about to become a much larger part of almost every industry in ways that are difficult to predict.

In other words, Generative AI is the “big deal” you’ve read about in the New York Times and Wall Street Journal, and because of this it’s time to start thinking about some the strategic considerations that will accompany this change.

Strategic Considerations?

While the technology is “cool” there are some obvious considerations and far fetching implications:?

Cost?- Training large language models are prohibitively expensive, ranges $5-20 M for the newest (and largest) models and time consuming - e.g., Llama took 1M GPU hours to train.????And this is just to the models that the public knows about as some of the more recent models such as GTP-4 don’t openly publicize the number of parameters it is speculated is close to 1 Trillion (source).?From a cost perspective there will be a cost to train these models, and the companies that offer these models will create a business model to fund this services.??Some companies have already started, but we can expect that widely accessed LLM models will call for a new approach to billing and auditing usage.?The Cost side of Generative AI will get interesting very quickly as companies start to use this for many applications.

Digital Values?– The foundational models are closed-loop, do not provide context to track progress, understand, or document their capabilities and biases – things like fairness and bias, efficiency, or even environmental impact. Furthermore, they do not solve for Interpretability especially for the unexpected emergent properties they acquire.?Companies that make these services available are already having to invest in creating guardrails to ensure that a Generative AI service avoids situation that result in bias or violating fundamental values.?

Security and privacy?- Security and privacy for foundation models is uncharted at present. The Prompt injection, e.g., hateful speech targeted at a specific individual or company into a few outbound pages from Reddit, and security vulnerabilities, e.g., adversarial triggers to generate undesirable outputs, being top two challenges. Privacy risks are related to memorization of training data and regulatory compliance e.g., GDPR compliance relating to the right to be forgotten – Chat GPT recently got banned in Italy, rumors of Germany following soon.?

Legal?Violating copyrights when it stitches together bits of text to create a response. That is a question the legal system has yet to rule on. The US Copyright Office has issued guidance saying that the output of an AI system is not copyrightable unless the result includes significant human authorship, but it does not say that such works (or the creation of the models themselves) cannot violate other’s copyrights.?Foundation models rest on shaky legal footings at present, on how the law bears on both the development and use of these models is unclear.??

Authenticity – As these models start to generate content that is both convincing and realistic both in textual and visual form there will be a conversation about how these tools can be controlled.??In the last week alone, there was an instance of someone publishing?AI-generated music?on Spotify that was an accurate recreation of a well-known recording artist.??In another instance a photographer won a competition with an AI-generated photograph.?From Art to Music to Sports to Politics, the question of authenticity and authorship will become critical.

Technology Landscape?

Cloud Providers

The key Cloud providers, AWS, Azure and GCP, are investing billions of dollars on these services and integrating them with their existing AI offerings. As these providers build out new services, they are integrating them into existing services that support existing AI/ML use cases.

AWS provides serverless API service called?Amazon Bedrock?where you can choose from a wide range of foundational models, while Sagemaker being the entire ML Ops experience to build and deploy models. AWS is also has been investing in ML acceleration with Trainium and?Inferentia?chips as?high-performance, low-cost solutions to NVIDA GPUs.

Azure OpenAI Service provides advanced language AI with OpenAI GPT-4, GPT-3, Codex, and DALL-E models that are integrated with Azure cloud - private networking, regional availability, and responsible AI content filtering.?Azure has also?integrated this in its?cognitive services?portfolio to power application experiences like ChatGPT on enterprise data. There is?Azure Open AI Studio?to build?and deploy?models.??

GCP has a similar strategy. It has launched Vertex AI?PaLM API?(background) to test, customize, and deploy instances of Google’s large language models (LLM). It is also integrating the foundational models with enterprise search and conversational UI to help create AI interfaces.? Similar to Azure, it has launched Walled Garden and Generation AI Studio, a managed environment in Vertex AI where developers and data scientists can interact with, tune, and deploy foundation models.??

Open Source

Meanwhile there are?open-source?efforts too as enterprises’ demand for controlling the model and using it for targeted or specific use cases in contrast to close loop trained models. Databricks released an open source-based iteration of its large language model (LLM), dubbed Dolly 2.0?which was inspired by Alpaca, another open-source LLM released by Stanford in mid-March. Alpaca, in turn, used the weights from Meta’s LLaMA model that was released in late February. LLaMA was immediately hailed for its superior performance over models such as GPT–3, despite having 10 times fewer parameters.?Enterprises will be looking at building?adaptive?models with the goal to build the smallest model with high accuracy and performance to optimize total cost of ownership (TCO).

These adaptations fine tune the prompt-based methods to achieve favorable accuracy-efficiency tradeoffs and remove deficiencies of stand-alone foundation models.?Additionally, there is also a lot of work happening in the space of?Vector Databases?to?store the vector embeddings that result from the training of the LLM . A run time query search then finds the best match between the user’s prompt and the vector embedding.

Look Ahead

My Predictions as this space evolves.

  1. Prompt engineering as a field will evolve rapidly with?well-defined patterns.??In the last few weeks there have already been new job descriptions created for Prompt Engineers.??Whether that job title persists or not is unknown, but I predict that there will be a series of new roles defined by this technology.
  2. While this technology is?exciting, is it also very nascent. And while several providers have already established token-based approached Cost it isn’t well understood.?As we watch Generative AI develop companies will be trying to establish business around training, querying LLMS, and creating an ecosystem of new services designed to support this area.
  3. Local and national governments will preemptively create privacy & regulatory guidelines to govern generative AI.???While there’s a tremendous amount of promise here, there are also some risks.?These will need to be managed, but it’s an open question as to whether this manage will come before or after a crisis occurs in an industry that is trigger by generative AI.
  4. Large Language Models would be?fine-tuned?for domain specific use cases and would get cheaper to train and deploy.?We’re a moment in time in technology where a LLM achieved a result few predicted this soon, and as companies and organization work to understand possibilities, they will start to create more focus LLMs with similarly emergent properties.

?Contributions and review Tim O'Brien

Anil G.

Senior Distinguished I, Architect @ Walmart Global Tech | Sr. Director Architecture

1 年

Great article and thoughts on the GenAI Anil Madan! In my opinion, besides the need for huge GPU/VRAMs to support LLMs around 62 billion parameters, the enterprises will be faced with several other challenges-- includes managing data dependencies and Single Sources of Truth (SSOTs), ensuring the availability of resources, and addressing data privacy and security concerns etc.

回复
Nish Parikh

Engineering Leader, Google Shopping Personalized Recommendations

1 年

Anil Madan excellent article , summary and insights!

回复

Excellent article and compelling reading, so much to consider on the LLM journey from the cost points raised to the intriguing challenges around guardrails. We always talked about the mind-shift required from IT Teams when moving from on-premise to the cloud, this LLM shift for the business and IT teams is such a step beyond that.

回复
Himanshu Kapoor

Strategic Advisor & Technology Executive - Cloud | AI | Data | Customer Experience | Ex-Google, Cisco, IBM

1 年

Outstanding summary Anil of the key trends in this fast moving space. Comprehensive yet easy to understand. Thank you for sharing!

回复
Raghu Kaimal

HR Technology | HCM | Employee Experience Tech | People Analytics | Workforce Analytics | Future of HR | Future of Work

1 年

Anil Madan , Loved the perspective out here.

回复

要查看或添加评论,请登录

社区洞察

其他会员也浏览了