Charting GenAI's Course with a Tech-First, Multidisciplinary Approach
Philipp Masefield
Head Beyond Services @ AXA | Leadership in Business, People & Digital Transformations - hands-on & advisory | Insurance IT Executive (4yrs), Project Manager (PMP, 10+yrs) | Early-Stage Investor
TL;DR
The article, which chronicles the author's learning journey with generative AI over the past year since ChatGPT's viral launch, emphasizes the need to approach GenAI as a multifaceted phenomenon by applying the three lenses of Technology, Business, and Society. The article provides background on the decades-long development of AI, explaining how large language models represent a new "fourth wave". The piece outlines the differences between concepts like AI, machine learning, LLMs, and ChatGPT. It details the factors driving rapid advancements - algorithms, data, and compute power. The author emphasizes the need to conceptually grasp the technology when identifying business opportunities. Key technical concepts like models, prompts, and hyperparameters are introduced.[^1]
The next article in this series will cover the Business perspective.
Buckle up—the pace of change shows no signs of slowing since the launch of ChatGPT in late 2022 marked an inflection point for public understanding of artificial intelligence's potential. ChatGPT achieved viral user growth, becoming the fastest consumer app to reach 100 million users, in just two months. And a year in, it is still going strong, with some 100 million weekly active users.
I was not one of the first (few million) to get on board with ChatGPT; it was around year-end 2022 that this ‘viral thing’ caught my attention in earnest. Viewing the initial hype through Amara’s Law—whereby people tend to overestimate short-term effects and underestimate long-term impacts—I approached this new technology with cautious optimism. The understanding I had gained from dabbling in the business application of Data Analytics and Machine Learning (or what some now call “classic AI”) in 2018/2019 [^2] equipped me to recognize that ChatGPT was more than just a fun tool or a tech fad. I perceived ChatGPT as potentially transformative—a technology ripe with value-creation potential—prompting me to delve into it as an extended learning journey.
Why am I writing this article and sharing insights from this first-year journey with generative AI? I am convinced that articulating learning—not merely accumulating information—helps further solidify understanding, which is then refined through your reactions and discussions.
I have always considered myself a generalist, taking a multi-disciplinary approach to new topics and to connecting the dots. With this ‘bias’, I strongly advocate approaching GenAI as a multifaceted phenomenon and with a willingness for ongoing learning, especially for a topic evolving as rapidly as this one. To understand GenAI and its implications, I like to apply the three lenses of Technology, Business, and Society, or - to use Design Thinking terminology - Feasibility, Viability, and Desirability respectively.
Exploring these three lenses is an interdependent and iterative process, yet there is a logical sequence to navigate.
I am also structuring my writing about my learning journey according to these lenses, starting with Technology for the rest of this article.
Technology
Since my point of view is generally that of creating business value, I see the need for a conceptual understanding of the technology, rather than in-depth technical expertise. This is necessary to identify the opportunities that technological advances enable, to understand the challenges, to be realistic about the limitations, and — in a technology as rapidly evolving as GenAI — to appreciate the advances that might bring new opportunities and shift or overcome current limitations. Or, to put it another way: "While not everyone needs to know the technical details, they should understand what the technology does and what it can and cannot do".
What Is GenAI
Defining Large Language Models
NVIDIA, clearly one of the big winners in the AI race so far, defines Large Language Models (LLMs) as “deep learning algorithms that can recognize, summarize, translate, predict, and generate content using very large datasets”. Gartner’s glossary, in turn, explains LLMs as “a specialized type of artificial intelligence (AI) that has been trained on vast amounts of text to understand existing content and generate original content”. Both are perhaps too brief to be truly informative — AI itself might do better: asking Perplexity.ai (with this prompt) for a definition of LLMs yields a more illuminating one:
“Large Language Models (LLMs) are advanced artificial intelligence systems that have been trained on extensive text data to understand and generate human-like text. These models, which can have billions of parameters, are capable of remarkable generalization abilities, allowing them to provide plausible responses to a wide array of prompts, even in a zero-shot context where they haven't been specifically trained for the task at hand. They can be used in various applications, from answering questions and writing essays to aiding in planning problems. However, despite their impressive capabilities, LLMs also have limitations, such as the potential for generating inaccurate or misleading information, also known as "hallucination". Recent research has focused on improving these models, including methods for editing their behavior within specific domains, and techniques for controlling their outputs.”
GenAI, LLM, GPT - Not All The Same Thing
AI, GenAI, LLM, ChatGPT, GPT-4 are not all the same. Sometimes it just helps (at least me) to sketch things out:
Artificial Intelligence (AI) is the broadest category, even covering almost forgotten approaches such as expert systems. MIT’s Professor Malone provides an intuitive definition for AI as “machines acting in ways that seem intelligent”.
Machine Learning (ML) refers to computer programs that learn from experience rather than relying solely on explicit programming to produce solutions. The main types of machine learning are Supervised Learning (which uses labeled data to map inputs to outputs), Unsupervised Learning (which identifies patterns for uses like recommendation systems or customer segmentation), and Reinforcement Learning (which relies on reward feedback).
Deep Learning is based on neural network architectures that loosely imitate brain structures, with ‘deep’ denoting the many layers within these networks.
Generative AI (GenAI), enabled by Foundation Models that allow broad application beyond narrow machine learning tasks, encompasses multi-modal generation of new content, which can be sound, images, or language. Well-known products for generating images include Stable Diffusion, Midjourney, and DALL-E.
Large Language Models (LLMs) are a type of Generative AI specifically for Natural Language Processing tasks. The dominant architecture is the Transformer, introduced in Google’s groundbreaking "Attention is All You Need" paper. And then of course there is ChatGPT (as the product), which was the viral awakening of us all to the world of GenAI. ChatGPT is based on the GPT-3.5 model series, or on GPT-4 with the Plus subscription. Beyond OpenAI's GPT-x models, there are other proprietary LLMs like Anthropic's Claude (currently at Claude 2.1) or Google's Gemini series. There are also open source options such as Meta's LLaMA or Mistral's Mixtral - just to name two of the most prominent models.
Throughout this article series, I will (try to) be deliberate in my terminology—generative AI, large language models, and ChatGPT are not interchangeable concepts. As a disclaimer, over this past year my focus has been almost exclusively on LLMs, as I see the most potential and relevance in that specific area of GenAI both for my work and interests.
Generative AI Is Not An Overnight Success
Generative AI has recently captivated far more than just the tech community, surpassing many expectations with its rapid advancements. Yet it is important to understand that it represents not a sudden leap but the culmination of decades of development. The current capabilities of AI might even be seen as a somewhat predictable development, in particular considering the year-on-year 10X increase in available compute over nearly a decade.
The field of Artificial Intelligence really started back in the 1950s, with the introduction of the Turing Test, the coining of the term ‘artificial intelligence’, and the first (very primitive) machine learning from data with artificial neural networks. An MIT Sloan course on Artificial Intelligence delineates the subsequent development of AI in waves.
MIT’s late Professor Winston talked about a fourth wave that would go beyond the impressive machine learning advances of perception and recognition, and be about systems that would be more like us, with cognition and reasoning. Large Language Models (LLMs) could be seen as this fourth wave of AI.

Mustafa Suleyman outlines and explains this development in his book The Coming Wave: In the mid-2010s, AI's leap forward was fueled by supervised deep learning, which relies on models learning from labeled data; the accuracy of AI predictions is often contingent on the quality of these labels. Large language models, however, represent a paradigm shift by successfully training on unstructured, real-world text, rendering the vast corpus of internet text a valuable resource. The 2017 Google research paper "Attention is All You Need" proposed the Transformer network architecture with attention mechanisms that laid the foundation for the revolution in LLMs. Since then, Transformers have been the driving force behind the rapid advancements. The acronym "GPT" stands for "Generative Pre-Trained Transformer", and OpenAI has been at the forefront of this development, setting key milestones with their releases.
Within just a few years, the capabilities of Large Language Models have exploded. This is all the more remarkable considering that, as Suleyman notes, “it wasn’t long ago that processing natural language seemed too complex, too varied, too nuanced for modern AI.”
How Do LLMs Work?
There are countless explanations of how Large Language Models (LLMs) work, from the overly succinct “next word prediction” to highly technical and lengthy in-depth explanations. Here is an attempt to provide a useful summary explanation [^Cent],[^3]:
LLMs are a class of deep learning models that have been trained on vast datasets of textual data, allowing them to develop a statistical understanding of language.
At their core, LLMs convert text into numeric representations that capture semantic meaning. Text is split into "tokens", common groups of characters that frequently appear together, such as words, parts of words, or punctuation. Each token is then assigned a vector representation, with tokens of similar meaning placed close together in a high-dimensional "word embedding" space. This allows LLMs to discern linguistic relationships and nuances.
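To make tokenization tangible, here is a minimal sketch using OpenAI's open-source tiktoken library (my choice of library here is incidental; any tokenizer illustrates the same idea):

```python
# Minimal tokenization sketch using OpenAI's open-source `tiktoken`
# library (pip install tiktoken).
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # the encoding used by the GPT-3.5/GPT-4 family

tokens = enc.encode("Large language models predict text.")
print(tokens)                               # a list of integer token ids
print([enc.decode([t]) for t in tokens])    # the character groups behind each id

# In an LLM, each id then indexes into a learned embedding matrix of shape
# (vocab_size, d_model), yielding the vector representation described above.
```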
LLMs utilize a transformer architecture composed of encoder and decoder components. The encoder maps input text into the model's word vector space. Layers within the model then update these representations by exchanging contextually relevant information between tokens using an attention mechanism. This allows the model to construct meaning from the entire context rather than processing inputs sequentially.
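As a toy illustration (not any specific model's implementation), the core of the attention mechanism boils down to a short computation, softmax(QK^T / sqrt(d_k)) V, in which each token's updated representation becomes a weighted mix of all the others:

```python
# A toy, NumPy-only illustration of scaled dot-product attention; real
# models add learned projections, multiple heads, and many stacked layers.
import numpy as np

def attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # (seq, seq) token-to-token similarity
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax: attention weights
    return weights @ V                              # each output mixes all value vectors

seq_len, d_model = 4, 8                             # 4 tokens, 8-dimensional vectors
x = np.random.randn(seq_len, d_model)               # stand-in token embeddings
out = attention(x, x, x)                            # self-attention: Q, K, V all from x
print(out.shape)                                    # (4, 8): contextualized representations
```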
After processing the input, LLMs are able to generate new text autoregressively, predicting the most likely next token at each step based on the previous tokens. With sufficient data and compute power, this statistical language modeling allows LLMs to reach high levels of coherence and even display emergent abilities like reasoning and summarization.
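Conceptually, the generation loop can be sketched as follows; note that `model` here is a hypothetical stand-in, not a real API:

```python
# Conceptual sketch of autoregressive decoding; `model` is a hypothetical
# stand-in for any function mapping a token sequence to a NumPy array of
# next-token probabilities over the vocabulary.
import numpy as np

def generate(model, prompt_tokens, max_new_tokens=20, temperature=1.0):
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        probs = model(tokens)                 # P(next token | tokens so far)
        probs = probs ** (1.0 / temperature)  # temperature reshapes the distribution
        probs /= probs.sum()                  # renormalize to a valid distribution
        next_token = np.random.choice(len(probs), p=probs)  # sample the next token
        tokens.append(int(next_token))        # feed it back in: autoregression
    return tokens
```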
During training, LLMs are fed vast datasets of text and tasked with predicting masked words, learning associations between related concepts that may be distant within passages. Models are then fine-tuned on specialized datasets to optimize performance on specific tasks like translation or question answering.
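A toy version of that training objective, assuming a simple next-token prediction setup: the model is penalized by how little probability it assigned to the token that actually appeared.

```python
# Toy version of the training signal: cross-entropy (negative log-
# likelihood) of the token that actually came next in the training text.
import numpy as np

def next_token_loss(predicted_probs, true_token_id):
    return -np.log(predicted_probs[true_token_id])

probs = np.array([0.1, 0.7, 0.2])               # distribution over a 3-token vocabulary
print(next_token_loss(probs, true_token_id=1))  # ~0.36: confident and correct
print(next_token_loss(probs, true_token_id=0))  # ~2.30: wrong guess, high penalty
```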
Ongoing Advances
As I have pointed out in a previous post, the rapidly improving performance of LLMs has been and continues to be driven by a combination of three factors:
Algorithm or model sophistication is driven by talent, primarily in industry labs. There is also a counterintuitive reality around algorithms, as an MIT AI course reveals: as algorithms grow in size and complexity, their abilities expand. Deep learning algorithms, wielding tens of millions of parameters, defy expectations by becoming more proficient learners as they grow more complex. These sophisticated architectures enable a standardized method for processing varied data types by transforming inputs (whether words, images, or other types) into vector representations, simplifying the transformation of information across forms. This highlights the increasing versatility of machine learning models in deciphering and converting different data formats.[^Cent]
Data, which means harvesting vast amounts of internet data, enriched by specialized datasets. A development that has led to these vast amounts of available data is, as Mustafa Suleyman points out in The Coming Wave, that “software has eaten the world”, meaning there is now data on almost anything, which can serve to train and improve AI systems.
Compute, requiring access to the most advanced computational resources and the deep pockets to finance them, motivated by the promise of outsized economic returns. An important consideration here is that access to compute can become a limiting factor for ongoing innovation. This concern was raised some time ago, for example in the observation that even an “exceptionally endowed university like Stanford can’t afford” the access needed to make significant contributions to the research agenda.
OpenAI's 2020 paper reveals a power-law relationship between language model accuracy and scaling, with performance improving as model size, dataset size, and compute resources increase. This trend held true over seven orders of magnitude, suggesting that ‘the larger, the better’. And considering the rapid AI advancements demonstrated over the past year, we seem to be in the steep part of the S-curve, with "a couple to a few more years of the exponential phase left to run". What further technological breakthroughs might we anticipate at this rate?
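For the curious, a rough sketch of that power law; the constants below are the paper's reported fits for loss versus non-embedding parameter count and should be treated as indicative rather than exact:

```python
# Illustrative only: Kaplan et al. (2020) fit test loss vs. model size as
# L(N) = (N_c / N) ** alpha_N, with alpha_N ~ 0.076 and N_c ~ 8.8e13
# non-embedding parameters. Bigger models keep improving, but with
# diminishing, power-law returns.
def loss_from_size(n_params: float, n_c: float = 8.8e13, alpha_n: float = 0.076) -> float:
    return (n_c / n_params) ** alpha_n

for n in (1e8, 1e9, 1e10, 1e11):  # 100M to 100B parameters
    print(f"{n:.0e} params -> predicted loss {loss_from_size(n):.2f}")
```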
Even if AI advancements stagnate at the current state of the art, Ethan Mollick asserts that "there is a lot of juice left in GPT-4". This implies significant untapped business value even from today's available capabilities.
Key Concepts To Understand
There are a few technical concepts and terms that a business person needs to understand to work effectively with LLMs. Some of the terms I’ve encountered in my ongoing experimentation, and which make a difference for certain use cases, include models, prompts, and hyperparameters; a brief sketch of where they surface in practice follows below.
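As one concrete illustration of model, prompt, and hyperparameters coming together, here is a minimal sketch using the OpenAI Python client; the specific values are placeholders, not recommendations:

```python
# A hedged sketch using the OpenAI Python client (pip install openai);
# the model name and parameter values are placeholders for illustration.
from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

response = client.chat.completions.create(
    model="gpt-4",                     # the model: which LLM answers
    messages=[                         # the prompt: instructions plus user input
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Explain tokens in one sentence."},
    ],
    temperature=0.2,                   # hyperparameter: lower = more deterministic output
    max_tokens=100,                    # hyperparameter: cap on generated output length
)
print(response.choices[0].message.content)
```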
In my second article, I'll explore the business perspective, sharing my views on market dynamics, the impact of GenAI on knowledge work, and the significance of practical, hands-on experimentation with use cases to create value - building on the conceptual understanding covered in the current article. My third article will delve into societal considerations, shedding light on some broader implications.
Endnotes:
Throughout the writing process, I have utilized LLMs to varying degrees, though any significant contributions are explicitly noted.
[^0]: Image description suggested by Anthropic’s Claude 2.1 based on the article’s text: "A brain with lightbulbs flashing above it, symbolizing the statistical understanding of language that enables large language models to generate human-like text. Technical concepts like neural networks and computational power are subtly indicated around the brain to capture the technological essence."
[^1]: Article summarized by Anthropic’s Claude 2.1, integrating an additional aspect with Mistral’s Mixtral 8x7B, and with some final edits by me.
[^2]: While taking a course or two in Data Science, my curiosity shifted towards Machine Learning, particularly after completing @Andrew Ng's "AI for Everyone" course and Machine Learning Yearning. With this newly gained understanding, I became intrigued by the potential of leveraging advanced analytics and Machine Learning to address a significant business challenge. My hypothesis was that the migration of a legacy Life insurance book could be reframed as a data problem, solvable more efficiently through cutting-edge analytics and Machine Learning, as opposed to the traditional, hard-coded big-bang ETL approach. And yes, I managed to substantiate this hypothesis with a consulting partner through a successful Proof of Concept in 2020 (and just recently this massive migration has been completed).
[^3]: Based on my notes from a number of sources.
[^Cent]: Written in a ‘Centaur’ mode: I provide my notes, then my AI ‘Ghostwriter’ persona drafts these rough notes into a coherent text, and finally I do the quality control and minor edits myself. (More on this in my next article.)