The Trouble with Generative AI: Pt.1

Level: Foundational | Reading time: 9 mins

In case you missed it, OpenAI’s ChatGPT generative AI (GenAI) tool was unleashed on the world on 30th November 2022. It achieved an unprecedented rate of technology adoption (accompanied by unprecedented hype).

Exact figures around GenAI investment are blurred but substantial. According to PitchBook, in 2023 investors pumped $29.1 billion into GenAI deals. US private equity giant Blackstone has recently announced a £10 billion AI data centre to be built in Northumberland. What is unusual about GenAI, though, is the scale of direct investment from Big Tech: Google and Amazon with $2.5 billion and $4 billion respectively in Anthropic, Microsoft with over $13 billion in OpenAI, and Salesforce's $4 billion investment in London's first AI centre.

For anyone wanting to create their own GenAI or Large Language Model (LLM), the barriers to entry are high. Models are trained on huge datasets for extended periods of time, which consumes a huge amount of compute processing power. For an appreciation of scale, OpenAI's current ChatGPT-4o model was trained for 90-100 days on 14,400 GPUs (graphics processors well suited to AI workloads) at a cost of $61 million (a conservative estimate). According to HSBC research, the cost of training ChatGPT-5 could range from $1.7 to $2.5 billion. And that's before factoring in the cost of employing a team of data scientists to create the model itself. Lastly, the huge datasets needed for LLM training must be procured properly to avoid high-profile litigation. According to recent estimates, OpenAI's annual revenues are between $3.5 billion and $4.5 billion, offset by $7 billion spent on model training and $1.5 billion in staffing costs.

The benefits of GenAI were initially touted as efficiency gains for individuals - a reduction in manual toil to achieve faster results. As a productivity tool sitting on workers' desktops, Microsoft has vigorously exploited its Office365 market dominance with its integrated Copilot solution. Earlier this year, OpenAI signed its biggest customer deal (75,000 'seats') with PwC for its ChatGPT Enterprise product. Given its impressive ability to give convincing answers on seemingly any subject or question posed, some have dubbed GenAI a solution looking for a problem.

Quanto Costa?

For all the hype, investment and promise, the reality is that GenAI is in its infancy compared to, say, Machine Learning (around since 2011). As with any emerging technology subject to market forces, there follows a natural push towards commoditization. There are multiple GenAI LLM builders other than OpenAI, including Meta (Llama 3), Microsoft (Phi), Google (Gemini), Anthropic (Claude), Tencent, co:here and Mistral. As well as these so-called foundational or general purpose models, a range of cloud platforms allow you to host your own LLM. Such platforms even allow multiple LLMs to be used in parallel.

As well as free (often older) LLM versions available to individual users, Hugging Face provides no-cost (open-source) models. In the UK, parts of the insurance and Fintech sectors have been keen to explore their license-free potential. For individual users or organisations wanting access to the latest LLMs (which generally have a higher performance benchmark), here are the cost basics.

If you've ever played with generative AI, you've probably noticed that the LLM can follow a conversation, seemingly having a short-term memory. This is made possible by the LLM's context window: the amount of text (measured in tokens) it can take as input. Tokens are units of text, usually smaller than words, with 750 words equating to around 1,000 tokens. ChatGPT-4 has a context window of 8,192 tokens (approx. 6,100 words). Context windows in newer LLMs are becoming much longer (Gemini 1.5 Flash has a 1 million token window), useful for summarising large documents as well as holding much longer conversations.
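To make the words-to-tokens arithmetic concrete, here is a minimal sketch using the ~750 words to ~1,000 tokens rule of thumb quoted above. The function names and the 8,192-token default are illustrative; real billing-grade counts need the model's actual tokenizer (e.g. OpenAI's tiktoken), not this approximation.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: ~4/3 tokens per word (750 words ~ 1,000 tokens)."""
    words = len(text.split())
    return round(words * 1000 / 750)

def fits_context_window(text: str, window_tokens: int = 8192) -> bool:
    """Check whether text roughly fits a context window.

    8,192 tokens is the ChatGPT-4 figure mentioned in the article.
    """
    return estimate_tokens(text) <= window_tokens

sample = "word " * 750          # a 750-word document
print(estimate_tokens(sample))  # ~1000 tokens
print(fits_context_window(sample))
```

A 7,000-word report would estimate at roughly 9,300 tokens, so it would not fit an 8,192-token window without chunking or a longer-context model.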

Token-based pricing (on both input and output) is a common charging model. ChatGPT-4 costs $30 per 1 million input tokens and $60 per 1 million output tokens. There are the usual subscription tiers, as well as enterprise commit pricing (think units of GenAI credits). Pricing is admittedly a dry topic, so let's just say there's every potential to unwittingly rack up a substantial GenAI bill, especially if usage is automated. Larger, higher-performance models with more parameters may cost more per token, but won't necessarily yield better results. A smaller LLM might be just as effective, as well as cheaper, for certain tasks.
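A back-of-envelope calculator makes the "automated usage adds up" point concrete. The rates below are the ChatGPT-4 figures quoted above ($30 per 1M input tokens, $60 per 1M output tokens); LLM pricing changes frequently, so treat the numbers as illustrative only.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 in_rate: float = 30.0, out_rate: float = 60.0) -> float:
    """Cost in USD for one request; rates are per 1 million tokens."""
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# A single chat turn: 2,000 tokens in, 500 tokens out.
one_call = request_cost(2000, 500)
print(f"${one_call:.4f}")               # $0.0900 per call

# An automated pipeline making 100,000 such calls per month:
print(f"${one_call * 100_000:,.2f}")    # $9,000.00 per month
```

Nine cents per call looks trivial; at pipeline scale it is a five-figure monthly line item, which is why automated LLM usage needs cost monitoring from day one.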

Any disruptive technology that asks people to change their habits is going to be messy and unpredictable. Our newsfeeds have been inundated with AI-themed stories that use AI terminology interchangeably - no wonder, then, that we've probably all got a different take. Do you see AI as more of a risk or an opportunity?

Let’s debunk some AI terms and clarify what GenAI is and isn’t.

AI, Machine Learning and Generative AI

Before GenAI hit the public consciousness, the term AI mostly referred to Machine Learning. Machine Learning (ML), in simplest terms, is a model (or algorithm) that can make a prediction or inference after being trained on large amounts of historic data. Unsupervised ML uses unlabelled data (without any guidance) to discover hidden patterns and clusters. A common use case for unsupervised ML would be market segmentation, grouping customers with similar traits to create target personas for campaigns. Supervised ML uses mostly human-labelled data (i.e. expected outputs, or what good looks like) to predict outcomes.
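The market-segmentation idea can be sketched with a tiny unsupervised clustering routine (1-D k-means). The spend figures, segment count and starting centroids are made up for illustration; a real segmentation would use many features and a library such as scikit-learn.

```python
from statistics import mean

def kmeans_1d(values, centroids, iterations=10):
    """Cluster scalar values around k centroids; returns final centroids.

    Each pass assigns every value to its nearest centroid, then moves
    each centroid to the mean of its assigned values.
    """
    for _ in range(iterations):
        clusters = {c: [] for c in centroids}
        for v in values:
            nearest = min(centroids, key=lambda c: abs(c - v))
            clusters[nearest].append(v)
        centroids = [mean(vs) if vs else c for c, vs in clusters.items()]
    return sorted(centroids)

# Monthly spend (GBP) for 8 hypothetical customers: two obvious segments
# (low spenders vs high spenders) that the algorithm discovers unaided.
spend = [20, 25, 30, 35, 400, 420, 450, 480]
print(kmeans_1d(spend, centroids=[0, 500]))  # -> [27.5, 437.5]
```

No labels were provided: the two personas ("budget" around £27.50/month, "premium" around £437.50/month) emerge purely from the structure of the data, which is exactly what makes this unsupervised.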

A common use case for supervised ML is fraud detection, whereby a bank blocks a transaction deemed suspicious or unusual. It might be the transaction's size, its geographic location, the retailer involved or the purchased item itself (or indeed a combination of these parameters) that looks unusual. By training a model on a customer's previous banking data, inference can be made about a new transaction's normality and legitimacy. Fraudulent transactions can be flagged and blocked according to whatever thresholds and actions your bank sees fit. ML use cases tend to be specific, according to whatever large datasets or big data the models have been trained on. ML is also commonly applied to images, an established medical use case being diagnostic recognition (including cancer) in X-ray, MRI and CT scans.
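A toy version of the fraud-detection idea above: score a new transaction against a customer's history and flag it if it is far off-pattern. The transaction amounts and the 3-sigma threshold are invented for illustration; real systems train supervised models on labelled fraud data across many features (amount, location, retailer, item), not a single-feature z-score check like this sketch.

```python
from statistics import mean, stdev

def is_suspicious(history, amount, threshold=3.0):
    """Flag a transaction whose amount is more than `threshold` standard
    deviations away from the customer's historical mean spend."""
    mu, sigma = mean(history), stdev(history)
    z = abs(amount - mu) / sigma
    return z > threshold

# A customer's recent card transactions (GBP).
past = [12.50, 8.99, 23.40, 15.00, 9.75, 18.20, 11.30, 14.60]

print(is_suspicious(past, 16.00))   # typical spend  -> False
print(is_suspicious(past, 950.00))  # way off-pattern -> True
```

The threshold is the business decision the article mentions: lower it and more legitimate transactions get blocked (false positives); raise it and more fraud slips through.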

ML Business Implication: An established technique which almost always requires training on well-organised, consistent and high-quality big datasets to produce useful results.

A GenAI LLM, in simplest terms, predicts the next likeliest word in a sequence, within a given context. An LLM like ChatGPT can be used out of the box with a web browser prompt, or programmatically (via API) by developers.

LLMs can be thought of as much more generalist than ML models. Their linguistic ability in natural language processing (NLP) comes from being trained on a vast variety of texts. A model's scale is referred to as its size: large models have billions of parameters (compared to the four in our simplified fraud ML example).
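Next-word prediction can be shown in miniature with a bigram model: given a word, predict the most frequent follower seen in training text. LLMs do the same thing in principle - predict the likeliest next token - but over billions of parameters and sub-word tokens rather than a lookup table, which is where their generalist ability comes from. The corpus below is a toy.

```python
from collections import defaultdict, Counter

def train_bigrams(text):
    """Count, for each word, how often each other word follows it."""
    follows = defaultdict(Counter)
    words = text.lower().split()
    for a, b in zip(words, words[1:]):
        follows[a][b] += 1
    return follows

def predict_next(model, word):
    """Return the likeliest next word, or None if the word was never seen."""
    counts = model.get(word.lower())
    return counts.most_common(1)[0][0] if counts else None

corpus = "the cat sat on the mat . the cat ate . the dog sat on the rug"
model = train_bigrams(corpus)

print(predict_next(model, "the"))  # 'cat' (seen twice, vs mat/dog/rug once each)
print(predict_next(model, "sat"))  # 'on'
```

This also illustrates the knowledge-cutoff problem in microcosm: the model can only ever predict continuations it saw during training, and draws a blank (None) on anything outside that data.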

Foundational LLMs come pre-trained on next-token prediction and may be instruction-tuned with human feedback, but will likely struggle with specific predictive inference use cases such as our fraud detection example.

LLM Implication: Much less mature than ML, LLMs come pre-trained and are more generalist tools that cater for a broad range of use cases.

Am I hallucinating?

LLMs are capable of creating new language in a realistically human-like way, but can make mistakes if they lack context. They can also lose track of earlier steps in a conversation. A factually incorrect or nonsensical response from an LLM is referred to as a hallucination. This is particularly irksome because the LLM doesn't know the response is incorrect and will present it with full conviction!

Each LLM also has a cutoff date for when it was last trained on data. ChatGPT-4's documentation says that it's September 2021, meaning the model has no grounding in any world events after that time. Because of hallucinations, an ever-expanding knowledge gap (post training cutoff) and the possibility of losing its train of thought, a lack of trust and major ethical concerns have understandably built up around GenAI and LLMs. But given that only 7% of human communication is carried by words alone (55% body language and 38% tone of voice), don't we need to cut text-only LLMs some slack here?

Hallucination Example:

Why is context such a challenge?

This question is easy for most Brits, perhaps more of a stretch for [pick a country of your choice] and a challenge for an LLM. Why? Well because ‘bank’ could mean a geological formation (grassy bank), an aeronautical manoeuvre (a plane banking), as well as a financial institution. If no other hints or background were given, how does the LLM figure out the real intended context?

The answer is semantic search, a technique which uses Natural Language Processing (NLP). Traditional keyword (lexical) search looks for word matches but often struggles if people use alternative terminology or synonyms. Semantic search on the other hand, aims to understand both the intention and context of the original question and is capable of more consistent and effective results.

Semantic search works by converting (embedding) a chunk of text into a vector: a long list of numbers representing a position in multi-dimensional space. This numerical representation of text can be easily processed and, more importantly, compared by computers. By using a standard algorithm (aptly called nearest neighbour) to find similar vectors (and therefore semantically related text) and ranking the results, the likeliest context can be derived.
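The vector-similarity step can be sketched with the 'bank' example from above: embed each candidate meaning as a vector, then rank by cosine similarity to the query vector. The three-number "embeddings" here are hand-made toys whose dimensions I have invented (roughly: finance, geography, aviation); real embedding models produce hundreds or thousands of dimensions learned from text.

```python
import math

def cosine(a, b):
    """Cosine similarity: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy embeddings for three senses of "bank".
docs = {
    "river bank":    [0.1, 0.9, 0.0],
    "bank account":  [0.9, 0.1, 0.0],
    "plane banking": [0.0, 0.1, 0.9],
}

query = [0.8, 0.2, 0.0]  # embedding of "where can I deposit money?"

ranked = sorted(docs, key=lambda d: cosine(docs[d], query), reverse=True)
print(ranked[0])  # -> 'bank account'
```

Note there is no keyword overlap between "deposit money" and "bank account"; the match comes purely from the vectors pointing in similar directions, which is the advantage semantic search has over lexical search.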

To achieve the most realistic semantic search results, vectors and vector databases (which make retrieval fast) are combined with techniques such as knowledge graphs, which understand the relationships between real-world entities such as people, places and products.

LLM Summary

- Have the linguistic ability to predict words
- Consume a huge amount of data during a lengthy training period
- Training is costly in terms of compute and data science staffing
- Can be expensive to use (especially when used programmatically)
- Have a training cutoff date where their knowledge effectively ends
- Can hallucinate, giving nonsensical answers, and lose their train of thought

As with any emerging technology, measuring the ROI will inevitably come into focus. Ever-increasing GenAI adoption pressure from visionaries and ambitious sales trajectories has led many businesses to 'turn on' some form of GenAI as a productivity tool. But I believe the real question is: how does any organisation get real business value from GenAI?

In Part 2 we'll look at getting better results from LLMs: prompt engineering, fine-tuning, RAG, AI agents and agentic workflows. I've created simple visuals to help explain these topics, as well as spelling out the implications for business.

Thanks for stopping by, I appreciate your time! If you found this useful or have an alternative perspective, please share or comment. If you'd like to discuss AI strategy, please DM me.

Aydan Al-Saad

Entrepreneur | Creator | Interviewer. Helping people with personal branding, revenue & content strategy. Fractional CRO, GTM & Startup Advisor & Investor | Founder: Creator Leaders

5 months ago

Great overview!

Adam Tilbury-Eld

Co-Founder & Chief Growth Officer @ Ovyo - Managed Resources, Professional Services & Team Augmentation for OTT, Media & Broadcast industries | Rise WIB Alumni & Ally of the Year '23

5 months ago

Fantastic read Jonathan! I've been self-learning more and more around AI usage for business (and personal!) and I came across your article. I really appreciate the breakdown you've given around the rise of GenAI and its challenges, especially around the costs of developing LLMs and the complexity of token-based pricing - it's an area not very well understood. Looking forward to Part 2, especially the deep dive into prompt engineering and RAG. Thanks for sharing!

Eddie Forson

Helping businesses turn AI into their unfair advantage

5 months ago

Good overview of Gen AI with a lot of ground covered! As a software developer I find that the more I interact with LLMs and AI Agents, the more I view them not only as a productivity tool but a new type of computer. Natural language (with some prompt engineering dark arts) - and not programming languages - is the primary means of communicating with this semantic computer. This requires a complete paradigm shift for most knowledge work.

Milton Chikere Ezeh

Senior Software Engineer at Capgemini | Java | Spring AI | Generative Ai

5 months ago

Insightful article. I agree that while GenAI holds immense potential, it's still at a very early stage compared to more mature technologies like machine learning. As businesses continue to adopt it, the focus should indeed shift from hype to finding concrete, sustainable value.

Godwin Josh

Co-Founder of Altrosyn and DIrector at CDTECH | Inventor | Manufacturer

5 months ago

The framing of GenAI as solely a productivity tool feels reminiscent of early discussions around the internet: a novelty with vast potential, but not always clear-cut value. History shows us that true business value emerges when technology transcends its initial purpose. How can we ensure GenAI's integration goes beyond automation and delves into the realm of emergent capabilities, like novel product design or predictive market analysis?
