The Black Box Behind the Curtain: A Simplified Deep-Dive Explaining Generative AI


Generative AI just won't stop buzzing, and everybody on LinkedIn keeps talking about it. However, it bothers me that so many posts sit on one side or the other of a spectrum. They either share use cases and prompts ("11 ways to hack your job with ChatGPT!") or talk about things from a deep technical lens ("Microsoft just updated this API to harness the graph and leverage adversarial models, check out the hyper-technical spec for more info!").

They all treat it as a given that the reader understands Generative AI, but it seems few truly understand what is happening behind that curtain. By understanding the components and inner workings of Generative AI, we can demystify its magic a bit and come at these articles and posts from a place of better understanding.


Building an AI-Terms Foundation

Before diving in on Generative AI, it's good to discuss some terms often mentioned when talking about the practical usage of data and analytics. Each of these concepts plays into many AI solutions, and as we go further the solutions get more complex (and expensive) to develop.

• Descriptive Analytics: This is like the "rearview mirror" of analytics. Descriptive analytics involves analyzing historical data to gain insights into what has happened in the past. In an insurance context, these can help you understand claims patterns or customer demographics. By leveraging descriptive analytics, insurers can make informed decisions based on historical trends and data-driven observations.

• Predictive Analytics: Looking ahead, predictive analytics enables people to anticipate future outcomes. By applying statistical models to historical data, predictive analytics identifies patterns and makes predictions about what might happen next. Building on those earlier insurance cases, these can help forecast claim frequencies or estimate policyholders' risk profiles. This enables insurers to proactively manage risks and make more accurate predictions.

• Machine Learning (ML): Machine learning takes us even further on the path of AI. ML algorithms empower systems to start with a model, then learn from data and improve that model's performance over time. In insurance, machine learning can be utilized for tasks such as fraud detection, automated underwriting, or customer segmentation. By continuously learning from patterns in the data, ML systems become more accurate and efficient over time.

Note: This is a minor aside, but I think it’s important to mention that AI may be overkill for some use cases, as a simpler, less expensive solution that organizes and exposes descriptive analytics may give people the data needed to make better decisions without advanced investments.


Pulling Back the Curtain of Generative AI

With that foundation, let's unveil the captivating world of Generative AI. Simply put, Generative AI combines the knowledge gained from descriptive and predictive analytics with the learning capabilities of machine learning to enable users to create new content of any kind. I'll talk specifically about Large Language Models (LLMs) here, but the same principles generally apply to image generation models as well. LLMs take a set of input words and predict the most likely next word, ultimately generating coherent, relevant sequences of words based on patterns in a massive dataset. So how does it work? That's both easy to explain and impossible to answer. Let's look at those earlier terms as a lens to explore LLMs a bit further.

Descriptive analytics might help you understand how often a given word occurs in the English language. It can also showcase occurrence frequencies of two-word pairs, three-word groups, or any other length of word sequence. For example, if you take all the written content on Wikipedia, you can run a report of how often the words "cat" or "in" occur, and how often the word "in" occurs directly after the word "cat" (e.g., a word pair). Theoretically, you can go even further and count, in that dataset, how often the phrase "the cat in the hat" occurs, or even the entire text of the Dr. Seuss classic.
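To make that concrete, here is a minimal sketch of that counting exercise in Python. The tiny corpus string is an invented stand-in for something like Wikipedia; the same counting logic would apply at any scale.

```python
from collections import Counter

def ngram_counts(text, n):
    """Count how often each n-word sequence appears in the text."""
    words = text.lower().split()
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))

corpus = "the cat in the hat sat on the mat the cat ran"
unigrams = ngram_counts(corpus, 1)
bigrams = ngram_counts(corpus, 2)

print(unigrams[("the",)])      # how often "the" occurs → 4
print(bigrams[("cat", "in")])  # how often "in" follows "cat" → 1
```

The same function handles three-word groups or longer sequences by changing `n`, which is exactly the "report" described above.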

Now, if you have enough descriptive data like that, you can begin to extrapolate it. Predictive analytics lets you use a formula or a model to do so. Remember that simple equation for a line you learned in high school: Y = mX + b? It lets you predict any point on the line (Y) given an input (X). In this case, imagine X is a string of 30 words, and Y tells you the most likely 31st word. With this, you can keep appending each predicted word and always predict the next best word given the prior 30 words.
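The idea above can be sketched by turning the frequency counts directly into a predictive model: given the last few words, return the word that most often followed them in the data. This toy uses a two-word context instead of 30, and an invented corpus, but the principle is the same.

```python
from collections import Counter, defaultdict

def build_model(text, context_size=2):
    """Map each context of `context_size` words to a counter of next words."""
    words = text.lower().split()
    model = defaultdict(Counter)
    for i in range(len(words) - context_size):
        context = tuple(words[i:i + context_size])
        model[context][words[i + context_size]] += 1
    return model

def predict_next(model, context):
    """Return the word most often seen after this context, if any."""
    candidates = model.get(tuple(w.lower() for w in context))
    return candidates.most_common(1)[0][0] if candidates else None

corpus = "the cat in the hat the cat in the box"
model = build_model(corpus)
print(predict_next(model, ("the", "cat")))  # → "in"
```

Feeding each prediction back in as the newest context word is what lets a model like this generate a whole running sequence, word by word.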

Cool, right?

The trouble is figuring out what the right model is (i.e., what the "m" and "b" are in that equation). It's tricky, because confidently predicting the likelihood of any 3 words appearing together in a row would require more text than has ever been written (and, to put it further into perspective, likely more than there are particles in our galaxy). The idea of doing it for sets of 30 consecutive words as an input, let alone 3 paragraphs of text, is mind-boggling. So how does Generative AI manage to work? That is where machine learning comes in.
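A bit of arithmetic shows why the counting approach collapses. Assuming a vocabulary of roughly 40,000 common English words (an assumed round figure for illustration), the number of possible word sequences explodes:

```python
# Rough illustration of the combinatorial explosion, assuming a
# 40,000-word vocabulary (an assumed figure for illustration only).
vocab_size = 40_000

for n in (2, 3, 30):
    sequences = vocab_size ** n
    print(f"{n}-word sequences: roughly 10^{len(str(sequences)) - 1}")
```

Three-word sequences alone give roughly 10^13 possibilities, and 30-word sequences give roughly 10^138, so no conceivable corpus could ever contain enough examples to count them all directly.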

ML allows us to build a model with the data that we do have “on the fly.” An ML program in this context:

  • lets us provide a set of words
  • feeds them to a network of decision nodes (also known as a "neural network" if you want to sound super smart)
  • makes a series of decisions as it bounces through each node
  • ultimately tells us the next predicted word
  • evaluates if it made a good or poor prediction
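The steps above can be sketched as a toy training loop. This is a deliberately simplified illustration, not a real neural network: the "decision nodes" are collapsed into a single table of weights that gets nudged whenever a prediction misses, whereas a real LLM adjusts billions of interacting parameters via gradient descent.

```python
from collections import defaultdict

# Toy stand-in for the network: one weight per (context, candidate-word)
# pair, nudged upward whenever the true next word was under-predicted.
weights = defaultdict(float)

def predict(context, vocabulary):
    """Steps 1-4: score every candidate word and return the best one."""
    return max(vocabulary, key=lambda w: weights[(context, w)])

def train(text, context_size=2, epochs=3):
    words = text.lower().split()
    vocabulary = sorted(set(words))
    for _ in range(epochs):
        for i in range(len(words) - context_size):
            context = tuple(words[i:i + context_size])
            actual = words[i + context_size]
            guess = predict(context, vocabulary)
            if guess != actual:                    # step 5: evaluate
                weights[(context, actual)] += 1.0  # tweak the "nodes"

train("the cat in the hat the cat in the box")
print(predict(("the", "cat"), ["in", "hat", "box", "the", "cat"]))  # → "in"
```

The predict-evaluate-tweak cycle is the essence of the list above; scale the weight table up to billions of entangled parameters and you have the "black box" this article is about.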

As an example, it can do that evaluation automatically by testing the calculated output against the actual output for every paragraph (and its next word) that has ever been published and is available online. Then, based on each of those evaluations, it tweaks the rules for the decision nodes in its black box accordingly. Furthermore, every time someone uses the tool, they can give manual feedback on the output it provides to their prompts, which further trains the model.

You might think those nodes in the network would each represent clear rules like "should it be a noun or verb?" and "should it start with an A?" However, we don't really have any clear way of interpreting any single rule. The rules are encoded as mathematical formulas, which makes them challenging to interpret, and they are always evolving based on the feedback. To make things trickier, we aren't talking about just a couple of rules in the network; GPT-4, as an example, is rumored to have on the order of a trillion such influencing parameters (OpenAI has not published the exact figure). Imagine a network of that many decision nodes, all constantly evolving — it's truly impossible to understand how it predicts the next word.


Cautions and Notes for LLMs

An additional note: because it only predicts the next best word, it is not actually conducting research or looking up information. This means that Generative AI solutions occasionally string together words that are believable but not actually true. This is called "hallucinating," as the machine invented something that seems real but is not. If you are ever using Generative AI, know that you should always audit its output for information that may be made up, to ensure you aren't being hoodwinked by a model that is confidently inventing things.

Also, it's important to note that in many cases (including using ChatGPT), the act of giving inputs and getting outputs is informing the model, meaning your data may get added to the broader dataset. This is what happened with Samsung earlier this year, when some developers used ChatGPT to help debug some code, inadvertently adding some of their intellectual property to the model's training data and potentially allowing others (with clever prompting) to surface that source code and IP. My suggestion is to never provide IP, or any information that you'd be troubled to see re-surfaced elsewhere, to an actively learning model like ChatGPT. At this point, many organizations have built internal LLMs of their own to power their own Generative AI solutions, so IP is less of a concern there, but overly personal information is just as risky to provide.


Soo, yeah...

Hopefully this has been a helpful overview, and more explanatory than "10 Cool Prompts for Underwriters!" I am always of the mind that to truly leverage a new tool or skill, a solid understanding of how it works (or as close to one as you can get) is worth far more than a surface-level set of use cases. A lot of this content was informed by an amazing article by Stephen Wolfram, so if you feel inspired to go another level deeper, by all means dig in. Let me know if you have any other questions, and I am happy to do my best to help explain further!
