AI, Where It All Began, Sort Of: N-Gram Models
Unless you've been living under a rock (sometimes, I wish I were), you are likely aware that Natural Language Processing (NLP) has experienced remarkable evolution over the past few decades, and I wanted to briefly cover this journey.
What began with basic methods like n-grams has now advanced to the powerful Large Language Models (LLMs) we use today. This journey highlights the rapid strides made in artificial intelligence and machine learning.
For us entrepreneurs, grasping this evolution is essential, as it sheds light on both the strengths and constraints of modern NLP tools. In this series, my goal is to cover the progression of NLP techniques and examine their real-world applications in as accessible a way as possible.
We'll start with what are known as n-gram models.
N-Gram Models: The Secret Sauce Behind Smarter AI
If you've ever typed a message on your phone and had it predict the next word before you even finished your sentence, you’ve experienced the magic of an n-gram model! While it might sound like a mysterious term from a sci-fi novel, n-gram models are actually one of the oldest and simplest tricks in the world of artificial intelligence (AI) language processing.
So, What’s an N-Gram Anyway?
An n-gram is just a fancy way of saying “a sequence of words.” The "n" refers to how many words are in that sequence.
So, a 1-gram is a single word, a 2-gram is two words together, a 3-gram is three words, and so on. It's a bit like playing with word Legos, stacking them to create sentences and predict the next word in a sequence using probabilities.
Let’s break it down:
1-Gram (Unigram): These are the building blocks - the single words. For example: "sky", "blue", "limit".
2-Gram (Bigram): Now we start linking words together: "the sky", "sky is", "the limit".
3-Gram (Trigram): Even more context with more words: "the sky is", "sky is the" - you get the idea.
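If you like seeing things in code, here's a minimal sketch in Python of how a sentence gets sliced into n-grams (the ngrams helper and the example sentence are just for illustration):

```python
# A minimal sketch: slice a sentence into overlapping n-grams.
def ngrams(text, n):
    words = text.lower().split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

sentence = "the sky is the limit"
print(ngrams(sentence, 1))  # unigrams: ('the',), ('sky',), ('is',), ...
print(ngrams(sentence, 2))  # bigrams: ('the', 'sky'), ('sky', 'is'), ...
print(ngrams(sentence, 3))  # trigrams: ('the', 'sky', 'is'), ...
```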
See how it works? The more grams you use, the more context your AI gets. And that’s important because, in real life, we don’t just communicate using isolated words - we use phrases and sentences that give meaning and context.
How Do N-Grams Work in AI?
In our language, certain words are much more likely to follow other words than others. If I say "Sky is...", you are much more likely to guess I'll follow up with "blue" or "the limit" than with "battleship".
N-gram models are a way for AI to predict what comes next in a sequence of words. These models analyze large amounts of training text (say, billions of words from digitized books, articles, and web pages) and figure out which sequences of words are most likely to appear together.
This training data is called a corpus, by the way.
In a bigram model, the AI considers only the last word in a sentence to predict the next one; in a trigram model, the previous two words together; and so on.
Imagine this scenario: you’re texting a friend, and you type "I love my". A simple AI, using a trigram model, might look at past conversations and predict that you’re likely to type “dog,” “cat,” or “job” based on what often comes after "love my" (the previous two words). The larger the n, the more context the model has and the more predictable its suggestions become - but also the less flexible, since longer word sequences appear less often in the training data.
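To make that concrete, here's a toy trigram predictor in Python - a rough sketch, not anyone's production code. The tiny corpus and the predict_next/prob_next helpers are made up purely for illustration:

```python
from collections import Counter, defaultdict

# Toy trigram model: for every pair of consecutive words,
# count which word follows that pair in the (tiny, made-up) corpus.
corpus = "i love my dog . i love my cat . i love my job .".split()

follow = defaultdict(Counter)
for w1, w2, w3 in zip(corpus, corpus[1:], corpus[2:]):
    follow[(w1, w2)][w3] += 1

def predict_next(w1, w2, k=3):
    # The k words most often seen after the pair (w1, w2).
    return [w for w, _ in follow[(w1, w2)].most_common(k)]

def prob_next(w1, w2, w3):
    # Relative-frequency estimate: P(w3 | w1, w2) = count(w1 w2 w3) / count(w1 w2)
    total = sum(follow[(w1, w2)].values())
    return follow[(w1, w2)][w3] / total if total else 0.0

print(predict_next("love", "my"))      # -> ['dog', 'cat', 'job']
print(prob_next("love", "my", "dog"))  # -> 0.333... (1 of the 3 continuations)
```

A real model would be trained on far more text and would smooth these probabilities to handle word pairs it has never seen, but the core idea really is this simple: count, then rank.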
The Power of N-Grams in Real Life
N-gram models have been around for decades, and the principle is still behind some of what makes AI language models work today.
Whether it’s for autocorrect, next-word prediction in simple text apps, or even helping virtual assistants like Siri understand what you're saying, n-grams are (were?) part of the AI backbone.
N-gram models have proven fairly decent for tasks like:
- Next-word prediction and autocomplete
- Autocorrect and spell-checking
- Simple text generation
- Language modeling in early speech-recognition systems
The Limitations
...are many.
While n-gram models are super handy, they do have their limitations. For one, they can only consider a limited context. So, a trigram model can predict the next word based on the previous two words, but it can't understand the broader sentence - who you're referring to with "they", what the conversation is actually about, and so on. This was a major limitation with early chatbots like A.L.I.C.E.
To get a better grasp on context, we need to turn to neural networks and other more advanced AI methods.
Wrapping Up
N-grams might sound a bit dry at first, but they were the building blocks of many everyday AI systems. So the next time you type a message and get a helpful suggestion, remember: it’s all thanks to the humble n-gram!
Stay tuned for the next post in this series; we'll talk about neural networks, which marked a significant shift in NLP.