Course: Hands-On AI: Build a Generative Language Model from Scratch

Markov chains and Python

- [Instructor] Large language models are able to look at multiple tokens as their input and produce pretty remarkable results. Our model is only going to concern itself with one word, the last word of its input. If it receives the input, "I try to learn something new," our model is only going to concern itself with the word or token, new.

During its training, our model is going to obtain a chart that maps each word to the possibilities for the next word. So if we have the word, new, the chart holds the raffle possibilities every, every, and each. If we run that raffle, we have a two-thirds chance of drawing every and a one-third chance of drawing each. If the word that comes up is every, then the chart gives us the possibilities day, day, and single. And then we have a two-thirds chance of day coming up and a one-third chance of single.

Now, in order to represent this chart in code, we'll have some sort of a hash table or dictionary that maps tokens to lists of possibilities. The cool thing about Python is that it has some built-in utilities that can really help us out with this. So let's head over to our code editor and check them out.

So here I am in the exercise files in 01>01_02_begin.py. You can go ahead and follow along using the exercise files with whatever code editor you choose. I'm using VS Code. When you come up to the challenge solution videos of these chapters, you'll be presented with an in-browser code editing environment that you can use with no setup.

Now let's go ahead and check out some of these utils, the first one being the default dictionary, defaultdict. A default dictionary receives a callable, something you can call like a function or a method, and that callable produces the default value if there's a missing key. So if you have a regular dictionary and you look up something that's not there, you get a key error. But this one, since we gave it list as the callable, is just going to give us an empty list.
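As a rough sketch of the chart described above, here is one way it might look as a plain Python dictionary, followed by the missing-key behavior of a regular dictionary versus a default dictionary. The chart entries come from the walkthrough; the helper name raffle_odds and the lookup key "something" are made up for illustration and are not taken from the exercise files.

```python
from collections import Counter, defaultdict

# The chart from the walkthrough, written as a plain dictionary.
# A repeated entry acts like an extra raffle ticket for that word.
chart = {
    "new": ["every", "every", "each"],
    "every": ["day", "day", "single"],
}


def raffle_odds(possibilities):
    """Hypothetical helper: each candidate's share of the raffle tickets."""
    counts = Counter(possibilities)
    total = len(possibilities)
    return {word: count / total for word, count in counts.items()}


odds_after_new = raffle_odds(chart["new"])
print(odds_after_new)  # "every" holds two of the three tickets

# A regular dictionary raises KeyError when the key is missing...
try:
    chart["something"]
    missing_key_raised = False
except KeyError:
    missing_key_raised = True

# ...while defaultdict(list) calls list() and stores the empty result.
graph = defaultdict(list)
empty = graph["something"]
print(missing_key_raised, empty)
```

Note that simply looking up a missing key in a default dictionary also stores that key, which is what makes appending to it in the next step so convenient.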
Now let's say that I have my graph, and I look up graph["word"]. If I print this out and run it, I get an empty list. Another cool thing is that I can go ahead and say graph["word"].append("Hi"). And now if I print this, I get a list with the word Hi. And I can do this again. So this is super helpful if you're building out a chart like the one we're going to build for our model.

The next thing I want to take a look at is iteration. I can go ahead and iterate over these tokens with a for loop, for token in tokens. This is sort of standard. The Python loop is very clean, but it's missing a piece of information that often comes in handy, which is which iteration I'm on. What if I want to know what the next item is? For that, I can use this neat utility called enumerate. In Python, if I use enumerate, I can have a counter. I like to call it i. And then if I print i and token, it'll go 0, 1, 2, 3, and it tells me which count I'm on. So 0 is I, 1 is try, 2 is to. And this is super helpful if you want to know what the next word is, which we are going to need while we train our model.

Finally, there's choosing a random word from a list. For that, if we import random, which we did, we can use random.choice. So I can go ahead and say print random.choice. And if I give it my tokens, I'll get a pseudo-random token back. Let's do this three times. You'll notice that I get something, every, and day. And if I run it again, I get something, every, and new, and then day, every, and new. And this is super helpful while running our raffle. So with these tools in our tool bag, we're just about ready to build our model.
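The three utilities from this walkthrough can be sketched together as follows. This is only an illustration, not the contents of the exercise file: the token list is an assumption pieced together from the words mentioned in the video, the name pairs is hypothetical, and the random.seed call is there only so that repeated runs draw the same tokens.

```python
import random
from collections import defaultdict

# Assumed token list, reconstructed from the words mentioned in the video.
tokens = "I try to learn something new every day".split()

# 1. defaultdict(list): appending to a missing key just works.
graph = defaultdict(list)
graph["word"].append("Hi")
graph["word"].append("Hi")
print(graph["word"])

# 2. enumerate: the counter i lets us look ahead to the token at i + 1.
pairs = []  # hypothetical name: (current token, next token)
for i, token in enumerate(tokens):
    if i + 1 < len(tokens):  # the last token has no next word
        pairs.append((token, tokens[i + 1]))
print(pairs[:3])

# 3. random.choice: draw one pseudo-random item from a list.
random.seed(0)  # assumption: seeded only to make reruns repeatable
draws = [random.choice(tokens) for _ in range(3)]
print(draws)
```

Without the seed, each run draws a different set of three tokens, which matches what the video shows when the script is run several times.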
