SanGuo GPT - Update 9/17/2023
Not many updates recently, just a few minor changes.
I started by adding a perplexity score to training, but it turned out to be just the exp() of the cross-entropy loss. Not very useful, since I'm already logging cross-entropy losses.
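For reference, PyTorch's `F.cross_entropy` already averages the per-token negative log-likelihoods, so training-time perplexity is just one extra op. A minimal sketch (the dummy tensors are only there to make it self-contained, not the repo's actual code):

```python
import torch
import torch.nn.functional as F

# Dummy logits/targets just to make the snippet runnable.
logits = torch.randn(8, 100)           # (num_tokens, vocab_size)
targets = torch.randint(0, 100, (8,))  # (num_tokens,)

# F.cross_entropy returns the mean per-token negative log-likelihood,
# so training perplexity is simply its exponential.
ce_loss = F.cross_entropy(logits, targets)
train_perplexity = torch.exp(ce_loss)
print(train_perplexity.item())
```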
Perplexity in generation is more interesting:
$$\mathrm{perplexity} = \exp\left(-\frac{1}{N}\sum_{i=1}^{N}\log p(w_i \mid w_1,\dots,w_{i-1})\right)$$
So when generating text, I accumulate the log() of each generated token's probability, take the negative average at the end, and apply exp().
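Here's a minimal sketch of how that can look in the generation loop. This is my reading of the approach, not the repo's actual code; it assumes a batch of one and that `model(idx)` returns logits of shape `(B, T, vocab_size)`, with `block_size` as the context window:

```python
import math
import torch
import torch.nn.functional as F

@torch.no_grad()
def generate_with_perplexity(model, idx, max_new_tokens, block_size):
    """Sample max_new_tokens tokens and report perplexity of the generated sequence.

    Assumes batch size 1 and that model(idx) returns logits of shape (B, T, vocab).
    """
    log_prob_sum = 0.0
    for _ in range(max_new_tokens):
        idx_cond = idx[:, -block_size:]       # crop context to the block size
        logits = model(idx_cond)[:, -1, :]    # logits for the next token
        probs = F.softmax(logits, dim=-1)
        next_token = torch.multinomial(probs, num_samples=1)
        # accumulate log p(w_i | w_1, ..., w_{i-1}) of the sampled token
        log_prob_sum += torch.log(probs.gather(-1, next_token)).item()
        idx = torch.cat([idx, next_token], dim=1)
    perplexity = math.exp(-log_prob_sum / max_new_tokens)
    return idx, perplexity
```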
Now it can report the perplexity after generating a sequence:
```
Loading model from checkpoints/sanguogpt-v0.1.pth
Loading token map file from c2i.json and i2c.json
Using mps device
30 tokens generated in 4.497 seconds, avg 6.671 tokens/sec.
Perplexity of generation: 3.5431
禅位汉统恭王寻张鲁行七十余营
却说晋王司马炎奔入宫赴曹
```
I added code in the training loop to log the embedding tables periodically, so that I can see their progress along the way. I then project the embeddings into 3-D space and visualize them in TensorBoard.
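For the logging itself, TensorBoard's `add_embedding` is enough; the projector UI then handles the 3-D reduction (PCA/t-SNE/UMAP). A sketch below, where `tok_emb`, `pos_emb`, and `i2c` are my assumed names, not necessarily the repo's:

```python
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter("runs/sanguogpt")

def log_embeddings(model, i2c, step):
    # i2c maps token index -> character, so each point gets a readable label.
    labels = [i2c[i] for i in range(len(i2c))]
    writer.add_embedding(model.tok_emb.weight.detach().cpu(),
                         metadata=labels, global_step=step, tag="token_embedding")
    writer.add_embedding(model.pos_emb.weight.detach().cpu(),
                         global_step=step, tag="position_embedding")

# Call it periodically from the training loop, e.g.:
# if step % 1000 == 0:
#     log_embeddings(model, i2c, step)
```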
I don't really see much difference between the two embedding tables: they both look pretty random to me. Maybe that's because I only trained for 1,000 steps?
The code repo is here; feel free to play with it: https://github.com/zhoupingjay/sanguo-gpt