Colab Notebooks: Neural Language Processing and GPT-2

GPT-2.

This language model, released by OpenAI in 2019, was trained on 40 GB of text from various sources. There are several GPT-2 Colab notebooks, which work in a similar way: you enter the beginning of a sentence and GPT-2 continues it (or you ask questions about provided texts). The transformer-based model uses “self-attention”, weighting how relevant every part of the input text is to every other part, which lets it generate coherent stories instead of gibberish chaos.
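To make the “self-attention” idea above concrete, here is a minimal single-head sketch in plain NumPy (not GPT-2's actual code; the random projection matrices stand in for learned weights):

```python
import numpy as np

def self_attention(x):
    """Minimal single-head self-attention: every token position attends
    to every other position and mixes their value vectors by relevance.
    The query/key/value projections are random here, for illustration only."""
    rng = np.random.default_rng(0)
    d = x.shape[1]
    W_q, W_k, W_v = (rng.normal(size=(d, d)) for _ in range(3))
    Q, K, V = x @ W_q, x @ W_k, x @ W_v
    # Scaled dot-product scores: relevance of each position to each other
    scores = Q @ K.T / np.sqrt(d)
    # Softmax turns scores into attention weights that sum to 1 per row
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

# Five toy "token" embeddings of dimension 8
tokens = np.random.default_rng(1).normal(size=(5, 8))
out, attn = self_attention(tokens)
print(out.shape)  # one mixed vector per token: (5, 8)
```

Each output row is a weighted mix of all token values, which is what lets the model keep long-range context in view while generating.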

I prefer two GPT-2 notebooks:

Max Woolf’s Notebook allows:

  • to generate various texts with GPT-2
  • to fine-tune it on your own texts (up to the 355M model)
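The fine-tuning workflow in Max Woolf's notebook is driven by his gpt-2-simple library. A minimal sketch of that workflow, assuming the library is installed (the dataset path, step count, and prefix below are illustrative placeholders, not values from my experiments):

```python
def finetune_and_generate(dataset="alice.txt", steps=1000):
    """Fine-tune GPT-2 355M on a text file and sample from it, using the
    gpt-2-simple API that Max Woolf's notebook wraps.
    Imported lazily: gpt_2_simple requires TensorFlow 1.x and, in
    practice, a GPU runtime such as Colab's."""
    import gpt_2_simple as gpt2

    gpt2.download_gpt2(model_name="355M")   # fetch the pretrained weights
    sess = gpt2.start_tf_sess()
    gpt2.finetune(sess, dataset, model_name="355M", steps=steps)
    # Sample from the freshly fine-tuned model
    gpt2.generate(sess, length=300, temperature=0.7, prefix="Alice said")
```

In the notebook itself these calls appear as separate cells, so you can rerun generation without repeating the training step.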

I did it in three languages:

  1. English (on “Alice in Wonderland”)
  2. German (on “Faust I” by Goethe)
  3. Russian (on early poetry by Pushkin)

As you can see, it works to some degree for all three languages. Of course, GPT-2 was trained on English sources; for other languages, fine-tuning and additional resources are needed, but this proof of concept was convincing for me. Some interesting observations:

  • The more I trained the German model on Faust, the closer the generated texts came to the original. The reason is probably the small dataset (just one single text). If you want to train on your own texts, provide a larger amount of data.
  • The Russian texts are not really comprehensible, but you can nevertheless recognize the style and even the form of Pushkin's poetry. And the coinages and neologisms are perfect; every literary avant-gardist would be proud of such inventions.

The “GPT-2 with JavaScript Interface” notebook allows:

Text generation, no more, no less. But you can control the text length (which is a very relevant factor):


With “Temperature” and “top_k” you can modify the randomness, repetitiveness, and “weirdness” of the text.

With “Generate how much” you can produce longer texts (I use the value 1000).
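The effect of these two knobs can be shown on a toy distribution. This is a pure-NumPy illustration of how temperature and top_k reshape sampling probabilities, not the notebook's own code:

```python
import numpy as np

def sample_probs(logits, temperature=1.0, top_k=0):
    """Turn raw model scores into sampling probabilities.
    Lower temperature sharpens the distribution (less 'weird' text);
    top_k > 0 keeps only the k most likely tokens before sampling."""
    logits = np.asarray(logits, dtype=float) / temperature
    if top_k > 0:
        # Mask out everything below the k-th highest score
        cutoff = np.sort(logits)[-top_k]
        logits = np.where(logits < cutoff, -np.inf, logits)
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

scores = [2.0, 1.0, 0.5, 0.1]
print(sample_probs(scores, temperature=1.0))  # moderately peaked
print(sample_probs(scores, temperature=0.3))  # much more peaked
print(sample_probs(scores, top_k=2))          # only two tokens survive
```

Low temperature makes the model pick its favorite words almost every time; high top_k lets rarer, stranger words slip in.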

Links:

You can also use the web implementation of GPT-2 by Adam King:

I asked this application about the meaning of life. The answer was very wise and mature.


Wisely, indeed! (Screenshot of TalkToTransformer.com by: Merzmensch)

Read also:

Index of Series "Google Colab Notebook".

Full essay "12 Colab Notebooks that matter"

