Did AI Write This, or Did I?
Remember back in school when we all hated doing our homework? Or in college, when we used to spend hours on end writing (debatable) those assignments? At some point, I’m sure most of us dreamed of a machine that would do it for us. And in pure sci-fi style, that fiction might be coming true!
OpenAI launched the GPT-3 language model in beta last month, and it has been capturing the imagination of tech buffs, investors and the general public alike. I have been baffled by its complexity, as well as by the simplicity and elegance with which it manages to do amazing things. I believe everyone interested in tech should have a basic understanding of it, and I hope this article helps.
What is GPT-3?
GPT stands for Generative Pre-trained Transformer. GPT-3 is the third generation of OpenAI’s language model. Its predecessor, GPT-2, was announced last year, and OpenAI initially chose not to release the full model because of concerns that it could be misused.
Language models ingest large amounts of text and identify implicit patterns in it that are not apparent to the human mind. They learn those patterns and then apply them to new data.
GPT-3 has ingested a large share of the publicly available text on the internet. Its training data runs to approximately half a trillion words, ranging from philosophical writings and coding tutorials to memes and Wikipedia articles. The model has 175 billion parameters, whose values were set during that training.
Parameters in a model are loosely comparable to the connections between neurons in the brain. They are numerical values that are adjusted during training, and together they encode the relations the model has learned between data points. The parameter count is commonly used as a rough measure of a model’s complexity and, by extension, of its capacity to make accurate predictions. As a comparison, its predecessor, GPT-2, had 1.5 billion parameters (that’s more than a 100x increase in parameters!).
Credits: Towards Data Science & Moiz Saifee
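To make the idea of a parameter count a little more concrete, here is a minimal Python sketch. It counts the adjustable numbers in a tiny, made-up fully connected network and then compares the GPT-2 and GPT-3 figures quoted above. The function name and the layer sizes are assumptions chosen for illustration; GPT-3's real architecture is far more elaborate than this.

```python
def count_parameters(layer_sizes):
    """Count the weights and biases in a simple fully connected network."""
    total = 0
    for inputs, outputs in zip(layer_sizes, layer_sizes[1:]):
        total += inputs * outputs  # one weight per connection between layers
        total += outputs           # one bias per unit in the receiving layer
    return total


# Hypothetical toy network: 4 inputs, 8 hidden units, 2 outputs.
tiny_network_parameters = count_parameters([4, 8, 2])
print(tiny_network_parameters)  # 58

# Parameter counts quoted in the article.
gpt2_parameters = 1.5e9
gpt3_parameters = 175e9
growth_factor = gpt3_parameters / gpt2_parameters
print(round(growth_factor))  # 117
```

Every one of those numbers is nudged during training, which is why a model with more of them can capture more patterns.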
What can GPT-3 do?
Now comes the amazing part. GPT-3 is task-agnostic and needs little or no fine-tuning (additional training on task-specific examples). This means that it can perform a huge variety of tasks without being retrained for each one. Given only a short instruction or a handful of examples, GPT-3 has been able to write creative fiction, write working code, compose business memos, create its own jokes with wit and sarcasm, and create memes and tweets. Its possible use cases are probably limited only by our imaginations.
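In practice, the small amount of guidance GPT-3 needs is often supplied as a few worked examples placed directly in the input text, an approach usually called few-shot prompting. The Python sketch below only assembles such a prompt as a string; it does not call any model. The function name and the "Input:/Output:" layout are assumptions made for illustration, not a format required by OpenAI.

```python
def build_few_shot_prompt(task_description, examples, new_input):
    """Assemble a plain-text prompt: instruction, worked examples, then the new case."""
    lines = [task_description, ""]
    for source, target in examples:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    lines.append(f"Input: {new_input}")
    lines.append("Output:")  # left open for the model to complete
    return "\n".join(lines)


prompt = build_few_shot_prompt(
    "Translate English to French.",
    [("cheese", "fromage"), ("good morning", "bonjour")],
    "thank you",
)
print(prompt)
```

The same helper could be reused for a different task simply by swapping the instruction and the examples, which is the sense in which the model is task-agnostic.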
How does it do it? Well, under the hood, GPT-3 is an extremely sophisticated text predictor. A human first gives it some text as input. The model then evaluates that text and spits out the most “statistically plausible” continuation, based on the patterns in the internet text it has ingested. It guesses the next piece of text, and then repeats this process over and over again, treating the generated text as part of the input for the next iteration. And this is where the origin of its shortcomings lies…
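The predict-then-repeat loop described above can be sketched in a few lines of Python. This is a deliberately tiny stand-in, not how GPT-3 is implemented: it counts which word follows which in a made-up training sentence and always picks the most frequent follower, whereas GPT-3 uses a large neural network. The function names and the training text are assumptions for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for every word, which words follow it and how often."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        followers[current_word][next_word] += 1
    return followers


def generate(followers, prompt, max_new_words=5):
    """Repeatedly append the most frequent follower of the last word."""
    output = prompt.lower().split()
    for _ in range(max_new_words):
        candidates = followers.get(output[-1])
        if not candidates:
            break  # the last word never had a follower in the training text
        best_word, _count = candidates.most_common(1)[0]
        output.append(best_word)  # the guess becomes input for the next step
    return " ".join(output)


training_text = "the cat sat on the mat . the cat sat on the sofa ."
model = train_bigram_model(training_text)
print(generate(model, "the", max_new_words=3))  # the cat sat on
```

Even this toy shows the weakness discussed next: asked for more words, it loops back to "the cat" because it only ever looks at statistics, with no notion of what a cat or a mat is.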
Is it really that smart?
While the capabilities of GPT-3 are mind-boggling, it does have some shortcomings, as indicated by a tweet from Sam Altman, CEO of OpenAI. Because its text generation relies on a statistical best guess, GPT-3 does not have a mental model of the world. It does not have true common sense, and it cannot be said to understand what it is generating. Consider this exchange by Kevin Lacker, who subjected GPT-3 to a Turing Test:
Clearly, GPT-3 has a hard time answering questions that would not be asked in a normal human conversation.
Due to its word-by-word approach, GPT-3 also struggles to maintain coherence in long passages. It can wander off topic and can sometimes contradict itself.
There is no doubt at all that GPT-3 is a significant technological advancement, far larger and more capable than earlier language models. It does not simply memorize sentences of text to reproduce them later. It identifies statistical patterns in our language and answers questions based on them. And it does so with better grammar than most of us.
That being said, it does not possess general intelligence. It does not know whether the questions it is being asked make sense. And while the versatility of the model is a huge step towards general intelligence, the model in its current state is not sentient. It does, however, open a lot of avenues for tech entrepreneurs to build innovative products on the powerful technology of GPT-3.
If I got anything wrong, please send me a message and I'll correct it.