Stochastic Parrots
Created with Midjourney


In recent years, and especially over the last few months, there has been growing excitement about the potential of large language models (LLMs) to revolutionize the way we interact with computers. These models are trained on massive datasets of text and code, and they can generate text, translate languages, write different kinds of creative content, and answer questions in an informative way.

However, some people may be ascribing more intelligence to these models than they actually possess. They can generate convincing text, but they don't actually understand it; the output merely reads like a plausible combination of words that somebody could write about a given topic.

Many people are confusing eloquence with intelligence

In some respects, the way these models are being used is not that different from a spellchecker on steroids. Or a stochastic parrot.

The term "stochastic parrot" was coined by Emily M. Bender, a linguistics professor at the University of Washington in the 2021 paper "On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?" by Bender,?Timnit Gebru, Angelina McMillan-Major, and?Margaret Mitchell.

Just as training a parrot to repeat phrases from Hegel or recite equations from Einstein doesn't make it a philosopher or a physicist, it's important to understand that LLMs are simply evaluating probabilities for the next word in a sequence. These probabilities are learned during training of the model -- that's where the "stochastic" part comes in. LLMs have learned to identify statistical patterns in the text they were trained on.
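To make the "stochastic" part concrete, here is a minimal sketch of sampling the next token from a probability distribution. The tokens and numbers are made up for illustration; this is not how any particular model is implemented, but it captures the basic idea of weighted random continuation:

```python
import random

# Toy next-token distribution a model might assign after some prompt.
# The tokens and probabilities are invented purely for illustration.
next_token_probs = {
    "hello": 0.55,
    "goodbye": 0.25,
    "E=mc^2": 0.15,
    "xylophone": 0.05,
}

def sample_next_token(probs):
    """Pick a token at random, weighted by its probability."""
    tokens = list(probs.keys())
    weights = list(probs.values())
    return random.choices(tokens, weights=weights, k=1)[0]

# Each call may return a different, plausible-looking continuation --
# the model has no notion of whether the result is true.
print(sample_next_token(next_token_probs))
```

Nothing in this loop checks facts or meaning; it only picks what is statistically likely given the training data.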

This past week, I organized an internal hackathon for my team at InOrbit.AI focused on leveling up everyone's knowledge of the latest artificial intelligence tools and applying that knowledge to solving various problems. And by "everyone", I do mean that everyone at the company was invited to participate; the hackathon was not just for our engineering team. This was possible because the latest tools have made AI much more accessible, so we had our business team actively driving various initiatives alongside our product-focused team.

While many people dove straight into generative AI, including LLM and diffusion models, others worked on what can now be considered more "traditional" or "classic" AI techniques, such as computer vision and machine learning (ML). The differences and similarities between these two approaches, both of which are under the AI umbrella, are worth noting.

For classic ML models, such as a classifier, the precision-recall curve characterizes the quality of predictions. It is then used to adjust the decision threshold based on the trade-off between true/false positives and true/false negatives. This explicitly recognizes that the predictive power of the model is imperfect. Moreover, a classifier, such as one detecting a person in a video stream, will typically output a probability, which can be used to make informed decisions. These decisions depend on the specific use case and the impact of getting the wrong result.
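As a rough sketch with made-up confidence scores, this is the kind of thresholding decision that an exposed classifier probability enables; the threshold value here is arbitrary and would in practice come from the precision-recall trade-off for the use case:

```python
# Toy example: a person-detection classifier outputs a probability,
# and we choose a threshold based on how costly a wrong result is.
detections = [0.92, 0.48, 0.71, 0.12]  # made-up confidence scores

# A stricter threshold means fewer false positives but more false negatives.
THRESHOLD = 0.7

for score in detections:
    if score >= THRESHOLD:
        print(f"Person detected (confidence {score:.2f})")
    else:
        print(f"No action taken (confidence {score:.2f})")
```

The key point is that the uncertainty is visible and can be acted on explicitly.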

LLMs also rely on probabilities to construct a sequence of tokens. However, those probabilities are not visible to the user, and in fact, given the complexity of building long sequences, full explainability would be very difficult to achieve. As a result, users tend to treat all output from the LLM as "correct" or "accurate", not knowing in which cases the model just filled in some arbitrary data or content.

These fabrications are often referred to as hallucinations. However, as catchy as the term might be, I believe it is misleading. Hallucinations imply that there is a mind behind the perception. According to the UK National Health Service (NHS), "hallucinations are where you hear, see, smell, taste or feel things that appear to be real but only exist in your mind." And as we've already established, LLMs are mindless content generators.

Perhaps a better way to describe this is as fake content. We already have an over-abundance of (human-generated) fake news. Top scientists have recently been caught using fake data in their studies, some conducted decades ago. The advent of LLMs is only going to exacerbate that.

As an example, during the recent hackathon, the marketing team was trying to understand how popular chatbots might describe InOrbit to our customers. They all did a reasonably good job presenting relevant information (based on content we have published over the years on inorbit.ai), but one of them added a couple of co-founders I never had and listed a few companies as customers I wish we had.

While it's unquestionable that generative AI will add significant value for companies, such as enabling the translation of documentation into new languages, it's also important to be aware of the limitations. The output of large language models like ChatGPT or Bard requires human supervision and fact-checking.

We have been trained over a couple of decades to search for something online, evaluate the trustworthiness of the source, and perhaps search some more for alternative views. While this approach is far from perfect and has led to rampant misinformation, it would be an even bigger mistake to blindly rely on LLMs. As a society, we need to develop best practices to avoid that.

After all, you wouldn't trust important decisions to a parrot that only gets it right some of the time, would you?




Ayush Kumar

AI Product Manager | IIMB MBA '23 | IITT CSE '20

1 yr

"LLMs are stochastic parrots" Aren't we all? ??

Just to build on this, here is an interesting article in the NYT about Google applying LLMs to robotics: https://www.nytimes.com/2023/07/28/technology/google-robots-ai.html

Chris Stergiou

Adding Value to End Users and Suppliers with Practical Automation - Let's Discuss your Project

1 yr

Well ... is there any evidence that a parrot randomly joining "credible and seemingly authoritative content" would generate corporate strategy significantly different from that generated by the current crop of human C-Suite executives?

As you know, in Actuator, Brian Heater discussed a publicity stunt about a company making a robot its CEO. I believe this was for a period of time, not forever. He was not supportive of the idea and knew it was just a stunt. Anyone who knows robotics and AI knows this isn't a clever idea, even if it was meant as a joke. To me it's actually a bit scary. People have to understand that these machines are not thinking. Some programming underneath the covers tells them what character to write next, or what to say next. Wrong information is often peppered in, so it needs fact-checking and human help.
