8 Things to Know about AI (Part 1)
Midjourney - Tales of Wonder

8 Things to Know about AI (Part 1)

Imagine if the autocomplete feature on your phone was given the ability to not only predict words but entire stories. Imagine if Word could not only correct your typos but also suggest more engaging ways to convey your message.

We don’t have to imagine - that’s what ChatGPT and its fellow “Large Language Models” are doing. At their core, LLMs are really just VERY fancy autocomplete engines.?

I'm working with my oldest daughter Samantha, a special ed teacher in NYC, on a project involving generative AI. We're in the process of training the model to create a "differentiated" curriculum to better suit her students. While discussing the mechanics of these systems, I found myself frequently using the term "fancy autocomplete engines" to describe them.

This term stuck with me after coming across an online article. Sam Bowman, an Associate Professor of Computer Science at NYU, authored a survey paper earlier this summer titled "Eight Things to Know about Large Language Models".

The paper had some fascinating (and slightly worrying) conclusions. I’m going to cover the first 4 of them today. I’ll cover the second group in a later posting.?

1 - LLMs get better simply by feeding them more resources:?

  • Researchers have found that just giving the AI more data and “compute” (computer processing resources), the LLMs get more capable. They develop new abilities.?
  • Nothing changed in their architecture or coding - simply exposing them to more data and giving them more power - they learn how to do new things.

2- Unexpected Skills Can Develop:

  • As these models evolve, they sometimes develop new abilities that weren't initially anticipated. It's a bit like a surprise box; you know there will be something valuable inside, but you're not quite sure what.
  • GPT-3, the system that really took the world by storm earlier this year, developed a new “trick”. It showed the ability to learn by being shown how to solve a problem. “Chain-of-thought” is basically showing your work when solving a math problem. Using this process, GPT-3 became able to do things it couldn’t do before.
  • The models go from being idiots to being quite capable. They do this suddenly and unpredictably. GPT-3 was basically incapable of doing arithmetic until it hit about 10^22 FLOPs of computation. ?


3 - LLMs Learn to Understand the World:

  • These models are becoming adept at understanding the world in a way that goes beyond just processing text. They start to grasp concepts and ideas, much like how we learn to understand things.
  • The latest LLMs are starting to show that they understand the world, in spite of having never been shown the world.?
  • Models are learning how to play board games in spite of having never seen the board or taught the rules. Simply feeding them lists of game moves (like Chess or Go) they learn how the game looks and how it is played.
  • Models have started building “visions” of the world described in a story and can then describe relationships and build inferences about aspects of the scene that weren’t included in the story.

4 - There are no reliable techniques (yet) to control the behavior of LLMs

  • Guiding these models to behave in a specific way is still a bit challenging. It's like trying to train a new kind of pet that we're still learning about.
  • The AI models can’t be reliably tested to act in a consistent way - due to the way they operate and the infinite variety of scenarios they could be exposed to.
  • Interestingly, the better they get at understanding human language, the easier it’s getting to have them understand the generalizations we use.
  • This leads to a surprising issue called “sycophancy” where the models start flattering their users to reinforce their operators beliefs.

I found these points fascinating! The idea that simply feeding these AI models more data and more computing power gives them the “genius spark” to suddenly do new things is really incredible.

That excitement is tempered by the idea that there’s not a good way to truly control their behavior (at least not yet).?

The next 4 points have some stunning revelations. Feel free to download the full paper for yourself at https://arxiv.org/abs/2304.00612

Stay tuned, as I plan to share the second installment, uncovering further fascinating aspects, in a day or two.

Aaron Ansari

Managing Partner | vCISO | Board of Directors | Public Speaker | Inclusive leader (Ally) | Ironman | Mountain Summiter | Advisor | Investor

1 年

are you using a VectorDB to embed the data?

要查看或添加评论,请登录

Kolby Kappes的更多文章

  • Riding the AI Wave

    Riding the AI Wave

    Riding the AI Wave: How Eliassen Group Became a Generative AI 'Taker' In the world of tech, if you're not surfing the…

    3 条评论
  • Piecing Together Inconsistent Data: A Generative AI Approach

    Piecing Together Inconsistent Data: A Generative AI Approach

    Introduction Ever tried solving a jigsaw puzzle with pieces from different sets? Welcome to the world of data cleaning.…

    5 条评论
  • GenEd: The Challenge and the Classroom

    GenEd: The Challenge and the Classroom

    Introduction In a quest to merge generative AI technology and education, I found an opportunity that hits close to…

    3 条评论
  • 8 Things to Know about AI (Part 2)

    8 Things to Know about AI (Part 2)

    What if the next Mozart or Einstein wasn't a human at all, but a machine? Let's dig into how Large Language Models are…

  • Prompt Engineering - Meet Your New Grad

    Prompt Engineering - Meet Your New Grad

    The more I interact with generative AI systems like ChatGPT, the more they remind me of a fresh college graduate…

    4 条评论

社区洞察

其他会员也浏览了