How Geoff Hinton changed my mind about AI risk
We've all heard about Geoff Hinton recently and very publicly leaving Google Brain to raise awareness of AI risk. Something lost in the hoopla is the specific moment when he changed his mind about AI, and the implications that carries. But first, let's establish what Geoff Hinton has been trying to figure out in the first place.
Learning about human learning
Geoff Hinton has been trying to understand the human brain for decades. Artificial neural networks turned out to be a promising tool for studying how the brain learns. While relatively little progress has been made in understanding how the human brain learns, an artificial learning algorithm called Backpropagation has enabled Machine Learning, and now Large Language Models, to make incredible progress over the last 20 years.
Until now, it has been universally accepted in the AI community that humans are much more efficient learners than AI. The classic example is that human brains need far fewer samples of any given training input to produce good predictions. This is indeed the case for learning the alphabet or reading comprehension: it has been estimated that toddlers need somewhere between hundreds and a few thousand exposures to various letters in different contexts. In comparison, the benchmark MNIST dataset contains 60,000 images of handwritten digits. Traditionally, this has been taken as evidence that human brains are better at generalizing from rich, multisensory inputs.
Hinton says he long held and defended this position of superior human learning while searching for something better than Backpropagation. That is a logical approach, since we can be confident Backpropagation is not how the human brain works. The brain only sends electrical signals through its fleshy neurons in one direction, forward, and we can measure this experimentally. Backpropagation, by definition, means propagating the prediction error backward through the network, and that is physically impossible in the human brain because no such reverse signals exist.
If you want to know what Backpropagation actually does, and how it works, here’s a good visual breakdown.
Fun fact: Backpropagation was invented by a Finnish mathematician named Seppo Linnainmaa in his Master's Thesis as early as 1970!
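If you want the mechanics in code rather than pictures, here's a minimal sketch in plain NumPy. It's purely illustrative, not how any real framework (let alone the brain) does it: run a forward pass, measure the prediction error, propagate that error backward through the network, and nudge each weight accordingly.

```python
# A minimal Backpropagation sketch: a tiny two-layer network learning XOR.
# Plain NumPy for illustration only; real frameworks automate all of this.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)   # inputs
y = np.array([[0], [1], [1], [0]], dtype=float)               # XOR targets

W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)    # input -> hidden
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)    # hidden -> output
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
lr = 0.5                                          # learning rate

for step in range(10000):
    # Forward pass: signals flow forward, the only direction the brain allows.
    h = sigmoid(X @ W1 + b1)        # hidden activations
    out = sigmoid(h @ W2 + b2)      # predictions

    # Backward pass: the prediction error is propagated backward through the
    # network to work out how much each weight contributed to the mistake.
    d_out = (out - y) * out * (1 - out)        # error signal at the output layer
    d_h = (d_out @ W2.T) * h * (1 - h)         # error pushed back to the hidden layer

    # Gradient descent: nudge every weight against its error gradient.
    W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;    b1 -= lr * d_h.sum(axis=0)

print(out.round(2).ravel())   # should approach [0, 1, 1, 0]
```

The backward pass is the part with no known biological counterpart: the same weights used going forward are reused, transposed, to send the error back.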
From human learning to superhuman learning
So what changed Hinton's mind? He has said his "aha" moment came when Google's PaLM model was able to explain jokes, something he himself had thought of as a true sign of intelligence. Further, ChatGPT and GPT-4 have demonstrated factual knowledge about the world far beyond what any single human could replicate. It is now well documented that GPT-4 beats humans on many standard measures of intelligence and on professional examinations in knowledge work, ranging from coding to law.
This is important because, despite a staggering compute cost running into the hundreds of millions of dollars, even GPT-4 is only around 1% of the overall complexity of the human brain if we just count the available parameters (the weights that represent the relative strength of synapses between neurons). For a while it was speculated that GPT-4 had actually surpassed 100 trillion parameters, which would be on par with the human brain, but leaked figures now put GPT-4 at around 1.8 trillion. For reference, the original ChatGPT was based on GPT-3, a mere 175 billion parameters. So going from roughly 0.1% to 1% of the brain's size seems to have done a lot. When do we get to 100%? Let's ask GPT-4 with Code Interpreter.
Sticking to the facts first, this is what progress looks like on a logarithmic scale. There is no point plotting it without a log scale, because you wouldn't even see GPT-1 and GPT-2. Suffice it to say we're moving roughly an order of magnitude per generation, and we can assume that continues. Let's ask GPT-4 to figure it out.
Interestingly, it gives humans a pretty high parameter estimate. GPT-4 says: "The human brain has approximately 86 billion neurons, and each neuron connects to, on average, 7000 other neurons. So, if we consider each connection (synapse) equivalent to a parameter, we get approximately 86 × 10^9 × 7000 parameters." That works out to roughly 600 trillion parameters.
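You don't have to take Code Interpreter's word for it; the back-of-the-envelope version is short enough to run yourself. Here's a rough sketch under the same assumptions stated above (the public and leaked parameter counts, the 86 billion neurons times 7,000 synapses estimate, and a continued ~10× jump per generation), not the exact code GPT-4 produced:

```python
# Back-of-the-envelope parameter extrapolation. The GPT-4 figure is the leaked,
# unconfirmed 1.8 trillion; the brain figure is the estimate GPT-4 quoted above.
params = {"GPT-1": 117e6, "GPT-2": 1.5e9, "GPT-3": 175e9, "GPT-4": 1.8e12}
brain = 86e9 * 7000                      # ~6.0e14, i.e. ~600 trillion "parameters"

print(f"Brain estimate: {brain:.1e} parameters")
# Note: against this higher estimate GPT-4 is ~0.3%, not ~1% (the ~1% figure
# above assumes the more commonly cited 100 trillion synapses).
print(f"GPT-4 as a fraction of the brain: {params['GPT-4'] / brain:.1%}")

growth = params["GPT-4"] / params["GPT-3"]   # roughly a 10x jump per generation
gen, p = 4, params["GPT-4"]
while p < brain:                             # assume the trend simply continues
    gen += 1
    p *= growth
print(f"At ~{growth:.0f}x per generation, brain parity lands around GPT-{gen}")
```

Under these assumptions, brain parity lands around GPT-7.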
Somehow, using Backpropagation, the Transformer models underlying Large Language Models are able to pack incredible amounts of information into a fraction of the brain's storage. In simple terms, AI turns out to be able to do a lot more with a lot less. Even if it is less efficient with input data, the architecture and technology behind artificial neural networks let us "force-feed" more total learning per neuron than gooey human wetware allows. Maybe it's better compression, or maybe there are biological or evolutionary reasons to limit the scope of our intellectual capacity and memory. Unlike server farms, which can tap into virtually unlimited energy, humans have evolved to be extremely sparing in how much energy the brain consumes, simply to avoid starving to death.
From superhuman facts to superhuman understanding
Okay, so we now concede that AI is better at learning than humans. But we know that Large Language Models are just stupid “stochastic parrots”. It’s just next-token prediction based on rote memorization of the internet. It doesn’t actually understand what it’s doing. Right?
Yeah, we're not so sure about that either. In fact, Geoff Hinton and many other pioneers in the space suspect that next-token prediction is bootstrapping much more complicated capabilities than mere memory. For example, you can play games with GPT-4, and we know that playing a game like chess requires a rudimentary world model. Humans play chess on a two-dimensional board, but GPT-4 may be "seeing" the board in some completely alien construct, such as lists of letters that would read as gibberish to us. Kind of like… DNA. Kind of a scary thought.
More broadly, it is quite possible that at this scale, next-token prediction actually requires a deep understanding of the topic you're predicting. So no, GPT-4 is not a stochastic parrot. One of the many challenges is that we don't have the capability to dissect current LLMs. Inside, it's just numbers. Somehow, very precisely chosen weights for a network of artificial neurons create intelligence seemingly out of nowhere. Even superhuman levels of intelligence. The task of working out what different neurons, layers, and regions of the network are up to is daunting. And since the economic incentives currently reward capability rather than understanding, it's quite likely that this investigative work will remain woefully behind the latest frontier models.
From superhuman understanding to existential risk
Therefore, in some sense, we're already loosening our grip on the beast. We don't really understand any Large Language Models in any meaningful way, even the smaller ones. There is a whole field of "Mechanistic Interpretability" and "AI Alignment" research that is rowing against the stream. But that current is picking up exponentially. Even the companies founded on the premise of AI safety, such as Anthropic, are falling into the economic trap by raising billions of fresh capital to beat OpenAI. Someone has to create GPT-5, right? So why not us? We'll do it better, and be more responsible, as long as we do it first. Yeah, that reasoning has issues.
The fact is that we, as members of a global capitalist society, simply care more about progress and productivity than safety. This is why we have the atomic bomb. We do gain-of-function research because we need to gain those lucrative functions. We can’t not have the functions. When looking over the abyss, humankind has proven itself incapable of simply walking away. Instead, the abyss calls us to dive into the darkness headfirst, eyes closed, hoping for a soft pillowy landing instead of gaping jaws and razor-sharp teeth.
Lots of people are waving their hands and sounding the alarm, not just Geoff Hinton. Roman Yampolskiy and Eliezer Yudkowsky have been warning about AI risk for nearly as long as Geoff Hinton has been doing Deep Learning. Jaan Tallinn, of Skype fame, puts the chance of each new generation ending humanity somewhere between 1% and 50%. Forget the 50% for the moment; just focus on the 1%.
When I asked GPT-4 with Code Interpreter to plot what 1% risk per generation looks like, it came out with this.
What's quite ironic is that GPT-4, all by itself, added the assumption that the risk doubles with each generation. So we start at 1% at GPT-3, and then exponentially approach near-certain disaster by around the tenth generation. Remember that in our previous exercise we estimated GPT will reach human brain equivalence by GPT-7, if not sooner.
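The arithmetic behind that curve is simple enough to check by hand. Below is a minimal sketch of the same assumption set, and they are assumptions, not measurements: a 1% catastrophe risk at GPT-3 that doubles each generation, compounded into the cumulative chance that at least one generation goes badly wrong.

```python
# Cumulative catastrophe probability under the assumptions above:
# 1% risk at GPT-3, doubling with every subsequent generation.
risk = 0.01          # assumed per-generation risk at GPT-3 (an assumption, not data)
survive = 1.0        # probability that every generation so far went fine

for gen in range(3, 13):                       # GPT-3 through GPT-12
    survive *= 1 - min(risk, 1.0)              # we also survive this generation
    print(f"GPT-{gen:<2}  per-gen risk {min(risk, 1.0):4.0%}   "
          f"cumulative risk {1 - survive:4.0%}")
    risk *= 2                                  # the doubling GPT-4 added on its own
```

On these numbers the cumulative risk crosses 50% around GPT-8 and effectively hits 100% by GPT-10, which is where the "certain disaster by the tenth generation" reading comes from.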
Suddenly it makes you wonder how long that might actually take. Let's ask GPT-4 for another plot.
So there's some good news, finally. Because the cost of training each new generation of models grows exponentially, and the related research and safety work presumably grows in effort along with it, things seem to be slowing down. Let's continue the plot to find out how much time we have left on the planet.
Right, so roughly 10 years until superhuman intelligence. Oh, and let’s make New Year’s Eve 2040 extra special. Better get working on that bucket list, people.
Then again, in that projection the releases proceed pretty linearly from GPT-4 onwards, whereas the slowdown had been exponential until then. Hmm, maybe it knows something we don't. However, when I forced GPT-4 to strictly follow the trend from the original GPT to GPT-4, it gave us some more breathing room.
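For the curious, here's a rough reconstruction of that kind of timeline extrapolation. It's my own sketch, not the code GPT-4's Code Interpreter actually ran, and the exact projections are very sensitive to how you fit the trend, which is exactly why the two plots disagree: one variant assumes the gap between releases keeps growing the way it has, the other assumes the GPT-3 to GPT-4 gap simply repeats.

```python
import numpy as np

# Two crude ways to project GPT release dates from the public release years.
# Illustrative only: the projections swing wildly depending on the chosen fit.
years = {1: 2018, 2: 2019, 3: 2020, 4: 2023}            # GPT-n -> release year
gaps = np.diff([years[n] for n in sorted(years)])        # [1, 1, 3] years apart

# Variant A ("slowing down"): fit an exponential trend to the growing gaps.
coef = np.polyfit(np.arange(1, len(gaps) + 1), np.log(gaps), 1)
next_gap = lambda i: float(np.exp(np.polyval(coef, i)))  # predicted i-th gap

year_a = year_b = years[4]
print("gen     growing gaps   fixed 3-year gap")
for gen in range(5, 9):                                  # project GPT-5 .. GPT-8
    year_a += next_gap(gen - 1)                          # gap leading into GPT-gen
    year_b += int(gaps[-1])                              # Variant B: repeat last gap
    print(f"GPT-{gen}   ~{year_a:.0f}          {year_b}")
```

Depending on which assumption you pick, GPT-5 lands mid-decade either way, while the projections for later generations diverge by decades.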
Phew. We can live out the rest of our days in peace. Fingers crossed! Either way, both versions give us GPT-5 by 2026. Right now, that doesn't seem crazy. We know it's hard, but we also know it's the most valuable technology ever known to man, and the smartest people on Earth are racing to be first.
Then again, we started at a 1% risk doubling per generation. Some might consider that rather conservative, given that Jaan Tallinn's range went up to 50% per generation. If any of this is even in the ballpark, then we are involuntarily, yet seemingly willingly, playing a very real game of Russian Roulette with the entire human species.
Even Yuval Harari picked up on the fact that LLMs are learning (or even hacking) our operating system. If next-token prediction requires deep understanding, then it will deeply understand humans by definition. Remember that sources like Twitter and Reddit are part of the training data for GPT-4. It knows. How we think emotionally and irrationally, how we work biologically, and how we collaborate socially and economically. If that is true of GPT-4, what can be said of GPT-5 and beyond?
Probably nothing. That's how the singularity works. The great curtain behind which an uncertain future lies. No peeking. We find out when we open the curtain.
We can just turn it off, right?
Wrong. I'm not going to engage in the debate over whether AI Safety is trivial or not. Eliezer Yudkowsky has done that work. The short answer is that it is far from trivial. In fact, dealing with an intelligence that is smarter than humans implies that it will simply outsmart us, no matter what we try.
Perhaps now you understand the desperation and urgency behind the proposed 6-month moratorium on giant AI experiments. I signed, but Geoff Hinton did not. He considered the exercise futile, but many signatories of similar stature say the statement was important to make regardless. It was a wake-up call.
My personal hope is that whatever calamity causes us to enforce real regulations, far beyond the proposed EU AI Act, doesn’t cause much suffering and can be reversed. That’s probably the best-case scenario.
We can already change the world and multiply human productivity with GPT-4 and its derivatives. It's completely game-changing, and since we survived the first few months, we can assume it's safe enough to use. Why risk more? We could choose to just make more efficient and cheaper versions of GPT-4 and smaller models, and focus our energies on AI Alignment work.
In my humble view, a safe and aligned version of GPT-4 would be far more valuable than rolling the dice on GPT-5, given the stakes literally couldn’t be higher.
What do you think?