LLMs all the way to AGI? Maybe but maybe not
It’s perhaps interesting that the field of artificial intelligence has always gone on without trying too hard to define what intelligence is. And that’s probably perfectly fine, as the field is poorly named anyway. I think it should simply be called computer science; I’m not sure there is anything that connects all of it closely enough to even call it a coherent subfield of computer science. I’ve always thought a good definition is that artificial intelligence is the field of study concerned with building algorithms that allow computers to do what they currently cannot.
But anyways… If we are going to talk about AGI (Artificial General Intelligence), we do have to dig into this a bit more. If intelligence isn’t defined, we certainly can’t define AGI exactly. But people tend to use a definition along the lines of: a machine that can do any intellectual task that a human can do. Thus, it is defined behaviorally. At least that is something measurable.
Obviously we aren’t there yet, and I’m not sure we would want such a thing or that it won’t kill us all. But also consider that since the dawn of the electronic computer, if not way before, we have had machines that can do some thinking tasks better than humans: counting, storing and retrieving information, playing chess, etc. With each of these advances, people were amazed and thought it was a great step toward human-level thinking; that AGI was just around the corner.
But usually we look back with less amazement. Such things look simpler and cruder; we can see what they can do and what their limitations are. We usually begin to see that the task is really just a computation. The problem they were trying to solve just looks a lot easier to us. We recognize that there is still a long way to go from that to an AGI.
This time around, with LLMs, we are probably making the same mistake we always do. Yes, they are amazing, and yes, they are useful, but counting was amazing and useful as well, and it wasn’t AGI. It didn’t matter how much faster and more accurate computers were at counting. It was just counting, and computers could not, for example, answer a written question (like they can now).
I and others have written about all the great things that LLMs can do, and they will certainly be useful. But that raises the question: what can’t they do? No one really knows the answer. Sure, there are lots of things ChatGPT can’t do, but people would argue that we are just scratching the surface of what LLMs can do and that they will get bigger, faster, and better as we continue to innovate and train them on more data.
But the idea that doing these things is going to launch LLMs to AGI is nothing more than an assumption that some people make. No one seems to know whether LLMs have fundamental limitations that prevent that advance to AGI. And we can’t really even try to make that argument without specifying exactly what we mean by an LLM.
An analogy I like to make is that we all know we can’t drive a car to the moon. Right? But if we are willing to open up the definition of a car to include a three-stage chemical rocket, then suddenly we can. But you see, this just becomes a game of words. Without defining exactly what an LLM is, and what it is not, we can’t even begin to argue about its limits.
For a closer example, consider the perceptron, a close relative of logistic regression. These are the base components of today’s neural networks. When they were invented, people also thought they might be general learners which, for all we knew, could become an AGI. Marvin Minsky (with Seymour Papert), however, came along and proved that they could only learn linearly separable problems and could not learn, for example, the XOR function. This was quite a letdown. But then people came along and created the multi-layer perceptron, today’s neural network, and Minsky’s argument no longer applied.
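To make the XOR point concrete, here is a minimal sketch in Python. The grid of candidate weights and the hand-picked hidden-layer weights are my own illustration, not anything from Minsky’s proof: a single threshold unit never labels all four XOR points correctly, while one hidden layer with fixed weights does.

```python
import itertools
import numpy as np

# The four XOR inputs and their labels.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])

def linear_unit(x, w, b):
    """A single perceptron-style unit: threshold on a weighted sum."""
    return (x @ w + b > 0).astype(int)

# Brute-force a grid of weights and biases: no single linear unit in this
# grid classifies all four XOR points correctly. (The grid is only an
# illustration; the real result is the linear-separability argument.)
grid = np.linspace(-2, 2, 41)
found = any(
    np.array_equal(linear_unit(X, np.array([w1, w2]), b), y)
    for w1, w2, b in itertools.product(grid, grid, grid)
)
print("single unit solves XOR:", found)  # False

# One hidden layer with hand-picked weights is enough:
# h1 = OR(x1, x2), h2 = AND(x1, x2), output = h1 AND NOT h2 = XOR.
def two_layer(x):
    h1 = linear_unit(x, np.array([1, 1]), -0.5)    # OR
    h2 = linear_unit(x, np.array([1, 1]), -1.5)    # AND
    h = np.stack([h1, h2], axis=1)
    return linear_unit(h, np.array([1, -1]), -0.5)  # h1 AND NOT h2

print("two-layer net solves XOR:", np.array_equal(two_layer(X), y))  # True
```

The brute-force grid is of course not a proof; the actual argument is that XOR is not linearly separable, so no choice of weights for a single unit can ever work, whereas adding one hidden layer removes that limit.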
The lesson there is that you can prove that certain approaches have their limits, and then you have to add something new to go beyond them. We don’t know whether what we now call LLMs are close to one of those boundaries, or, if they are, how much work will be required to get beyond it. And by work, I don’t mean throwing more electricity at it. There is no guarantee that that gets you over the next hump, just as it wouldn’t have for perceptrons. Throwing more data and electricity at the problem is indeed mostly what has moved neural networks forward over the past 30 years, but that doesn’t mean the trend will continue forever.
So, LLMs might not lead to AGI and, for all we know, we could be about to hit another wall and then a major slowdown on the road to AGI. I’m not saying we definitely are; just saying we might be. Let’s keep working on other improvements (and hope they don’t end the world), but let’s also have some healthy skepticism and humility and not get carried away irrationally. Remember, we might be very embarrassed in the future if we do.
I suspect someone clever might come along soon and prove some kind of theorem about the limits of LLMs, like what Minsky did for perceptrons. Perhaps they will show that LLMs can only combine knowledge and can therefore only solve problems that can be solved by knowledge synthesis.
I liken this to knowledge being a bunch of points in some space and LLMs being algorithms that can form convex combinations of those points. If this is true, however, they will be limited to the convex hull of human knowledge. They won’t be able to intelligently expand outward, which is the thing humans seem to be able to do. Perhaps we will show they are forever limited in their creativity. I think I mean this metaphorically, but if convex hulls are indeed the root of the proof, remember that you read it here first.
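For readers unfamiliar with the term, a convex combination is just a weighted average with nonnegative weights that sum to one, and the convex hull is the set of all such averages. This is only the textbook definition, restated to make the metaphor concrete:

```latex
% A convex combination of knowledge points x_1, ..., x_n:
x = \sum_{i=1}^{n} \lambda_i x_i,
\qquad \lambda_i \ge 0, \qquad \sum_{i=1}^{n} \lambda_i = 1.

% The convex hull is the set of every point reachable this way:
\operatorname{conv}\{x_1, \dots, x_n\}
  = \left\{ \sum_{i=1}^{n} \lambda_i x_i \;\middle|\;
    \lambda_i \ge 0,\ \sum_{i=1}^{n} \lambda_i = 1 \right\}.
```

Under the metaphor, an LLM could reach any point inside that hull but never one outside it; whether that is actually the right model of what LLMs do is, of course, the open question.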