Exploring Consciousness and Multi-Sensory Inputs in AI
Benjamin Weiss
Product Management Leadership. Helping companies grow and transform using Digital and AI solutions.
Last week, I explored several exciting developments in AI, anticipating the next major advances that seem to be just around the corner.
This week, let's delve into two significant opportunity areas for breakthroughs in general intelligence, realms where humans still outperform even the most advanced models: consciousness and multi-sensory input.
In a recent conversation, I likened OpenAI's GPT-4 and Google's Gemini Ultra to "brilliant babies" because, despite their sophisticated understanding of the world, they exhibit naive errors reminiscent of a child's thinking process.
Recursion as Consciousness
The concept of consciousness has been debated for millennia. I don’t want to get into that history. Here, I’ll simply propose that consciousness might be simulated in Large Language Models (LLMs) through recursion.
Humans have the capacity to reflect on their thoughts before expressing them. This internal dialogue, I’d argue, is a cornerstone, or perhaps the cornerstone, of consciousness. This process can, to an extent, be replicated in LLMs and their applications. For example, rudimentary versions like AutoGPT have begun to incorporate recursive functionalities, opening the way to new capabilities, like basic planning.
Why is this significant? The ability to internally debate, identify logical flaws, and refine thoughts is crucial for intelligent inference and error correction. Currently, models lack this self-reflective capacity, often resulting in inaccuracies or inconsistencies in their outputs. Try asking ChatGPT or Gemini to write something in exactly twenty words, and you’ll see what I mean. If AI models and applications could natively support recursive thinking, allowing them to evaluate and refine their inferences independently (today, it’s us humans playing that little voice in their head through the next prompt!), we may witness a leap towards more profound levels of intelligence.
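To make the idea concrete, here is a minimal sketch of that "little voice in the head" loop: a model drafts an answer, a critic checks it, and the feedback is folded back into the next prompt. Everything here is a stand-in assumption — `make_stub_model` replaces a real LLM API call with canned drafts, and the critic only enforces an exact word count (echoing the twenty-word test above) — but the recursive control flow is the point.

```python
from typing import Callable, Optional


def make_stub_model(canned_drafts: list[str]) -> Callable[[str], str]:
    """Return a fake `generate` that yields canned drafts in order.
    In a real system this would be a call to a hosted model."""
    drafts = iter(canned_drafts)
    return lambda prompt: next(drafts)


def critique(draft: str, target_words: int) -> Optional[str]:
    """Toy critic: enforce an exact word count, like the twenty-word test.
    Returns a revision instruction, or None if the draft passes."""
    n = len(draft.split())
    if n == target_words:
        return None  # satisfied: no feedback needed
    return f"Draft has {n} words; rewrite it with exactly {target_words}."


def refine(task: str, generate: Callable[[str], str],
           target_words: int, max_rounds: int = 5) -> str:
    """Recursive refinement: draft, critique, fold feedback into the prompt."""
    draft = generate(task)
    for _ in range(max_rounds):
        feedback = critique(draft, target_words)
        if feedback is None:
            break
        draft = generate(f"{task}\n[inner voice] {feedback}\nPrevious: {draft}")
    return draft


generate = make_stub_model(["a draft that is too long by far",
                            "exactly five words right here"])
print(refine("Describe AI in five words.", generate, target_words=5))
```

Today, a human usually plays the `critique` role by typing the next prompt; the leap the article anticipates is models running this loop natively, before they ever emit an answer.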
Multi-Sensory Inputs
While language is vital for encoding ideas, our senses—sight, sound, touch, smell, and taste—play a fundamental role in learning and interaction. Most existing LLMs operate without any of these sensory inputs, leaving them not just blind and deaf but cut off from every sense. However, with models like GPT-4 and Gemini incorporating visual capabilities, we're beginning to see the profound impact of additional sensory data on AI intelligence (and interestingly, the intelligence benefits are not limited to prompts that use those input types). Imagine the possibilities as AI begins to integrate a broader spectrum of sensory inputs, including non-human ones like environmental sensors for pressure, temperature, and humidity (and that’s just the start).
The evolution of electrical engineering has given us thousands of sensors capable of providing continuous, real-time data, potentially offering AI models sensory abilities beyond human capabilities. This sensory enhancement could enable AI to make more informed decisions about the world. This is where I think things start to get REALLY interesting (and, yes, dangerous too).
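As a hedged sketch of what that could look like in practice, here is one simple way to fold continuous sensor streams into a model's context as structured text. The readings, sensor names, and units below are hard-coded stand-ins for a real telemetry feed, not any particular product's API.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    """One sample from an environmental sensor (hypothetical schema)."""
    sensor: str
    value: float
    unit: str


def to_context(readings: list[Reading]) -> str:
    """Render a batch of sensor readings as a text block a language
    model could consume alongside an ordinary prompt."""
    lines = [f"{r.sensor}: {r.value} {r.unit}" for r in readings]
    return "Current environment:\n" + "\n".join(lines)


readings = [Reading("pressure", 101.3, "kPa"),
            Reading("temperature", 21.5, "degC"),
            Reading("humidity", 48.0, "%RH")]
print(to_context(readings))
```

Serializing sensors into text is the crudest possible bridge; the more interesting path is training models on sensor streams as native modalities, the way vision has been folded into GPT-4 and Gemini.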
We’re not just limited to model input here — on the output front, while AI has become adept at generating textual content, coding, and even creating images and videos (see Sora if you haven’t already), it lacks the ability to translate inferences into even the most basic physical actions (like walking, which isn’t so simple when you start thinking about it). The intersection of AI and robotics is a burgeoning field, with research focused on developing AI-controlled mechanical robots—a promising yet daunting frontier. We’ll save this one for a future article :)
I strongly believe that investments in recursion (consciousness) and multi-sensory experiences will usher in an even more transformative era in AI technology and automation. These advancements promise not just enhancements in AI's cognitive abilities but also, perhaps, a deeper understanding of our own consciousness and sensory perception (think on that for a moment!). As we continue to explore these realms, the potential for AI to not only mimic but also extend human capabilities comes increasingly within reach.