What’s Next in AI? A look at the developments shaping AI’s future
Benjamin Weiss
Product Management Leadership. Helping companies grow and transform using Digital and AI solutions.
It’s almost hard to believe that ChatGPT was only introduced to the world in November 2022. A year ago, Nvidia was largely a gaming company whose chips were also popular for cryptocurrency mining. Microsoft was a business software company. Google was a search and media business. You get the point… over the past 14 months, it feels like the AI floodgates have been opened, and we’re bearing witness to a rapid pace of technological development in which entire businesses are repositioning themselves around this exciting technology.
Now, the reality is that much of this innovation has been in the works for years, and in several cases, decades. The explosion we all feel, in many ways, is a function of timing — where the combination of accelerated compute capability and scale has met the challenge of implementing many profound software ideas for training intelligent models.
It’s an exciting time, particularly for those willing to adapt to change.
And further change is coming.
The purpose of today’s article is to dive into several of the most exciting areas of active research and development in artificial intelligence — explaining what each is and why it’s important, and offering some concrete examples of what it may look like in the near future and what it may empower us to do.
Let’s dive in.
1. Scaling Models and Discovering Emergent Behaviors
What is it? Scaling involves enhancing AI models by increasing their size—think more parameters, data, and computational power. Emergent behaviors are complex capabilities that arise from these larger-scale models, which were not explicitly programmed or anticipated by their creators. One of the great breakthrough discoveries in modern AI research is the realization that size matters: as models become larger and larger (and the compute required to train them becomes available), new cognitive behaviors emerge from these increasingly large neural networks.
Why is it important? As models become larger, they begin to exhibit surprising abilities, such as advanced reasoning, creativity, coding, and translating across languages. This isn't just about making AI bigger; it's about unlocking new potential that can advance how AI understands and interacts with the world. While today’s frontier models are still just approaching the size and scale of the human brain (in terms of neuron and synapse counts), tomorrow’s models will far exceed that scale. If you think about the cognitive capabilities humans have compared to, say, a fish, it’s very possible we’ll discover even higher forms of cognitive behavior in AI models, offering capabilities humans may never actually possess with our slower-evolving biological hardware. Humans are doing some of the most incredible intelligence bootstrapping the world has ever known, which is an exciting, and scary, thought.
Examples and Uses: Imagine an AI that can invent new scientific theories, or even propose novel solutions to extraordinarily complex challenges like climate change (and clean energy development). These aren't distant dreams but the direction in which scaling and emergent behaviors may be heading, offering a glimpse into a future where AI could become a pivotal partner, maybe even a leader, in human problem-solving. And with unknown levels of intelligence possible ahead, it becomes almost impossible for us (simple) humans to predict what shape or form this might take! Yeah, scary thought there.
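To give a feel for what "scaling" means in compute terms, here is a back-of-the-envelope sketch using the commonly cited approximation that training a dense transformer costs roughly 6 × parameters × tokens in floating-point operations. The model sizes and token counts below are illustrative assumptions, not figures for any real model.

```python
# Rough training-compute estimate using the common approximation
# C ≈ 6 * N * D  (FLOPs ≈ 6 × parameter count × training tokens).
# The parameter and token counts below are illustrative assumptions.

def training_flops(params: float, tokens: float) -> float:
    """Approximate total training FLOPs for a dense transformer."""
    return 6 * params * tokens

# Hypothetical models: 7B and 70B parameters, each trained on 2T tokens.
small = training_flops(7e9, 2e12)
large = training_flops(70e9, 2e12)

print(f"7B model:  {small:.1e} FLOPs")
print(f"70B model: {large:.1e} FLOPs")
print(f"Scaling parameters 10x scales training compute ~{large / small:.0f}x")
```

The takeaway is simply that compute grows multiplicatively with model and dataset size, which is why each generation of frontier model has demanded so much more hardware than the last.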
2. Building Larger Context Windows
What is it? The “context window” refers to the amount of information an AI model can consider at one time for inference. Larger context windows enable the model to understand and generate responses based on a wider range of information. Think of the context window like “working memory” in your brain.
Why is it important? This development is crucial for creating AI that can engage in more meaningful, longer conversations, comprehend lengthy documents, and make connections between disparate pieces of information. It's about depth and coherence over longer stretches of interaction or content. Today, it’s often the case that a model must undergo fine-tuning to absorb a large corpus of new knowledge. But as context windows grow, it’ll be possible to use powerful foundation models with large context windows to generate inferences in seconds, rather than retraining new models over weeks and months.
Examples and Uses: Consider an AI that can read and analyze entire books, volumes, or genres in one go, or maintain a coherent and contextually relevant conversation over weeks and months. Entire application codebases will soon be able to live within many models’ context windows, enabling the model to make more informed decisions about how to augment, refactor, write unit tests, and so much more that’s highly relevant and tailored to that one specific application.
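As a concrete illustration of what "fitting in the window" means, here is a minimal sketch that estimates whether a body of text fits in a given context window, using the rough rule of thumb of about four characters per token for English text. The window sizes are illustrative and not tied to any specific model; real tokenizers vary by language and content.

```python
# Rough check of whether a body of text fits in a model's context window,
# using the common heuristic of ~4 characters per token for English text.
# The window sizes below are illustrative, not tied to any specific model.

def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate; real tokenizers vary by content."""
    return int(len(text) / chars_per_token)

def fits_in_window(text: str, window_tokens: int) -> bool:
    return estimate_tokens(text) <= window_tokens

book = "x" * 1_200_000  # ~1.2M characters, a long novel's worth of text

print(estimate_tokens(book))            # 300000
print(fits_in_window(book, 128_000))    # False: overflows a 128k window
print(fits_in_window(book, 1_000_000))  # True: fits a 1M-token window
```

The same arithmetic explains why growing windows change workflows: a document that once had to be chunked or fine-tuned into a model can simply be pasted into the prompt.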
3. Multimodality
What is it? Multimodal AI models can understand and generate content across different forms of media, such as text, images, audio, and video, integrating and interpreting information from multiple sources simultaneously.
Why is it important? This approach mirrors human cognition more closely, enabling AI to interact with the world in a more holistic manner. It opens up possibilities for richer, more intuitive interfaces and applications where AI can understand context and nuances across different types of data. Think about it this way… as humans our understanding of the world doesn’t come just from the things we read, it’s the combination of what we see, hear, smell, touch, and read. Multimodal models are learning this way too, and the combination of these input types leads to more profound learning and understanding.
Examples and Uses: Already you can ask today’s multimodal models like GPT-4 and Gemini to create logos, make an image to accompany an article (hint, hint), analyze photographs, and more using their native multimodal capabilities. But imagine an AI that can create a documentary by analyzing text-based historical records, photographs, and audio interviews, or one that can diagnose medical conditions by considering a patient's verbal descriptions, X-rays, and lab results together. Multimodality promises a future where AI can assist in creative (generative) and analytical tasks that combine multiple forms of data, not just text.
4. Search and Planning
What is it? This involves AI models that can perform complex reasoning and decision-making tasks by simulating a "chain of thought"—essentially, thinking through a problem step by step, much like a human would, considering multiple options, and weighing those options based on some understanding of their likely probabilities of success.
Why is it important? It represents a leap towards AI systems that can understand and solve complex problems in a structured and logical manner, paving the way for more sophisticated decision-making capabilities. This is one of the areas where even with today’s best models, humans still have a significant leg up. We can simulate entire chains of events in our head in ways that the models can’t. But that’s changing, and with even more compute coming over the next 18 months, expect to see rapid innovations emerge enabling models to recursively evaluate their own outputs, perform more sophisticated critical thinking, and start planning intelligently.
Examples and Uses: Imagine an AI assistant that can help plan a complex project or event, anticipating and solving problems before they arise. This capability could significantly enhance the effectiveness of AI in strategic planning and problem-solving roles. Where today’s models are useful for small, individual tasks in our everyday jobs, tomorrow’s models will have the cognitive skill to perform more sophisticated aspects of our jobs requiring multiple tasks to be completed, multiple options to be critically evaluated, and more.
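The idea of "considering multiple options and weighing their probabilities of success" can be sketched with a toy best-first search. The event-planning actions and probabilities below are made-up illustrations; real AI planners operate over learned models of the world rather than a hand-coded table, but the search principle is the same.

```python
import heapq

# A toy "search and plan" sketch: explore sequences of actions and keep
# the plan with the highest estimated probability of success. The task
# graph and probabilities below are made-up illustrations.

# state -> list of (action, next_state, probability the action succeeds)
OPTIONS = {
    "start": [("book_venue", "venue", 0.9), ("diy_venue", "venue", 0.5)],
    "venue": [("hire_caterer", "food", 0.8), ("self_cater", "food", 0.6)],
    "food":  [("send_invites", "done", 0.95)],
}

def best_plan(state="start", goal="done"):
    """Best-first search, ordered by cumulative success probability."""
    # heapq is a min-heap, so store negated probabilities.
    heap = [(-1.0, state, [])]
    while heap:
        neg_p, here, plan = heapq.heappop(heap)
        if here == goal:
            return plan, -neg_p
        for action, nxt, p in OPTIONS.get(here, []):
            heapq.heappush(heap, (neg_p * p, nxt, plan + [action]))
    return None, 0.0

plan, prob = best_plan()
print(plan)            # ['book_venue', 'hire_caterer', 'send_invites']
print(round(prob, 3))  # 0.684
```

Notice the planner prefers booking a venue and hiring a caterer (0.9 × 0.8 × 0.95 ≈ 0.68) over the cheaper-looking DIY path, because it weighs the whole chain of steps, not just the next one.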
5. Sensors and Robotics
What is it? The integration of AI with sensors and robotics involves the physical embodiment of AI in the world, allowing it to perceive and interact with its environment directly. Touch, sight, sound, even smell.
Why is it important? Anything we can convert from a physical signal to a digital one by way of sensors will become a new input to tomorrow’s models, giving them more signals from which to make judgments. And because humans are limited to our biological sensors (hands, nose, etc.), tomorrow’s AI models will likely have hundreds of analog sensors at their disposal, letting them make more informed and reasoned judgments than we humans can. What we’re really talking about here is moving AI out of the purely virtual domain and into tangible, real-world applications.
Examples and Uses: Imagine autonomous drones that can navigate and conduct search-and-rescue operations, or robots in manufacturing plants that can adapt to new tasks on the fly by seeing what’s going on immediately around them and sensing everything from subtle changes in temperature, to the over-torquing of a screw. This area promises to bring AI into our physical world in so many new ways.
6. Model Efficiency
What is it? Model efficiency refers to the development of AI algorithms and architectures that are not only powerful but also optimized for energy use and computational resources. This encompasses designing AI systems that can achieve high performance with minimal environmental impact.
Why is it important? The computational demand of state-of-the-art AI models has been doubling approximately every few months, leading to growing concerns about their energy consumption and environmental impact. Making AI models more efficient is crucial for scaling AI technologies. It's about balancing the drive for advanced capabilities with the necessity of environmental stewardship and a basic realization that energy is expensive and finite.
Examples and Uses: As AI models grow in size and complexity, they require more computational power, which can lead to significant energy consumption. Data centers already consume significant amounts of energy (estimated at 2-3% of total US energy consumption), and the GPU compute stacks required for AI training and inference often draw 2-4X more energy. Developing more efficient models is crucial for making AI more sustainable, accessible, and private. To that last point, expect to see increasingly sophisticated AI models that can run entirely on your mobile phone. This sort of “edge” AI computing means not having to send all your prompts, which often contain a lot of personal information, off to a data center where they can be used for targeted advertising.
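One of the workhorse efficiency techniques behind on-device AI is quantization: storing weights as small integers plus a scale factor instead of 32-bit floats, cutting memory roughly 4x. The sketch below shows the simplest symmetric int8 scheme with made-up weights; production schemes (per-channel scales, outlier handling) are considerably more involved.

```python
# A minimal sketch of post-training quantization, one common
# model-efficiency technique: store weights as 8-bit integers plus a
# single scale, cutting memory ~4x versus 32-bit floats. Weights here
# are illustrative; real schemes are more sophisticated.

def quantize(weights):
    """Map floats to int8 range [-127, 127] with one symmetric scale."""
    scale = max(abs(w) for w in weights) / 127
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

w = [0.52, -1.27, 0.03, 0.88]
q, scale = quantize(w)
w_hat = dequantize(q, scale)

print(q)      # small integers, storable in 1 byte each
print(max(abs(a - b) for a, b in zip(w, w_hat)))  # tiny rounding error
```

The trade is a small amount of reconstruction error for a large drop in memory and bandwidth, which is often what makes a model fit on a phone at all.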
7. Personalization
What is it? Personalization in AI refers to the ability of models to learn from individual user interactions, preferences, and behaviors to tailor responses, recommendations, and services specifically to each user.
Why is it important? Personalized AI has the potential to significantly enhance the end user experience by delivering content, suggestions, and interactions that are most relevant and meaningful to the individual. It moves beyond one-size-fits-all solutions to create a more intuitive and user-centric approach. None of this is new; it’s been happening in all facets of our digital lives for decades now, but AI that adapts its responses to individual preferences and needs is just beginning to emerge.
Examples and Uses: Imagine a learning app that adapts its teaching methods and content based on the student's learning pace, preferred style and interests, or a health app that tailors fitness and nutrition advice to the user's lifestyle, goals, and health data. Where today only the top 1% may be able to afford a private tutor, a health coach, a private chef, and so on… personalized AI models may soon offer similar levels of personal utility at scale and low cost.
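A stripped-down version of "learning from individual interactions" can be sketched as a per-user interest profile updated from feedback. The topics, feedback events, and update rule below are all illustrative assumptions, not any product's actual algorithm.

```python
# A toy sketch of personalization: keep a per-user interest profile as
# exponentially weighted topic scores, nudged by each piece of feedback,
# then rank content against it. Topics and weights are illustrative.

def update_profile(profile, topic, liked, alpha=0.3):
    """Nudge a topic's score toward 1 (liked) or 0 (skipped)."""
    target = 1.0 if liked else 0.0
    old = profile.get(topic, 0.5)  # 0.5 = no prior signal either way
    profile[topic] = (1 - alpha) * old + alpha * target
    return profile

profile = {}
feedback = [("fitness", True), ("fitness", True),
            ("finance", False), ("cooking", True)]
for topic, liked in feedback:
    update_profile(profile, topic, liked)

ranked = sorted(profile, key=profile.get, reverse=True)
print(ranked)  # topics ordered by learned interest
```

Real personalized models go far beyond weighted topic scores, but the core loop is the same: observe, update a user-specific state, and tailor the next response to it.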
8. Explainability
What is it? Explainability in AI involves the development of models and systems that can provide understandable explanations and citations for their decisions, predictions, and actions. It's about making AI's "thought process" transparent and interpretable to us humans. As we all know well, models have the ability to invent, or “hallucinate,” things all the time, and explainability seeks to solve some of the trust issues that arise from this behavior.
Why is it important? As AI systems become more involved in critical decision-making, from medical diagnoses to legal judgments, the need for transparency and trust in these systems grows. Explainability ensures that AI decisions can be understood, trusted, and, importantly, challenged by humans, which is crucial for a host of ethical and obvious practical reasons.
Examples and Uses: Consider a medical diagnosis AI that not only identifies a disease but also provides the reasoning behind its diagnosis, allowing doctors to understand the AI's judgment. This push towards explainable AI is key to building trust and ensuring that AI technologies are used responsibly and intelligently.
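One simple, widely used explainability technique is leave-one-out (ablation) attribution: score the input, re-score it with each feature removed, and treat the score drop as that feature's contribution. The "model" below is a made-up weighted sum standing in for a real diagnostic system, purely to show the mechanics.

```python
# A minimal sketch of leave-one-out (ablation) attribution, one common
# explainability technique. The scoring function is a made-up stand-in
# for a real model; its weights are illustrative only.

def score(features):
    """Stand-in model: weighted sum of symptom indicators."""
    weights = {"fever": 0.4, "cough": 0.3, "fatigue": 0.1}
    return sum(weights.get(f, 0.0) for f in features)

def attribute(features):
    """Each feature's contribution = score drop when it is removed."""
    base = score(features)
    return {f: base - score([g for g in features if g != f])
            for f in features}

symptoms = ["fever", "cough", "fatigue"]
contrib = attribute(symptoms)
for f, c in sorted(contrib.items(), key=lambda kv: -kv[1]):
    print(f"{f}: {c:+.2f}")  # fever contributes most to the score
```

Attribution methods like this give a doctor something concrete to inspect and challenge, which is exactly the trust-building role explainability is meant to play.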