How will AI transcend from the 2D to the 3D world?
Mohamed (Mo) Elshenawy
President and CTO @ Cruise | Tech Founder | Board Member
A huge shift in AI technology is happening right in front of us on everyday city streets.
The AI that we’ve historically interacted with has remained mostly in a purely digital realm: generating an essay from a prompt, making shopping recommendations based on purchase history, or ranking newsfeed content. The recent Generative AI boom, in particular, has democratized access to AI, as applications like ChatGPT, Bard, and others are now commonplace to hundreds of millions. Industry, consumer, and investor interest in AI has never been higher.
But this is just the beginning.
At Cruise, our application of AI leaps out of a computer screen and into the real world. Our AI-powered robotaxis are driving through physical space, making split-second decisions based on a detailed and nuanced understanding of the world they operate in. To successfully and safely bring AI onto urban roads, we’re challenged to build responsible and advanced autonomous technologies.
Even in these early days, autonomous vehicles (AVs) are poised to usher in AI as a transformative force in the physical world.
The future of AI is 3D
The next wave of life-changing AI will come from a total paradigm shift: the move from 2D, digital AI to 3D, real-world AI. While digital outputs like flawed movie recommendations are innocuous, applications in the tangible, 3D realm—like AVs—demand unprecedented precision and reliability. In this domain, even a minute error can have serious implications.
An AV fleet faces the full intricacy of the physical world, which calls for a sophisticated and robust AI system. AVs must perceive, react, and decide, ensuring safety from point A to point B and refining their capabilities through every experience. They must manage everything from rapid shifts in road friction in wet weather to the nuances of interacting with human bystanders and passengers.
AVs are the perfect starting point for 3D AI
The AI powering an AV fleet integrates advanced multimodal sensor fusion, kinematic control algorithms, and real-time physics-based simulations. This amalgamation not only replicates but surpasses human perception. Humans can be inconsistent, show lapses in judgment, get tired, or sometimes neglect traffic laws. AVs operate with unwavering adherence to established parameters, which produces safer outcomes for everyone on the road.
A misstep from ChatGPT might evoke a chuckle, but in the domain of AVs, there's zero margin for error. These systems must anticipate and react in microseconds, discerning the cascading consequences of each decision for passengers and those sharing the road. Ensuring the safe and consistent deployment of such "3D AI" demands a technology stack fortified with layers of redundancy, vigilant validation processes, and intricate safety protocols.
Starting with a safety-critical mindset
At Cruise, safety is our top priority. From the earliest stages, our safety-critical mindset and framework were at the heart of our system design. Our beginnings and expansion in San Francisco exposed our fleet daily to scenarios with direct application to our model training.
Our AV architecture combines the best of modern machine learning techniques, which allow us to generalize to new scenarios, with traditional nonlinear control theory, which provides rigorous guarantees of constraint satisfaction and safety.
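To make the idea of pairing a learned component with a constraint-enforcing one concrete, here is a minimal sketch. It is not Cruise's architecture: it is a toy one-dimensional "safety filter" in which a hypothetical learned planner proposes an acceleration and a simple rule adjusts it so the vehicle can still stop within the available gap. The function names, the braking and acceleration limits, and the time step are all assumptions made for illustration.

```python
def stopping_margin(speed, gap, accel, dt, max_brake):
    """Metres of gap left after one step at `accel` followed by full braking.

    Uses a trapezoidal estimate of the distance covered during the step,
    which is slightly conservative if the vehicle stops mid-step.
    """
    next_speed = max(0.0, speed + accel * dt)
    travelled = (speed + next_speed) / 2.0 * dt
    braking_distance = next_speed ** 2 / (2.0 * max_brake)
    return gap - travelled - braking_distance


def safety_filter(proposed_accel, speed, gap, dt=0.1, max_brake=6.0, max_accel=2.0):
    """Return the acceleration closest to the proposal that keeps the margin >= 0."""
    # Respect the (assumed) actuator limits first.
    accel = min(max(proposed_accel, -max_brake), max_accel)
    if stopping_margin(speed, gap, accel, dt, max_brake) >= 0.0:
        return accel  # the learned proposal already satisfies the constraint

    # If even full braking cannot satisfy the constraint, full braking is
    # the best this toy filter can do.
    if stopping_margin(speed, gap, -max_brake, dt, max_brake) < 0.0:
        return -max_brake

    # The margin shrinks as acceleration grows, so bisect for the largest
    # acceleration that still leaves a non-negative margin.
    lo, hi = -max_brake, accel
    for _ in range(40):
        mid = (lo + hi) / 2.0
        if stopping_margin(speed, gap, mid, dt, max_brake) >= 0.0:
            lo = mid
        else:
            hi = mid
    return lo
```

With plenty of room ahead, the filter passes the proposal through unchanged; as the gap tightens, it trades the planner's preference for the constraint. A production system would reason over far richer dynamics and uncertainty than this single inequality.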
Data security is a critical pillar of our safety-first approach. Recognizing the imperfections of real-world data, we employ techniques like adversarial training to harden our models against potential adversarial attacks.
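As a rough illustration of what adversarial training means in practice, the sketch below trains a tiny logistic-regression model on both a clean input and a deliberately perturbed copy of it. The perturbation uses the fast gradient sign method (FGSM), a common textbook choice that is assumed here; the article does not say which attack or model family Cruise uses, and every name and number below is hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Probability of the positive class under a logistic model."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def sign(value):
    return (value > 0) - (value < 0)


def fgsm(weights, bias, x, label, epsilon):
    """Shift each feature by epsilon in the direction that increases the log loss."""
    error = predict(weights, bias, x) - label  # d(loss)/d(logit)
    return [xi + epsilon * sign(error * w) for xi, w in zip(x, weights)]


def adversarial_training_step(weights, bias, x, label, epsilon, lr):
    """One gradient step on the clean sample, then one on its adversarial copy."""
    x_adv = fgsm(weights, bias, x, label, epsilon)
    for sample in (x, x_adv):
        error = predict(weights, bias, sample) - label
        weights = [w - lr * error * si for w, si in zip(weights, sample)]
        bias = bias - lr * error
    return weights, bias


# Toy usage: the perturbed input lowers confidence in the true label,
# and one training step recovers some of it.
weights, bias = [1.0, -2.0], 0.0
x, label = [0.5, 0.2], 1
x_adv = fgsm(weights, bias, x, label, epsilon=0.1)
new_weights, new_bias = adversarial_training_step(weights, bias, x, label, 0.1, 0.1)
```

The design point is that the model sees worst-case variations of its inputs during training rather than only at deployment time. Real perception models are far larger, but the training loop has the same shape.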
To ensure our models remain versatile, we utilize transfer learning, bootstrapping the knowledge from pre-trained models and fine-tuning for AV-specific scenarios. Beyond just data augmentation, we implement Generative Adversarial Networks (GANs) to synthetically generate high-fidelity data, ensuring our models' exposure to a wide spectrum of driving conditions.
Additionally, to handle the dynamism of real-world scenarios, our system integrates reinforcement learning strategies, enabling the AVs to refine decision-making policies continuously, optimizing for both safety and efficiency. The chaotic streets of San Francisco have provided the optimal framework for training our fleet to achieve the complex, multi-layered decisions required for a safe and enjoyable ride.
To monitor and verify our models' robustness and reliability, we employ model explainability tools and continuous validation frameworks. These strategies validate our fleet's decision pathways and iteratively refine the models in response to emergent patterns and edge cases. This approach is critical to applying our models to new fleet testing and to our ongoing service expansion into new city environments.
Positioning a stack to scale
We chose the most complex environment first to hone our AI stack: San Francisco’s dense, chaotic roads. From the start, we aimed to expose our fleet to an onslaught of intricate street scenes and unique traffic scenarios that can only be found on crowded urban roads. In opting to solve for the “hardest problem first”, we’ve paved the way to swiftly scale operations to new markets and continuously improve upon our current performance. Critical to this is a system of testing, rapid iteration, and application to the environment (always with a focus on safety).
This generalized model approach has enabled us to expand to other cities quickly (did you see that we recently began testing in Nashville, Miami, Atlanta, and Los Angeles?). We utilize several techniques to ensure that our models are generalized for many unique situations, meaning that we don’t have to build a new AV stack or make other substantial changes for the AVs to work in other cities, like Phoenix or Austin.
What’s next? AI will further integrate into the physical world
The lessons learned from our safety-first approach provide a framework for AI’s deeper integration into the physical world. Driving, a time- and labor-intensive everyday activity, is a prime candidate for AI transformation. Beyond this early application, imagine the advancements in agriculture, manufacturing, logistics, the supply chain, healthcare, search and rescue, and new ways of machine and human interaction that AI can usher in. We are identifying and solving many of the big hurdles in applying AI to a 3D environment.
AVs are an ideal starting point for expanding AI to other industries, and we invite you to take this ride with us (mind you, it will be driverless).