NVIDIA Robotics Ecosystem Innovations during CES 2024
Gerard Andrews
Strategic Alliances | Product Marketing Director | AI Evangelist | Angel Investor
During CES 2024 last week, several robotics companies shared how they're using NVIDIA tools and software to apply AI to their robotic products, including vision AI, large language models (LLMs), vision language models (VLMs), and vision transformers (ViTs).
At one end of the spectrum, robotics companies are using LLMs to understand text or voice commands, enabling Human-Robot Interaction (HRI): general image data, paired with text data, helps a robot learn a foundational semantic understanding of the visual world. HRI is essential for bridging the communication gap between robots and humans, especially when a robot faces uncertainty or missing information. Another example of AI applied in robotics is the use of ViTs in neural navigation models for mobile delivery robots. These models pair a vision transformer for learning to see with a policy transformer for learning to move.
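The article does not show how LLM-driven HRI works in code, but the core idea can be sketched in a few lines: the robot grounds a natural-language command against what its perception stack currently sees, and asks a clarifying question when the command is ambiguous or references something missing from the scene. This is a toy illustration; the function names and the word-matching "language model" are stand-ins for a real LLM pipeline.

```python
# Toy sketch of LLM-style Human-Robot Interaction (HRI): ground a text
# command against visible objects, and ask for clarification when the
# robot faces uncertainty or missing information. All names here are
# illustrative, not a real NVIDIA or LLM API.

def interpret_command(command: str, visible_objects: list[str]) -> dict:
    """Map a text command to a robot action, or request clarification."""
    words = command.lower().split()
    # Find which visible objects the command actually mentions.
    mentioned = [obj for obj in visible_objects if obj in words]
    if len(mentioned) == 1:
        return {"action": "pick", "target": mentioned[0]}
    if len(mentioned) == 0:
        return {"action": "clarify",
                "question": "I don't see that object. Can you describe it?"}
    return {"action": "clarify",
            "question": f"Which one do you mean: {', '.join(mentioned)}?"}

scene = ["wrench", "screwdriver", "box"]
print(interpret_command("please pick up the wrench", scene))
print(interpret_command("grab the hammer", scene))
```

In a production system the keyword match would be replaced by an LLM call that emits a structured action, but the clarify-on-ambiguity pattern is the same.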
Why is AI in robotics important?
AI, and the latest advancements in generative AI, represents a true paradigm shift in the robotics industry. Integrating robots into a production environment involves talking to robots, providing feedback, giving guidance, and asking the robot to complete a task. To achieve this, the robot first needs to comprehend the scene around it.
Scene comprehension by a robotic agent is a critical function that relies on processing sensor data to extract insights for decision-making. Many robot companies use NVIDIA Jetson to process onboard sensor data quickly and execute the appropriate robot behavior at the edge. For example, some sensor companies use Jetson to apply proprietary algorithms that enhance sensor performance, optimize for scene versatility, or synchronize multiple cameras for robot perception. Developers interested in accelerating AMR subsystem development can quickly prototype with the NVIDIA Isaac and Jetson platforms by leveraging the Nova Carter developer reference platform. Built in collaboration with Segway Robotics, it is equipped with both stereo cameras and LiDAR.
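The sense-then-act pattern described above can be sketched as a minimal edge perception loop: each sensor frame is processed and mapped to a robot behavior. The perception "model" below is a deliberate stub; on a Jetson-class device an accelerated neural network (e.g. deployed via TensorRT) would run in its place.

```python
# Minimal sketch of an edge perception loop: process each sensor frame on
# the device and pick a behavior. detect_obstacle() is a stub standing in
# for a real accelerated perception model.

def detect_obstacle(frame: list[float]) -> bool:
    """Stub perception model: flags an obstacle if any depth reading is close."""
    return any(depth < 0.5 for depth in frame)  # depth readings in meters

def choose_behavior(frame: list[float]) -> str:
    """Map a processed frame to a robot behavior."""
    return "stop_and_replan" if detect_obstacle(frame) else "continue"

# Simulated stream of LiDAR-like depth frames.
frames = [[2.1, 3.0, 1.8], [0.4, 2.2, 3.1]]
behaviors = [choose_behavior(f) for f in frames]
print(behaviors)  # ['continue', 'stop_and_replan']
```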
While applying a vision AI model helps solve a discrete problem for robots, a robot's ability to behave across increasingly complex scenarios requires broader training and testing in a simulation environment. For this process, development often starts by modeling the desired environment with NVIDIA Isaac Sim, an application built on NVIDIA Omniverse. To start, Isaac Sim's photorealistic graphics enable synthetic data generation (SDG), which serves as the image dataset for training vision AI. Isaac Sim also includes the PhysX physics engine and ray-tracing tools for high-fidelity LiDAR simulation, so a robot's perception and planning can be trained and tested entirely in simulation. Building a simulation environment requires virtual world models and other 3D assets, many of which are becoming freely available, such as through the Omniverse Warehouse Builder extension. Using NVIDIA Picasso, customized technical art needed for robot simulation can be generated in minutes instead of days, widening the scope of scenarios a robotics company can train and test for in a shorter time.
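The essence of SDG is domain randomization: scene parameters such as lighting and object pose are sampled at random, and because the scene is authored programmatically, every rendered frame comes with perfect ground-truth labels for free. The toy sketch below illustrates that idea with a stubbed-out "renderer"; in practice a tool like Isaac Sim would produce the actual images.

```python
import random

# Toy illustration of synthetic data generation (SDG) via domain
# randomization. Scene parameters are sampled, and the label is known
# exactly because we authored the scene; no real rendering happens here.

def sample_scene(rng: random.Random) -> dict:
    """Randomize the parameters that would drive a simulated render."""
    return {
        "object": rng.choice(["pallet", "box", "forklift"]),
        "light_intensity": rng.uniform(0.2, 1.0),
        "position": (rng.uniform(-5.0, 5.0), rng.uniform(-5.0, 5.0)),
    }

def generate_dataset(n: int, seed: int = 0) -> list[tuple[dict, str]]:
    """Return n (scene_parameters, label) pairs with free ground truth."""
    rng = random.Random(seed)
    dataset = []
    for _ in range(n):
        scene = sample_scene(rng)
        dataset.append((scene, scene["object"]))  # label comes from the scene spec
    return dataset

dataset = generate_dataset(100)
print(len(dataset))  # 100
```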
AI for robotics applications requires two computers: one for training in the cloud and one for the robot's runtime environment. The first computer, called an "AI Factory," is central to creating and continuously improving AI models. AI factories can use NVIDIA's data center compute infrastructure along with its AI and NVIDIA Omniverse platforms to simulate and train AI models. The second computer represents the robot's runtime environment, which varies depending on the application. It could run in the cloud or data center, in an on-premises server for tasks like defect inspection in semiconductor manufacturing, or within an autonomous machine equipped with multiple sensors and cameras using NVIDIA Jetson.
AI maturation is on its way as robot companies begin to leverage foundation models, or a version of a world model, where neural networks are pre-trained on massive amounts of data without specific use cases in mind. When a model is trained on a broad set of tasks, it can generalize, achieving better performance on downstream tasks. Some robot developers achieve this by collecting data with onsite robots to create a Reinforcement Learning (RL) Factory. With the advent of GPT-style architectures, a single foundation model can perform various tasks, such as NLP and image recognition, that previously required separate models. AI in robotics achieves a continuous improvement loop when a robot quickly processes and collects real-world data at the edge, that data is combined with additional simulation data for training, and an "AI Factory" with accelerated computing escalates the robot's performance on the specific tasks needed in a workspace, such as object identification, 3D perception, grasp prediction, and place prediction.
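The continuous improvement loop described above can be sketched as a simple data flywheel: the edge robot collects real-world samples, simulation data augments them, and the AI Factory retrains and redeploys an improved model. Every function below is a toy stand-in for real data-collection and training infrastructure.

```python
# Sketch of the continuous-improvement ("data flywheel") loop: edge data
# collection -> simulation augmentation -> AI Factory retraining. All
# functions are illustrative stubs, not a real training pipeline.

def collect_edge_data(num_samples: int) -> list[str]:
    """Stand-in for real-world frames captured on the robot at the edge."""
    return [f"real_frame_{i}" for i in range(num_samples)]

def augment_with_simulation(real: list[str], sim_ratio: int = 2) -> list[str]:
    """Pad the real dataset with synthetic frames from simulation."""
    sim = [f"sim_frame_{i}" for i in range(len(real) * sim_ratio)]
    return real + sim

def train_in_ai_factory(model_version: int, data: list[str]) -> int:
    """Stub 'AI Factory' step: retraining produces a new model version."""
    return model_version + 1

model_version = 0
for cycle in range(3):  # three turns of the flywheel
    real = collect_edge_data(num_samples=10)
    data = augment_with_simulation(real)
    model_version = train_in_ai_factory(model_version, data)

print(model_version)  # 3
```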
If you missed the robotics portion of the NVIDIA special address during CES week, check out what we announced here.