Entering the era of open-source AI robotics
Yannick Léo
Partner - AI, Generative AI & Robotics at Emerton Data | Bridging Research & Business | Ph.D. in Computer Science
Last weekend (Oct. 26-27), I had the pleasure of participating in the first open-source AI robotics hackathon organized by Hugging Face, and even received a certificate (see banner)!
The idea and objectives of this hackathon are simple: bring together and expand the community through a hands-on workshop, going from robotic arm construction to using AI to accomplish tasks, all within the weekend.
With this post, I'm sharing some content (pictures and videos) and briefly explaining each step, along with useful links to help you feel the enthusiasm or even get started yourself.
First, a note: I’ll be discussing AI robotics — not just robotics — as the algorithms driving robot movement are increasingly AI-driven (e.g., reinforcement learning techniques).
Context: open-source AI robotics kickoff
This year, Hugging Face, at the initiative of its co-founder and Chief Science Officer Thomas Wolf and with the arrival of Remi Cadene (ex-Tesla), launched an open-source project named LeRobot (with a French touch), aiming to build a large open-source AI robotics community.
In parallel to this Python package, robots are first and foremost hardware, and entering the field usually costs thousands of euros. To address this, the team developed affordable robot arms (~€200), with parts easily ordered online and minimal 3D printing required. During the hackathon, they provided fifty of these kits so participants could train their own models.
Phase 1 - Robot arm construction
We now had all the parts to build the robots, along with a tutorial for each arm type (Moss and SO-100). Two arms have to be built: one that performs the task (called the follower) and one used to teleoperate it for AI training (called the leader). The screws are tiny and the kit is still immature, yet by the end of the day assembly was complete, with each arm powered by six motors.
Phase 2 - Configuration
Following assembly, the arms were connected to computers for configuration, consisting of three main steps: motor identification, calibration, and camera setup.
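To give a feel for what those three steps involve, here is a minimal sketch of the flow. The motor-bus methods are hypothetical placeholders (the real LeRobot scripts differ); only the OpenCV calls are real.

```python
import cv2  # OpenCV, used here only to verify the camera

def identify_motors(bus, n_motors=6):
    # Step 1: give each of the six motors on the serial bus a unique ID,
    # so the software knows which joint it is addressing.
    for joint in range(n_motors):
        input(f"Connect only motor {joint} and press Enter...")
        bus.assign_id(joint)  # hypothetical call

def calibrate(bus):
    # Step 2: record reference poses so raw encoder ticks map to joint
    # angles that the leader and follower arms agree on.
    input("Move the arm to the zero pose and press Enter...")
    return bus.read_positions()  # hypothetical call

def check_camera(index=0):
    # Step 3: verify the camera that will record the dataset works.
    cap = cv2.VideoCapture(index)
    ok, frame = cap.read()
    cap.release()
    assert ok, f"No frame from camera {index}"
    print(f"Camera {index} OK, frame shape: {frame.shape}")
```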
Phase 3 - Dataset recording for a specific task
Now comes the most fun part: recording a dataset of ~50 episodes (or more). Before teleoperating, the task must be defined. One can start with a simple task (putting an object in a fixed box) and make it more complex afterwards (you will see ours later on).
The environment is set up for each task, and the follower arm is teleoperated via the leader arm to complete it. Repeating the task 50 times in varied situations builds a robust dataset, capturing camera images and the positions of the arm's six motors.
Important note: the cameras must remain fixed during the whole process; training is not robust to any camera movement.
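To make the recording loop concrete, here is a rough sketch of one teleoperated episode. The leader/follower interfaces are hypothetical stand-ins for what LeRobot's recording script actually does; only the OpenCV camera call is real.

```python
import time

FPS = 30  # target control and recording frequency

def record_episode(leader, follower, camera, seconds=20):
    """Record one episode: mirror the leader arm on the follower while
    logging (image, state, action) tuples for imitation learning."""
    episode = []
    t_end = time.time() + seconds
    while time.time() < t_end:
        action = leader.read_positions()   # 6 joint positions (hypothetical)
        follower.write_positions(action)   # mirror the human's motion
        ok, frame = camera.read()          # synchronized camera image (OpenCV)
        if ok:
            state = follower.read_positions()
            episode.append({"image": frame, "state": state, "action": action})
        time.sleep(1 / FPS)
    return episode

# Repeat ~50 times, varying the object position between episodes:
# dataset = [record_episode(leader, follower, camera) for _ in range(50)]
```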
Phase 4 - Training and inference using imitation learning
Training is quite straightforward using a GPU: it takes only about one hour on an H100, or approximately eight hours on a local GPU (e.g., an RTX 4050 laptop GPU). You can already play with intermediate checkpoints; it learns quite quickly.
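For intuition, the imitation learning used here boils down to behavior cloning: regress the recorded action from the current observation. Below is a toy PyTorch sketch using a small MLP on precomputed image features; the actual LeRobot policies are far more sophisticated, and every name here is illustrative.

```python
import torch
import torch.nn as nn

class TinyPolicy(nn.Module):
    """Maps (image features, joint state) to the 6 target joint positions."""
    def __init__(self, image_dim=512, state_dim=6, action_dim=6):
        super().__init__()
        # In practice a vision backbone encodes the camera image; here we
        # assume features are precomputed to keep the sketch short.
        self.net = nn.Sequential(
            nn.Linear(image_dim + state_dim, 256), nn.ReLU(),
            nn.Linear(256, action_dim),
        )

    def forward(self, image_feat, state):
        return self.net(torch.cat([image_feat, state], dim=-1))

def train(policy, loader, epochs=10, lr=1e-4):
    # Behavior cloning: minimize the error between predicted and
    # demonstrated actions over the recorded episodes.
    opt = torch.optim.Adam(policy.parameters(), lr=lr)
    for _ in range(epochs):
        for image_feat, state, action in loader:
            loss = nn.functional.mse_loss(policy(image_feat, state), action)
            opt.zero_grad()
            loss.backward()
            opt.step()
```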
For inference, the environment should be set up as it was for training, but now the follower arm is driven autonomously by the trained AI model to achieve the task. This is the “moment of truth” that reveals whether the dataset covered enough scenarios for consistent performance.
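In code, inference is simply the recording loop with the trained policy replacing the human on the leader arm (same hypothetical interfaces as above):

```python
import torch

@torch.no_grad()
def run_policy(policy, follower, camera, encoder, steps=600):
    """Drive the follower autonomously: observe, predict, act."""
    policy.eval()
    for _ in range(steps):
        ok, frame = camera.read()
        if not ok:
            continue
        image_feat = encoder(frame)  # same image features as in training
        state = torch.tensor(follower.read_positions(), dtype=torch.float32)
        action = policy(image_feat, state)
        follower.write_positions(action.tolist())  # the arm acts on its own
```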
Kudos to us, we managed to perform two tasks: 1. putting an orange block (whose position could vary) into a fixed box, and 2. stacking three boxes (far more complex). These experiments were done with a single camera, even though two work better. Again, start small, then add complexity.
Phase 4bis - Training exercise using reinforcement learning
A promising approach based on reinforcement learning, called TD-MPC, is also implemented in a fork by Alexander Soare (here). The environment has a reward mask so that the robot can learn to achieve the task through free exploration. The teleoperator helps during exploration by avoiding impossible situations, placing the object back in a valid position when needed.
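To illustrate what a reward mask can mean in this pixel-based setting, here is a hypothetical example of my own (not the actual implementation in the fork): the reward is 1 when the tracked object's centroid lands inside a goal region of the image.

```python
import numpy as np

def reward_from_mask(frame, goal_mask, lower, upper):
    """Return 1.0 if the color-segmented object sits inside the goal region.

    frame:       HxWx3 camera image
    goal_mask:   HxW boolean mask marking the goal region (e.g., the box)
    lower/upper: per-channel color bounds for the object (e.g., orange)
    """
    obj = ((frame >= lower) & (frame <= upper)).all(axis=-1)
    if not obj.any():
        return 0.0  # object not visible: no reward
    ys, xs = np.nonzero(obj)
    cy, cx = int(ys.mean()), int(xs.mean())  # object centroid in pixels
    return 1.0 if goal_mask[cy, cx] else 0.0
```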
Conclusion and perspectives
The possibilities in open-source AI robotics are tremendous. Closed-source AI robotics is also progressing fast: two weeks ago, Elon Musk shared his vision with the presentation of Optimus Gen 2, which could become our personal assistant within a few years. Let the competition begin.
I hope you enjoyed this post. Feel free to join the Discord community and start exploring the AI robotics era with us.
Please reach out if you have any questions or comments.
Best,
Yannick Léo - Partner & AI Director at Emerton