Entering the era of open-source AI robotics

Last weekend (Oct. 26-27), I had the pleasure of participating in the first open-source AI robotics hackathon organized by Hugging Face, and I even received a certificate (see banner)!

The idea and objectives of this hackathon are simple: bring together and expand the community through a hands-on workshop, going from robotic arm construction to using AI to accomplish tasks, all within a weekend.


My teammate Julian Dierkes (left) and I (right)

With this post, I'm sharing some content (pictures and videos) and briefly explaining each step, along with useful links to help you catch the excitement or even get started yourself.

First, a note: I'll be discussing AI robotics, not just robotics, as the algorithms driving robot movement increasingly rely on AI (e.g., imitation and reinforcement learning techniques).

Context: the open-source AI robotics kickoff

This year, Hugging Face, at the initiative of its co-founder and Chief Science Officer Thomas Wolf and with the arrival of Remi Cadene (ex-Tesla), launched an open-source project named LeRobot (with a French touch), aiming to build a large open-source AI robotics community.

In parallel to this Python package, there is the hardware barrier: robots are physical machines first, and entering the field has typically cost thousands of euros. To address this, the team developed affordable robotic arms (~€200), with parts easily ordered online and minimal 3D printing needs. During the hackathon, they provided fifty of these kits so that participants could train their own models.

Phase 1 - Robot arm construction

With all the pieces in hand, we followed the assembly tutorial for our arm type (Moss or SO-100). Two arms must be built: one that performs the task (called the follower) and one that a human teleoperates to generate training data (called the leader). The screws are tiny and the kit is still maturing, yet by day's end assembly was achieved, with each arm powered by six motors.


Full SO-100 arm kit (the Moss arm appears in the rest of the post)


Working on the 12 motors, with the control board at the bottom left, ...


Phase 2 - Configuration

Following assembly, the arms were connected to computers for configuration, which consists of three main steps: motor identification, calibration, and camera setup.

  1. Motor identification: this step is essential since all motors arrive with their ID set to 1. A unique ID (from 1 to 6) is written to each motor on both arms to ensure correct detection.
  2. Calibration: this involves setting the arm in three reference positions: zero (all angles at zero), rotation (angles at 90°), and rest. Although this step is currently manual, automatic calibration is expected soon and is already nearly in place.
  3. Camera(s) setup: this is where the magic of AI begins. A camera streams live images, allowing the AI model to predict the robot's next moves at 30 frames per second (roughly one frame every 33 milliseconds). No special camera is needed; a laptop or phone camera works, and using two cameras improves depth estimation. A quick camera sanity-check sketch follows this list.
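To get a feel for the camera step, here is a minimal sketch (not LeRobot's own tooling) for checking which camera index streams correctly and what nominal frame rate the driver reports, assuming OpenCV is installed (`pip install opencv-python`):

```python
import cv2

def check_camera(index: int = 0, n_frames: int = 60) -> None:
    """Open a camera index, report its nominal FPS, and preview frames."""
    cap = cv2.VideoCapture(index)
    if not cap.isOpened():
        print(f"No camera found at index {index}")
        return
    fps = cap.get(cv2.CAP_PROP_FPS)  # nominal rate reported by the driver
    print(f"Camera {index}: nominal {fps:.0f} FPS")
    for _ in range(n_frames):
        ok, frame = cap.read()
        if not ok:
            print("Frame grab failed")
            break
        cv2.imshow(f"camera {index}", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):  # press q to stop early
            break
    cap.release()
    cv2.destroyAllWindows()

if __name__ == "__main__":
    check_camera(0)  # try 0, 1, 2, ... until you find the right device
```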


Picture credit: Hugging Face (LeRobot GitHub repo)

Phase 3 - Dataset recording for a specific task

Now comes the most fun part: recording a dataset of ~50 episodes (or more). Before any teleoperation, the task must be defined. One can start with a simple task (putting an object in a fixed box) and make it more complex afterwards (you will see ours later on).

The environment is set up for each task, and the follower arm is teleoperated via the leader arm to complete it. Repeating the task 50 times in varied situations builds a robust dataset, capturing the camera images and the positions of the six motors on the arm; a schematic recording loop is sketched below.
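For intuition, here is a schematic 30 Hz recording loop. It is purely illustrative: LeRobot ships its own recording scripts, and `camera`, `read_leader_joints`, and `drive_follower` are hypothetical stand-ins for the real camera and motor-bus interfaces.

```python
import time

FPS = 30                 # recording rate used during the hackathon
EPISODE_SECONDS = 20     # arbitrary episode length for this sketch

def record_episode(camera, read_leader_joints, drive_follower):
    """Capture (image, 6 joint positions) pairs while teleoperating."""
    episode = []
    period = 1.0 / FPS
    for _ in range(FPS * EPISODE_SECONDS):
        t0 = time.perf_counter()
        joints = read_leader_joints()   # 6 motor positions on the leader arm
        drive_follower(joints)          # mirror them on the follower arm
        ok, frame = camera.read()       # camera image at the same instant
        if ok:
            episode.append({"image": frame, "state": joints})
        # sleep away the rest of the frame budget to hold ~30 Hz
        time.sleep(max(0.0, period - (time.perf_counter() - t0)))
    return episode
```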

Important note: the cameras must stay fixed during the whole process; the training is not robust to any camera movement.

Phase 4 - Training and inference using imitation learning

Training is quite straightforward with a GPU: it takes about one hour on an H100, or approximately eight hours on a local GPU (e.g., an RTX 4050 laptop GPU). You can already play with intermediate checkpoints, as the policy learns quite quickly.
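The technique here is imitation learning: supervised regression from observations (camera image plus joint state) to the demonstrated actions. LeRobot's actual policies (e.g., ACT) are far more capable, but a toy behavior-cloning loop over a hypothetical dataset of (image, state, action) tuples conveys the core idea:

```python
import torch
import torch.nn as nn

class TinyPolicy(nn.Module):
    """Map (image, joint state) to the next joint targets."""

    def __init__(self, n_joints: int = 6):
        super().__init__()
        self.vision = nn.Sequential(            # tiny CNN image encoder
            nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Sequential(              # fuse image and joint state
            nn.Linear(32 + n_joints, 64), nn.ReLU(),
            nn.Linear(64, n_joints),            # predict next joint targets
        )

    def forward(self, image, state):
        return self.head(torch.cat([self.vision(image), state], dim=-1))

def train(policy, loader, epochs: int = 10):
    """loader yields (image, state, action) batches from recorded episodes."""
    opt = torch.optim.Adam(policy.parameters(), lr=1e-4)
    for _ in range(epochs):
        for image, state, action in loader:
            loss = nn.functional.mse_loss(policy(image, state), action)
            opt.zero_grad()
            loss.backward()
            opt.step()
```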

For inference, the environment should be set up as for training, but now the follower is driven autonomously by the trained AI model to achieve the task. This is the moment of truth that reveals whether the dataset covered enough scenarios for consistent performance.
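Continuing the same sketch (with the same hypothetical helpers), inference is simply a loop that feeds the latest camera frame and joint state to the trained policy and sends the predicted joint targets to the follower:

```python
import torch

@torch.no_grad()
def run_policy(policy, camera, read_follower_joints, drive_follower, steps=600):
    """Drive the follower autonomously for `steps` frames (~20 s at 30 Hz)."""
    policy.eval()
    for _ in range(steps):
        ok, frame = camera.read()
        if not ok:
            continue
        # HxWx3 uint8 frame -> 1x3xHxW float tensor in [0, 1]
        image = torch.from_numpy(frame).permute(2, 0, 1).float().unsqueeze(0) / 255
        state = torch.tensor(read_follower_joints()).float().unsqueeze(0)
        action = policy(image, state).squeeze(0).tolist()
        drive_follower(action)          # send predicted joint targets
```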

Kudos to us: we succeeded in performing two tasks: (1) putting an orange block (whose position can vary) into a fixed box, and (2) stacking three boxes (far more complex). These experiments were done with only one camera, even though two would have been better. Again, start small and add complexity.

Phase 4bis - Training exercise using reinforcement learning

A promising approach based on reinforcement learning, called TD-MPC, is also implemented in a fork by Alexander Soare (here). The environment defines a reward mask so that the robot can learn to achieve the task through free exploration. Teleoperation helps during exploration by placing the object back in a valid position when needed, avoiding impossible situations.
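To illustrate what a reward mask can look like, here is a toy sketch (not the fork's actual reward code) that scores how much of a color-segmented object lies inside a goal region of the image; the hue thresholds and the goal mask are assumptions to tune for your own setup:

```python
import cv2
import numpy as np

def reward_from_mask(frame_bgr: np.ndarray, goal_mask: np.ndarray) -> float:
    """frame_bgr: HxWx3 image; goal_mask: HxW bool array marking the goal region."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    # crude orange-object segmentation by hue range (tune for your lighting)
    obj = cv2.inRange(hsv, (5, 100, 100), (20, 255, 255)) > 0
    if not obj.any():
        return 0.0   # object not visible: no reward signal
    # reward = fraction of object pixels inside the goal region
    return float((obj & goal_mask).sum() / obj.sum())
```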

Conclusion and perspectives

The possibilities of open-source AI robotics are tremendous. Closed-source AI robotics is also progressing fast: two weeks ago, Elon Musk shared his vision with the presentation of Optimus Gen 2, which could become our personal assistant within a few years. Let the competition begin.

I hope you have enjoyed this post. Feel free to join the Discord community and start exploring the AI robotics era with us.

Please reach out if you have any questions or comments.

Best,

Yannick Léo - Partner & AI Director at Emerton

