Tesla's Data Engine and what we should all learn from it
Andrej Karpathy is one of the few people in the AI world with a deep understanding of the challenges of developing and maintaining a scalable and reliable AI solution for one of the most complex real world problems: autonomous driving. Tim Cook has been quoted as saying: "We sort of see it [self driving cars] as the mother of all AI projects".
At Tesla, Andrej is pushing the envelope of deep learning by combining multi-task learning with a massive "data engine" that collects the rare examples which are the essence of addressing the long tail problem.
A few days ago, in this talk at the CVPR Workshop on Scalability in Autonomous Driving, he summarized the challenges Tesla and his team face and how they are tackling them. Most of what he discusses was already covered in previous talks, but if you are into autonomous driving and AI I strongly suggest you watch it.
There is a lot to be learned from Tesla's work in autonomous driving and in AI in general. I will break down the points for you:
The first 2 points are mainly related to autonomous driving, and might therefore be less relevant to your domain if you are not working in it, yet any company aiming to benefit from adopting AI should strive to replicate Tesla's approach for points 3-5. I'll get back to this at the end of this post.
Relying on HD maps and LIDAR is not scalable
There are several reasons why exploiting high resolution maps and LIDAR is not scalable. From an algorithmic perspective, having access to a precise 3D point cloud of the environment, scanned in advance, together with LIDAR on the vehicle aiming to drive autonomously makes it possible to localize the vehicle with centimetre accuracy. That might sound like a solid approach, but what happens when the road configuration has changed between the time the scan was done and the time the car drives through that location? This would require re-scanning each road periodically.
Furthermore, localization is only one of the challenges. From a perception point of view, recognizing other vehicles, pedestrians, and all the other long tail situations (such as a flying chair lost from a truck) would in any case have to be addressed by analyzing images. Thus, starting from LIDAR only postpones tackling the bigger challenge.
The real world is complex, diverse, evolving and long tailed
Full autonomous driving requires solving a long series of tasks, among which: accurately and reliably detecting the road and road markings, establishing the position of the vehicle on the road, detecting other vehicles, pedestrians and any other objects on the road, and, last but not least, detecting traffic signs.
As an example, consider detecting speed limit and stop signs. If your background is in machine learning, your first intuition might be that a modern Deep Neural Network should easily tackle the challenge. After all, traffic signs are rigid, planar objects with standard, convex shapes and no holes, designed to be high contrast and easily recognizable. This sounds like one of the easiest object detection tasks to solve.
Not so fast. The reality is, as usual, much harder. There are two main challenges:
Here are four examples of the extremely long tail of cases fully autonomous driving vehicles have to cope with: a chair flying off a pickup truck, a dog running next to a car, a completely mirrored truck, and cones lying on the street - which Andrej mentioned were recognized as red traffic lights.
Tesla summarized this in the following slide at Autonomy Day in April 2019.
Operation Vacation: Tesla's AI approach
In what Andrej has for some time called "Operation Vacation", he pushes his engineering team to focus on building the generic AI infrastructure to efficiently collect data, label it, and train and reliably test models, so that the task of updating models to detect new objects can be handled by a separate team of product managers and labelers. This keeps the AI team at Tesla nimble and efficient - the joke being that at some point the team could go on vacation and the system would keep improving without any further effort.
One of the fundamental requirements for this approach to work is the concept of data unit tests for the machine learning models: curated sets of examples on which previous models failed and which every new model must pass. Performance on these unit tests can never regress; only improvements are accepted before a new model is released into production.
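To make this concrete, below is a minimal sketch of what such a data unit test could look like in Python. The `UnitTestCase` structure, the labels and the `predict` callable are hypothetical names of my own; Tesla has not published its internal tooling.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class UnitTestCase:
    image_path: str        # frame on which a previous model failed
    expected_label: str    # e.g. "stop_sign"

def failing_cases(cases: List[UnitTestCase],
                  predict: Callable[[str], str]) -> List[UnitTestCase]:
    """Return the curated cases the candidate model still gets wrong."""
    return [c for c in cases if predict(c.image_path) != c.expected_label]

def assert_no_regression(cases: List[UnitTestCase],
                         predict: Callable[[str], str]) -> None:
    """Release gate: a new model is accepted only if no curated case regresses."""
    failures = failing_cases(cases, predict)
    assert not failures, f"{len(failures)} previously fixed cases regressed"
```

The important property is the hard gate at the end: a candidate model that fails even a single curated example is rejected, which is what prevents previously fixed behaviour from silently regressing.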
Tesla Data Engine
At the core of Operation Vacation is what Andrej calls Data Engine, shown below again from Andrej's presentation at Tesla Autonomy Day.
The goal of the Data Engine is to ensure data is collected in the most efficient manner, in order to cover the extremely long tail of examples required for models to perform reliably in the real, unconstrained world. The core principle of the data engine is very simple:
We discussed data unit tests above; steps 6 and 7 are equally important. Given the huge number of miles driven each day by Tesla vehicles - more on that in a second - how can the Data Engine ensure the labeling team won't be overwhelmed by false positives? Andrej mentions a few approaches in this talk, while admitting that no method works perfectly: flickering of a detection in the deployed model, neural network uncertainty from a Bayesian perspective, the sudden appearance of a detection, and discrepancy between a detection and what is expected given map information.
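To illustrate just one of these triggers, here is a minimal sketch of flickering detection, under the assumption that per-frame detections for a given object class are available as booleans; the function name and frame representation are my own, not Tesla's.

```python
from typing import List, Sequence

def flickering_frames(detections: Sequence[bool], window: int = 3) -> List[int]:
    """Flag frame indices where a detection disappears and reappears within
    `window` frames - a hint that the deployed model is unsure and the
    surrounding clip may be worth sending to the labeling team."""
    flagged = []
    for i in range(1, len(detections) - 1):
        if not detections[i]:
            before = any(detections[max(0, i - window):i])  # detected shortly before
            after = any(detections[i + 1:i + 1 + window])   # detected shortly after
            if before and after:
                flagged.append(i)
    return flagged

# Example: a stop sign detected, lost for one frame, then detected again.
print(flickering_frames([True, True, False, True, True]))  # -> [2]
```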
Another approach Tesla has been using to query potentially relevant examples is investigating all Autopilot disengagements: each time a driver decides to disengage Autopilot, there is a high likelihood that the model was performing poorly. The Data Engine can then fetch the most relevant examples out of all those cases, allowing the labeling team to focus on the most critical improvements.
Below is an example of the type of data collected by the Data Engine after a request to retrieve more stop signs obstructed by foliage.
To confirm the relevance of this approach to Tesla, Karpathy filed a patent application on this very subject.
Tesla's data advantage
The principle at the core of the Data Engine is not unique to Tesla: it is inspired by Active Learning, which has been a hot research topic for years. The competitive advantage Tesla has is the unmatched scale of its data collection.
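For reference, the classic Active Learning recipe this builds on is uncertainty sampling: score unlabeled examples by how unsure the model is about them and send only the top fraction to labelers. Below is a generic sketch, not Tesla's implementation; the per-class probabilities are assumed to come from whatever model is currently deployed.

```python
import math
from typing import Dict, List, Tuple

def entropy(probs: List[float]) -> float:
    """Predictive entropy: higher means the model is less sure."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_for_labeling(scored: Dict[str, List[float]], budget: int) -> List[str]:
    """Pick the `budget` unlabeled examples the model is most uncertain about."""
    ranked: List[Tuple[float, str]] = sorted(
        ((entropy(probs), example_id) for example_id, probs in scored.items()),
        reverse=True,
    )
    return [example_id for _, example_id in ranked[:budget]]

# Example: three frames with per-class probabilities from the deployed model.
scores = {
    "frame_001": [0.98, 0.01, 0.01],   # confident -> low priority
    "frame_002": [0.40, 0.35, 0.25],   # uncertain -> label this one first
    "frame_003": [0.70, 0.20, 0.10],
}
print(select_for_labeling(scores, budget=1))  # -> ['frame_002']
```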
Here is an estimate of Tesla Autopilot miles from Lex Fridman, which shows Tesla has collected more than 3 billion miles on Autopilot. As a comparison, Google's Waymo recently announced it has collected 20 million miles since its inception in 2009. Tesla is currently leading by a factor of at least 100.
Not only is Tesla's current lead in the amount of data it has collected huge, the lead is likely to expand at an even faster rate. The reason? As George Hotz from Comma.ai very clearly explained during this Tesla Third Row interview:
if you want to add a new car to your network how much money does this cost? It costs Waymo more like, you know, $400,000, it costs Comma negative $1,000 and it costs Tesla negative $10,000
- George Hotz, Comma.ai -
Indeed, Waymo's vehicles are very expensive due to their complex set of LIDAR sensors, and on top of that Waymo has to pay engineers to supervise each vehicle, as it is not yet allowed to let the vehicles drive without human supervision. Tesla, on the other hand, makes around 20% gross profit on each vehicle it sells, and its customers collect miles without Tesla having to pay them (obviously there are costs involved in the infrastructure Tesla needs to maintain to store, label and process data, but those are similar for any company working on autonomous driving).
Is this it?
So, is this all Tesla is doing in AI? Obviously not. There are plenty of additional directions in which Tesla is pushing the current state of the art:
What can we all learn from it?
As promised at the beginning of the post, let's now look at what can be learned from Tesla's approach to tackling autonomous driving. In particular, I will look at the question from the perspective of any company striving to be successful in applying AI.
the only sure, certain way I have seen of making progress on any task is: you curate the dataset that is clean and varied, you grow it, and you pay the labeling cost - and I know that works
Feel free to reach out if you are interested in this approach and would like to share experiences in trying to replicate it.