How Computer Vision Works for Self-Driving Cars
I'm writing a series for Backline on How Self-Driving Cars Work! Here's my first post, "How Computer Vision Works for Self-Driving Cars":
Recently I gave a TEDx talk on How Self-Driving Cars Work.
Over the next five weeks, I'll break down in more detail the role each core function plays in a self-driving car.
We'll start with computer vision, which is how self-driving cars use cameras to see the world around them.
Computer Vision
Cameras are present on every self-driving car, and often in large numbers. Tesla, for example, equips its cars with "eight surround cameras [that] provide 360 degrees of visibility around the car at up to 250 meters of range."
Cameras are key to a variety of essential tasks: lane finding, road curvature estimation, obstacle detection and classification, traffic sign detection and classification, traffic light detection and classification, and more.
In fact, in the world of autonomous vehicles, computer vision is often referred to as "perception", because cameras are the primary (but not the only) tool that a vehicle uses to perceive its environment.
Detection and Classification
Several camera tasks take the form of "detection and classification". Both are necessary in order to understand the environment. Look at how many objects are both detected and classified in this demo video for the YOLO v2 neural network.
The computer has to both find where objects are in a camera image ("detection", or sometimes "localization") and also determine what they are ("classification"). And the computer has to do this fast enough to hand off the results to the rest of the driving system, so other components of the system can use the data to make decisions.
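To make the two halves concrete, here is a minimal sketch of what a detector might hand off for one camera frame. The `Detection` record, the `confident_detections` helper, the 0.5 threshold, and the sample values are all hypothetical; they are not taken from YOLO or from any particular driving stack.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    label: str         # classification: what the object is
    confidence: float  # model score in [0, 1]
    box: tuple         # detection: (x_min, y_min, x_max, y_max) in pixels


def confident_detections(detections, min_confidence=0.5):
    """Keep only the detections the model is reasonably sure about."""
    return [d for d in detections if d.confidence >= min_confidence]


# Hypothetical output for a single frame.
frame_detections = [
    Detection("car", 0.92, (100, 120, 260, 220)),
    Detection("stop sign", 0.81, (400, 60, 440, 100)),
    Detection("pedestrian", 0.30, (500, 130, 520, 190)),
]

# Only the confident results are passed to the rest of the system.
handoff = confident_detections(frame_detections)
labels = [d.label for d in handoff]
```

Each record answers both questions at once: the box says where the object is, and the label says what it is.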
Deep Learning
In the last few years, deep neural networks have emerged as the dominant approach to working with camera video and images. These neural networks learn from data. For example, in order to teach a deep neural network what a stop sign looks like, we feed it thousands of stop sign images, and it gradually "learns".
In contrast, more traditional approaches to computer vision focus on color spaces, gradients and edges in the image, regions of interest within the image, and other machine learning techniques to extract intermediate "features" from the image.
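As an illustration of the traditional, hand-built style of feature, here is a rough sketch of a gradient-based edge detector. It uses the Sobel operator, which is one common choice and my own pick for this example; the tiny synthetic image and the function name are assumptions.

```python
import math


def sobel_magnitude(image):
    """Gradient magnitude for the interior pixels of a 2D grayscale image.

    `image` is a list of rows of pixel intensities; border pixels stay 0.
    """
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal gradient kernel
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical gradient kernel
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = sum(kx[j][i] * image[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(ky[j][i] * image[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            out[y][x] = math.hypot(gx, gy)
    return out


# Synthetic 5x6 image: dark on the left, bright on the right.
image = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
edges = sobel_magnitude(image)
# The response is large only in the columns where dark meets bright.
```

Nothing here is learned from data: the kernels are fixed by hand, which is the key difference from a neural network.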
Deep neural networks have taken off in large part due to advances in GPU hardware. Graphics processing units (GPUs) are optimized for performing many computations at once, whereas traditional CPUs are optimized for performing one computation as quickly as possible. GPUs are very good at updating all the pixels on your monitor, and they're also very good at updating the layers of artificial neurons that make up deep neural networks.
Advances in both processors and neural network architectures have led hardware companies like NVIDIA to become key players in the autonomous vehicle ecosystem.
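To see why a layer of neurons suits this kind of hardware, consider a minimal sketch of one fully connected layer. The weights, biases, and inputs are made-up numbers, and the plain Python loop runs sequentially; the point is only that no neuron's output depends on any other neuron in the same layer, so they could all be computed at the same time.

```python
def dense_layer(inputs, weights, biases):
    """One fully connected layer with a ReLU activation.

    Each iteration computes one neuron and reads only the shared inputs,
    so the iterations are independent of each other.
    """
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(max(0.0, total))  # ReLU: clamp negatives to zero
    return outputs


# Hypothetical values: 3 inputs feeding 2 neurons.
inputs = [1.0, 2.0, 3.0]
weights = [[0.5, -1.0, 0.5], [1.0, 1.0, 1.0]]
biases = [0.0, -1.0]
activations = dense_layer(inputs, weights, biases)
```

A real network has many thousands of neurons per layer, which is where doing many computations at once pays off.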
Weaknesses
Cameras have many strengths, in particular resolution and cost. Of course, they also have weaknesses. In particular, cameras are not good at estimating the distance, height, and velocity of other objects. Stereo camera systems help in this area, and interesting work is being done on training neural networks to estimate these metrics. That said, radar and lidar remain the dominant sensors for measuring these quantities.
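For a sense of how a stereo pair estimates distance, here is a small sketch of the standard pinhole stereo relation, depth = focal length x baseline / disparity. The focal length, baseline, and disparity values are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def stereo_depth_m(focal_length_px, baseline_m, disparity_px):
    """Depth in meters from the disparity between two rectified cameras."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px


# Assumed rig: 1000-pixel focal length, cameras mounted 0.5 m apart.
depth = stereo_depth_m(1000, 0.5, 10)            # object seen 10 px apart
depth_off_by_one = stereo_depth_m(1000, 0.5, 9)  # same object, 1 px matching error
error_m = depth_off_by_one - depth
```

With these numbers, a single pixel of matching error at a 10-pixel disparity shifts the estimate from 50 m to roughly 55.6 m, which hints at why camera-based range estimates get less reliable for distant objects.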
The Pipeline
Computer vision is just the first step in the autonomous vehicle data pipeline. The car incorporates data from its many cameras, identifies important elements of its surrounding environment, and fuses these elements with data from radar and lidar.
This merging of data is the domain of sensor fusion, which we’ll cover next week.