How Computer Vision Works for Self-Driving Cars

I'm writing a series for Backline on How Self-Driving Cars Work! Here's my first post, "How Computer Vision Works for Self-Driving Cars":

Recently I gave a TEDx talk on How Self-Driving Cars Work.

Over the next five weeks, I'll break down in more detail the role each of these functions plays in a self-driving car.

We'll start with computer vision, which is how self-driving cars use cameras to see the world around them.

Computer Vision

Cameras are present on every self-driving car, and often in large numbers. Tesla, for example, equips its cars with "eight surround cameras [that] provide 360 degrees of visibility around the car at up to 250 meters of range."


Cameras are key to a variety of essential tasks: lane finding, road curvature estimation, obstacle detection and classification, traffic sign detection and classification, traffic light detection and classification, and more.

In fact, in the world of autonomous vehicles, computer vision is often referred to as "perception", because cameras are the primary (but not the only) tool that a vehicle uses to perceive its environment.

Detection and Classification

Several camera tasks take the form of "detection and classification". Both are necessary in order to understand the environment. Look at how many objects are both detected and classified in this demo video for the YOLO v2 neural network.

The computer has to both find where objects are in a camera image ("detection", or sometimes "localization") and also determine what they are ("classification"). And the computer has to do this fast enough to hand off the results to the rest of the driving system, so other components of the system can use the data to make decisions.
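To make the split between the two tasks concrete, here is a minimal sketch of what a detector's output might look like and how overlapping results are commonly cleaned up. The `Detection` class, the sample boxes, and the 0.5 threshold are all assumptions for illustration, not the YOLO code itself. Each result pairs a box (detection) with a label and confidence (classification), and non-maximum suppression, a common post-processing step, keeps only the highest-scoring box when several boxes of the same class cover the same object.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    label: str    # classification: what the object is
    score: float  # confidence in that classification, 0.0 to 1.0
    box: tuple    # detection: (x_min, y_min, x_max, y_max) in pixels


def iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(detections, iou_threshold=0.5):
    """Keep the best-scoring box among same-label boxes that overlap heavily."""
    kept = []
    for det in sorted(detections, key=lambda d: d.score, reverse=True):
        duplicate = any(
            det.label == k.label and iou(det.box, k.box) >= iou_threshold
            for k in kept
        )
        if not duplicate:
            kept.append(det)
    return kept


# Hypothetical raw output for one camera frame: two boxes on the same car.
raw = [
    Detection("car", 0.9, (100, 100, 200, 200)),
    Detection("car", 0.6, (110, 105, 205, 210)),
    Detection("stop sign", 0.8, (300, 50, 340, 90)),
]
final = non_max_suppression(raw)
for det in final:
    print(det.label, det.score, det.box)
```

The rest of the driving system would then consume something like `final`: a short list of labeled boxes rather than raw pixels.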

Deep Learning

In the last few years, deep neural networks have emerged as the dominant approach to working with camera video and images. These neural networks learn from data. For example, in order to teach a deep neural network what a stop sign looks like, we feed it thousands of stop sign images, and it gradually "learns".

In contrast, more traditional approaches to computer vision focus on color spaces, gradients and edges in the image, regions of interest within the image, and other machine learning techniques to extract intermediate "features" from the image.
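As a small illustration of the traditional, hand-engineered style, here is a sketch of one classic gradient feature: the Sobel operator, which measures how sharply brightness changes around each pixel. The tiny test image and the function name are assumptions made up for this example, and a real pipeline would use an optimized image library rather than plain Python loops.

```python
import math

# Sobel kernels: KX responds to horizontal brightness changes, KY to vertical.
KX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
KY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(image):
    """Gradient magnitude of a 2D grayscale image (list of rows).

    Border pixels are left at 0.0 because the 3x3 window does not fit there.
    """
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for j in range(3):
                for i in range(3):
                    pixel = image[y + j - 1][x + i - 1]
                    gx += KX[j][i] * pixel
                    gy += KY[j][i] * pixel
            out[y][x] = math.hypot(gx, gy)
    return out


# Hypothetical 5x6 image: dark on the left, bright on the right,
# a crude stand-in for the edge of a painted lane line.
image = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
edges = sobel_magnitude(image)
for row in edges:
    print([round(value) for value in row])
```

The response is large only in the two columns that straddle the dark-to-bright boundary. A traditional lane finder would threshold a map like this by hand, whereas a deep network learns its own filters from labeled examples.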

Deep neural networks have taken off in large part due to advances in GPU hardware. Graphics processing units (GPUs) are optimized for performing many computations at once, whereas traditional CPUs are optimized for performing one computation as quickly as possible. GPUs are very good at updating all the pixels on your monitor, and they're also very good at updating the layers of artificial neurons that make up deep neural networks.
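To see why a layer of neurons suits hardware that does many things at once, consider this minimal sketch of a single fully connected layer in plain Python. The weights, biases, and inputs are made-up numbers, and the code runs sequentially on a CPU; the point is only that each neuron's output depends on the shared inputs and its own weights, never on another neuron in the same layer.

```python
def neuron_output(inputs, weights, bias):
    """One artificial neuron: weighted sum plus bias, then ReLU activation."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return max(0.0, total)


def dense_layer(inputs, weight_rows, biases):
    """A layer is just many independent neurons reading the same inputs."""
    return [neuron_output(inputs, w, b) for w, b in zip(weight_rows, biases)]


# Hypothetical values for a layer with 2 inputs and 3 neurons.
inputs = [1.0, 2.0]
weight_rows = [[0.5, -1.0], [1.0, 1.0], [0.0, 0.0]]
biases = [0.0, 0.5, -1.0]

outputs = dense_layer(inputs, weight_rows, biases)
print(outputs)

# Computing the neurons in a different order gives the same values,
# which is what lets a GPU work on all of them simultaneously.
reversed_outputs = [
    neuron_output(inputs, w, b)
    for w, b in reversed(list(zip(weight_rows, biases)))
]
print(reversed_outputs[::-1] == outputs)
```

A real network has layers with thousands of neurons applied to millions of pixels, so this independence adds up to a very large amount of work that can proceed in parallel.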

Advances in both processors and deep neural network architectures have led hardware companies like NVIDIA to become key players in the autonomous vehicle ecosystem.

Weaknesses

Cameras have many strengths, in particular resolution and cost. Of course, they also have weaknesses. In particular, cameras are not good at estimating the distance, height, and velocity of other objects. Stereo camera systems help in this area, and interesting work is being done on training neural networks to estimate these quantities. That said, radar and lidar remain the dominant sensors for measuring them.
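The stereo idea can be sketched with the standard pinhole relationship for two parallel, rectified cameras: depth equals focal length times baseline divided by disparity, where disparity is how far the same object shifts between the left and right images. The focal length, baseline, and disparity values below are assumptions chosen for easy arithmetic, not the specifications of any real car.

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Depth in meters for a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical rig: 1000-pixel focal length, cameras 0.5 m apart.
FOCAL_LENGTH_PX = 1000.0
BASELINE_M = 0.5

near = depth_from_disparity(FOCAL_LENGTH_PX, BASELINE_M, 20.0)
far = depth_from_disparity(FOCAL_LENGTH_PX, BASELINE_M, 5.0)
far_off_by_one = depth_from_disparity(FOCAL_LENGTH_PX, BASELINE_M, 4.0)

print(near)            # object with a 20-pixel shift
print(far)             # object with a 5-pixel shift
print(far_off_by_one)  # same far object if matching is off by one pixel
```

Because depth varies with the inverse of disparity, a one-pixel matching error barely matters up close but moves a distant object by tens of meters in this example. That sensitivity is one reason cameras alone struggle with range.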

The Pipeline

Computer vision is just the first step in the autonomous vehicle data pipeline. The car incorporates data from its many cameras, identifies important elements of its surrounding environment, and fuses these elements with data from radar and lidar.

This merging of data is the domain of sensor fusion, which we’ll cover next week.

