Machine Learning Fundamentals for Self-Driving Cars
Credit: Ricard Zuccolo, Vimeo

Sebastian Thrun, who in addition to being my boss has as good a claim as anyone to being the father of the self-driving car, likes to say that perception is 80% of the challenge of building self-driving cars.

Fortunately, perception accuracy and speed have increased dramatically in recent years, largely thanks to deep learning. Deep neural networks (which are synonymous with "deep learning") have transformed our ability to work with camera data, and they have the potential to transform work on other parts of the self-driving car stack as well.

Over the coming weeks, I'll be covering different aspects of deep learning. Today I'll start with the fundamentals of machine learning. After that, I'll write about deep neural networks, convolutional neural networks, deep learning frameworks, transfer learning, reinforcement learning, and possibly a few more topics.

As with my previous Back\Line posts, these will be high-level and conceptual. If you are interested in learning how to actually build deep neural networks, I might modestly suggest Udacity's Self-Driving Car Engineer Nanodegree Program, or Udacity's School of Artificial Intelligence.

And please subscribe to Back\Line to keep up with the posts there!

Taxonomy

Deep learning is a type of machine learning, which is, in turn, a type of artificial intelligence.

Artificial Intelligence

Starting at the top, artificial intelligence describes "agents" that follow the perception-action cycle. These agents are often computer algorithms at their core. They perceive the environment around them, often plan how to best reach their goals, and then act to achieve those goals.

For example, imagine an agent whose goal is to determine whether an image is of a stop sign, or not. The agent might follow a simple algorithm: if the image contains a red background with white text, classify that image as a stop sign. That's not the most sophisticated algorithm, and it won't be right 100% of the time. But it's an agent that is following the perception-action cycle, and thus it demonstrates artificial intelligence.
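To make that concrete, here is a minimal sketch of such a rule-based agent. The image format (rows of RGB tuples), the color cutoffs, and the function names are all assumptions I made up for illustration; a real system would need something far more robust.

```python
# Hypothetical rule-based agent. An image is a list of rows of (r, g, b) pixels.
# All thresholds below are illustrative guesses, not tuned values.

def is_red(pixel):
    r, g, b = pixel
    return r > 150 and g < 80 and b < 80

def is_white(pixel):
    r, g, b = pixel
    return r > 200 and g > 200 and b > 200

def classify_stop_sign(image, red_threshold=0.5, white_threshold=0.05):
    """Perceive the pixels, then act by returning True (stop sign) or False."""
    pixels = [p for row in image for p in row]
    red_fraction = sum(is_red(p) for p in pixels) / len(pixels)
    white_fraction = sum(is_white(p) for p in pixels) / len(pixels)
    # The hand-written rule: mostly red, with at least a little white.
    return red_fraction >= red_threshold and white_fraction >= white_threshold

RED, WHITE, GREEN = (200, 20, 20), (255, 255, 255), (20, 160, 40)
sign_like = [[RED, RED, RED], [RED, WHITE, RED], [RED, RED, RED]]
foliage = [[GREEN, GREEN, GREEN], [GREEN, GREEN, GREEN], [GREEN, GREEN, GREEN]]

print(classify_stop_sign(sign_like))  # True
print(classify_stop_sign(foliage))    # False
```

Notice that every number in the rule was chosen by hand. Nothing here is learned from data.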

Machine Learning

Within the broad umbrella of artificial intelligence lies machine learning. Machine learning is a class of algorithms that learn from data to achieve artificial intelligence.

Let's revisit our stop sign classification agent. Imagine that, rather than looking for a red background with white text, it learns from a giant collection of images. Some of those images are of stop signs, and some aren't, and over time the agent just learns what a stop sign looks like.

This is a little bit like how a human learns to perceive the environment. We see lots of things and we build up an intuition over time. Notice that we haven't specified how the agent learns to distinguish stop signs from other images - there are many different algorithms it could be using for that, all of which are "machine learning".
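As one toy illustration of "learning from data", the sketch below learns a single threshold from labeled examples instead of having it hard-coded. The feature (the fraction of red pixels), the tiny dataset, and the midpoint-of-means rule are all assumptions of mine; this is just one of the many possible learning algorithms, and a very crude one.

```python
# Hypothetical learner: each example is (red_fraction, is_stop_sign).
# It assumes stop signs tend to have a higher red fraction than other images.

def train(examples):
    """Learn a threshold halfway between the mean feature value of each class."""
    positives = [x for x, label in examples if label]
    negatives = [x for x, label in examples if not label]
    mean_positive = sum(positives) / len(positives)
    mean_negative = sum(negatives) / len(negatives)
    return (mean_positive + mean_negative) / 2

def predict(threshold, red_fraction):
    """Classify a new image by comparing its feature to the learned threshold."""
    return red_fraction > threshold

training_data = [
    (0.9, True), (0.8, True), (0.7, True),     # stop signs
    (0.1, False), (0.2, False), (0.3, False),  # everything else
]

threshold = train(training_data)
print(predict(threshold, 0.75))  # True
print(predict(threshold, 0.15))  # False
```

The difference from a hand-written rule is that the threshold comes out of the data: give the agent different examples, and it arrives at a different decision boundary.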

Deep Learning

Deep learning is a type of machine learning that uses a specific tool, called a neural network, to learn from data.

Neural networks contain layers of "artificial neurons", each of which is connected to other artificial neurons in other layers of the network. Each neuron takes input from part of the network, performs its own calculations, and passes those results on to other parts of the network.
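Here is a minimal sketch of that structure in plain Python. I'm assuming each neuron computes a weighted sum of its inputs and passes it through a sigmoid, which is one common choice but not the only one. The weights are arbitrary numbers picked for illustration, not trained values.

```python
import math

def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum of inputs, then a sigmoid activation."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))

def layer(inputs, weight_rows, biases):
    """A layer is a list of neurons that all read the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]

def forward(inputs, layers):
    """Pass the outputs of each layer on as the inputs of the next."""
    activations = inputs
    for weight_rows, biases in layers:
        activations = layer(activations, weight_rows, biases)
    return activations

# Hypothetical network: 3 inputs -> 2 hidden neurons -> 1 output neuron.
network = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),
    ([[1.0, -1.0]], [0.0]),
]

output = forward([1.0, 0.5, -1.0], network)
print(output)  # a single value between 0 and 1
```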

Neural networks (sometimes called deep neural networks) are just one of many approaches to machine learning, but they've become critically important in the last six years. They work very well on modern parallel computing chips, especially graphics processing units (GPUs). GPUs were originally designed to output images to computer monitors.

Roughly speaking, if you think about pixels on a screen, they're all doing much the same thing at the same time, just with slightly different values. Similarly, the artificial neurons in a layer of a neural network are all doing much the same thing at the same time, just with slightly different values. One of the happy coincidences of engineering :-)

Goals

Machine learning agents have different types of outputs. There are four types of outputs that are particularly important for self-driving cars: regression, classification, localization, and segmentation.

Regression

The canonical output for a machine learning agent is a number. This number might be how far away a pedestrian is, or how hard to press the accelerator on a car, or it might be one of the coefficients of a third-order polynomial that describes a lane line on the road.
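To show how a handful of regressed numbers can describe a lane line, here is a small sketch that evaluates a third-order polynomial. The coefficient values, the units (meters), and the function name are hypothetical; in practice the coefficients would come from the regression agent.

```python
def lane_offset(coefficients, distance_ahead):
    """Evaluate c3*d^3 + c2*d^2 + c1*d + c0 using Horner's method.

    coefficients are ordered from highest power to lowest: [c3, c2, c1, c0].
    """
    result = 0.0
    for c in coefficients:
        result = result * distance_ahead + c
    return result

# Hypothetical regression output: lateral position of the lane line (meters)
# as a function of distance ahead of the car (meters).
coefficients = [0.0001, -0.002, 0.05, 1.8]

for d in (0.0, 10.0, 20.0):
    print(d, round(lane_offset(coefficients, d), 3))
```

Four continuous numbers are enough to describe a gently curving line, which is why regression is a natural fit for this kind of output.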

Classification

Some machine learning agents output discrete classes, instead of continuous numbers. For example, an agent might classify whether a traffic sign is a stop sign, yield sign, speed limit sign, or any one of tens or hundreds of other types of street signs.
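One common way to turn raw scores into a discrete class is to apply a softmax and pick the most probable entry. The sketch below assumes the agent has already produced one score per class; the class names and score values are made up for illustration.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(scores, class_names):
    """Return the most probable class name and its probability."""
    probabilities = softmax(scores)
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return class_names[best], probabilities[best]

# Hypothetical classes and scores for a single traffic sign image.
CLASSES = ["stop", "yield", "speed_limit"]
label, confidence = classify([2.0, 0.5, -1.0], CLASSES)
print(label, round(confidence, 3))
```

Unlike regression, the answer here is one item from a fixed list, along with a probability that says how confident the agent is.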

Localization

A different type of network might localize objects within an image. For example, before a network can classify what type of traffic sign appears in an image, it must first identify where in the image that traffic sign is, whether a traffic sign appears at all, and whether there are multiple traffic signs in an image.

Localization agents typically output the coordinates of objects within an image, so in some ways they resemble regression networks. But their purpose is sufficiently distinct to think about them as their own class of agents.
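To illustrate what "coordinates of objects" looks like as an output, here is a toy sketch that returns a bounding box. It is not a learned localizer: it assumes someone has already marked which cells of a tiny grid belong to a sign, and it only handles a single object. The mask and the function name are hypothetical.

```python
def locate(mask):
    """Return (row_min, col_min, row_max, col_max) around cells equal to 1.

    Returns None when no cell is marked, i.e. no sign appears in the image.
    """
    hits = [(r, c) for r, row in enumerate(mask)
            for c, value in enumerate(row) if value == 1]
    if not hits:
        return None
    rows = [r for r, _ in hits]
    cols = [c for _, c in hits]
    return (min(rows), min(cols), max(rows), max(cols))

# Hypothetical 4x5 image where 1 marks pixels that belong to a traffic sign.
mask = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

print(locate(mask))                # (1, 2, 2, 3)
print(locate([[0, 0], [0, 0]]))    # None
```

The output is a few numbers, much like regression, but those numbers answer a "where" question instead of a "how much" question.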

Segmentation

Segmentation agents classify individual pixels within an image. Some pixels represent the road, others represent vehicles, others represent pedestrians, others free space, others the sky, and so forth. Classifying all of the pixels within an image helps us to understand where the free space is in the environment for us to drive.

Just like localization agents are a specialized type of regression agent, segmentation agents are a specialized type of classification agent. Instead of outputting a single class for the whole image, each pixel in the image gets its own class.
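The per-pixel idea can be sketched as a classification step repeated for every pixel. Here I assume the agent has already produced a list of class scores for each pixel; the three class names and the 2x2 score map are invented for illustration.

```python
# Hypothetical class list for this toy example.
CLASSES = ["road", "vehicle", "sky"]

def segment(score_map):
    """score_map[r][c] is a list of per-class scores; return one label per pixel."""
    return [
        [CLASSES[max(range(len(scores)), key=scores.__getitem__)] for scores in row]
        for row in score_map
    ]

def fraction_of(label_map, name):
    """Fraction of pixels in the image that were assigned the given class."""
    labels = [label for row in label_map for label in row]
    return labels.count(name) / len(labels)

# Hypothetical 2x2 image with three class scores per pixel.
score_map = [
    [[0.1, 0.2, 0.7], [0.1, 0.1, 0.8]],
    [[0.9, 0.05, 0.05], [0.3, 0.6, 0.1]],
]

labels = segment(score_map)
print(labels)                       # [['sky', 'sky'], ['road', 'vehicle']]
print(fraction_of(labels, "road"))  # 0.25
```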

Coming Up

I've only just scratched the surface of machine learning here. Next week I'll return to machine learning fundamentals and describe the process of training, validating, and testing a machine learning agent. After that, we'll move on to deep neural networks, and look at how they've revolutionized self-driving cars.

Subscribe to Self-Driving Cars on Back\Line to follow along, and check back in next week!
