Deep learning--CNN: Edge detection
Computer vision is one of the areas advancing rapidly thanks to deep learning. Deep learning now helps self-driving cars figure out where the other cars and pedestrians around them are, so they can avoid them. It has made face recognition much better than ever before: you can unlock a phone, or even a door, using just your face. The convolutional neural network (CNN) is the deep learning technique that drives much of this progress in computer vision.
The very first step of CNN processing on an image is to learn low-level features of the image, such as edges. In this article we talk about edge detection. Say we have a 6x6 grayscale image and we want to detect edges in it, for example vertical edges. What we can do is construct a 3x3 matrix, called a filter (or kernel), and convolve the 6x6 image with the 3x3 filter to obtain a 4x4 image as the output. The upper-left element of this 4x4 matrix is computed by placing the 3x3 filter on top of the upper-left 3x3 region of the image, taking the element-wise products, and adding up the resulting 9 numbers. To get the second element in the first row of the 4x4 matrix, you shift the filter one step to the right and again add up the element-wise products. By repeating this shift-and-sum-of-products operation you can compute the whole first row of the 4x4 matrix. For the second row, you move the filter one step down and once more take the element-wise products and sum them up.
So you shift right to compute the next element in a row, shift down to start the first element of the next row, and so on, until you have filled in the whole 4x4 output matrix.
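To make the sliding mechanics concrete, here is a minimal numpy sketch of the operation described above. The name conv2d_valid is my own, and one caveat: the "convolution" used in deep learning does not flip the kernel, so strictly speaking it is cross-correlation.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image; at each position take the
    element-wise product with the covered region and sum the results."""
    H, W = image.shape
    h, w = kernel.shape
    out = np.zeros((H - h + 1, W - w + 1))   # 6x6 image, 3x3 kernel -> 4x4
    for i in range(out.shape[0]):            # move down one row at a time
        for j in range(out.shape[1]):        # shift right one step at a time
            region = image[i:i + h, j:j + w]
            out[i, j] = np.sum(region * kernel)
    return out
```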
To illustrate edge detection using the convolution operation, we'll use a simplified image. Say we have a 6x6 image where the left half is all 10s and the right half is all 0s. If you think of it as a picture, the 10s in the left half give you brighter pixel intensity values and the 0s in the right half give you darker pixel intensity values (here a shade of gray denotes the 0s), so there is clearly a very strong vertical edge right down the middle of the image as it transitions from white to gray. We also have a 3x3 filter which, visualized as a picture, has lighter pixels on the left, zeros in the middle column, and darker pixels on the right. When you convolve the image with this filter, you get a 4x4 output matrix. If you plot this output as an image, the lighter region is in the middle, and that corresponds to having detected the vertical edge down the middle of the input image. In case the dimensions seem a little off because the detected edge in the output looks really thick, that is only because we are working with a very small image in this example; if you use, say, a 1000x1000 image, you'll find that this does a pretty good job.
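Here is a sketch of this example. The array values follow the description above, and scipy.signal.correlate2d computes the same no-flip "convolution" that deep learning uses:

```python
import numpy as np
from scipy.signal import correlate2d

image = np.hstack([np.full((6, 3), 10.0), np.zeros((6, 3))])  # bright left, dark right
vertical_filter = np.array([[1, 0, -1],
                            [1, 0, -1],
                            [1, 0, -1]])

output = correlate2d(image, vertical_filter, mode='valid')
print(output)
# [[ 0. 30. 30.  0.]
#  [ 0. 30. 30.  0.]
#  [ 0. 30. 30.  0.]
#  [ 0. 30. 30.  0.]]   <- the lighter (30) region marks the vertical edge
```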
One intuition to take away from vertical edge detection is that, since we're using a 3x3 filter, a vertical edge is a 3x3 region with bright pixels on the left, dark pixels on the right, and where you don't care that much about what's in the middle. The middle of the input image is exactly where there are bright pixels on the left and dark pixels on the right, which is why the filter reports a vertical edge right down the middle. Convolution gives you a convenient way to specify how to find these vertical edges in an image.
Now, if the input image is flipped so that it is darker on the left and brighter on the right, the shade of the transition is reversed, and the -30s in the output show a vertical edge that transitions from dark to light rather than from light to dark.
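A quick check of this, reusing the same image and filter as above:

```python
import numpy as np
from scipy.signal import correlate2d

image = np.hstack([np.full((6, 3), 10.0), np.zeros((6, 3))])
vertical_filter = np.array([[1, 0, -1]] * 3)

# Flip the image left-to-right: now dark is on the left, bright on the right.
print(correlate2d(np.fliplr(image), vertical_filter, mode='valid'))
# the middle columns now come out as -30: a dark-to-light transition
```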
To find horizontal edges, you use a 3x3 filter that is bright on top and dark at the bottom. Say you have a 6x6 input image that looks like a checkerboard pattern because it is brighter in the upper-left and bottom-right corners. When you convolve this input image with the 3x3 horizontal edge detector, you end up with a 4x4 output image, where the 30s denote a positive edge, i.e., bright pixels on top and dark pixels on the bottom, whereas the -30s denote a negative edge.
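Here is the same kind of sketch for the horizontal case, with the checkerboard image and the bright-on-top filter described above:

```python
import numpy as np
from scipy.signal import correlate2d

bright, dark = np.full((3, 3), 10.0), np.zeros((3, 3))
image = np.block([[bright, dark],
                  [dark, bright]])   # checkerboard: bright upper-left and bottom-right

horizontal_filter = np.array([[ 1,  1,  1],
                              [ 0,  0,  0],
                              [-1, -1, -1]])

print(correlate2d(image, horizontal_filter, mode='valid'))
# [[  0.   0.   0.   0.]
#  [ 30.  10. -10. -30.]   <- 30s: positive (bright-on-top) horizontal edge
#  [ 30.  10. -10. -30.]   <- -30s: negative edge; the 10/-10 in between
#  [  0.   0.   0.   0.]]     come from working with such a tiny image
```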
The 3x3 edge detector we used is just one possible choice; historically, in the computer vision literature there was a fair amount of debate about which set of numbers is best. One alternative is the Sobel filter, which puts a little more weight on the central row, and this arguably makes it more robust; another is the Scharr filter, which uses a different set of numbers with slightly different properties.
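The vertical-edge versions of these two filters look like this:

```python
import numpy as np

# Sobel: extra weight on the central row makes it a bit more robust to noise.
sobel_vertical = np.array([[1, 0, -1],
                           [2, 0, -2],
                           [1, 0, -1]])

# Scharr: a different weighting with slightly different properties.
scharr_vertical = np.array([[ 3, 0,  -3],
                            [10, 0, -10],
                            [ 3, 0,  -3]])
```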
And with the rise of deep learning, one of the things we learned is that when you really want to detect edges in some complicated image, maybe you don't need to hand-pick the 9 numbers in the filter; you can instead treat them as parameters and learn them using backpropagation. The goal is to learn these 9 parameters so that when you take the input image and convolve it with your 3x3 filter, you get a good edge detector. Backpropagation can learn filters that capture the statistics of your data even better than any of these hand-coded filters, and rather than just vertical or horizontal edges, it might learn to detect edges at 45 degrees, or 70 degrees, or 73 degrees, or whatever orientation suits the data.
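As a rough sketch of that idea, here is a tiny numpy gradient-descent loop that recovers a vertical edge filter purely from input/target pairs. The setup (random training images, MSE loss, learning rate) is my own illustration, not a prescribed recipe:

```python
import numpy as np
from scipy.signal import correlate2d

rng = np.random.default_rng(0)

# Hand-coded filter used only to generate training targets; the learner
# sees input/target pairs and must recover the 9 numbers by itself.
true_filter = np.array([[1, 0, -1]] * 3, dtype=float)
images = rng.normal(size=(64, 6, 6))
targets = np.stack([correlate2d(x, true_filter, mode='valid') for x in images])

W = rng.normal(scale=0.1, size=(3, 3))   # the 9 learnable parameters
lr = 0.05
for step in range(300):
    preds = np.stack([correlate2d(x, W, mode='valid') for x in images])
    grad_out = 2 * (preds - targets) / preds.size        # dMSE/dOutput
    # The gradient w.r.t. the filter is itself a cross-correlation
    # of each input with the corresponding output gradient.
    grad_W = sum(correlate2d(x, g, mode='valid') for x, g in zip(images, grad_out))
    W -= lr * grad_W

print(np.round(W, 2))   # -> approximately [[1, 0, -1], [1, 0, -1], [1, 0, -1]]
```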
Chen Yang