Understanding convolutional neural networks from a maths perspective as an exception to multilayer perceptrons

On 30 Nov 2016, just before I launched the Oxford course in 2017, I purchased a (relatively) expensive book called Deep Learning.

I have read it extensively.

It's one of the few books that approach the subject of machine learning and deep learning from a maths perspective. I always recommend buying it, but there is also an official version available for free here (Deep Learning Book).

Maths allows you to understand ideas in a simple and elegant way.

Here is one example from the book.

It contains this sentence:

Convolutional networks are simply neural networks that use convolution in place of general matrix multiplication in at least one of their layers.

I really like this and have used it in my teaching, because you can use it to explain convolutional neural networks as an exception to multilayer perceptrons.

We have already seen that multilayer perceptrons can be explained in terms of general matrix multiplication: in traditional fully connected neural networks, also known as dense networks, each neuron in a layer is connected to every neuron in the previous layer through weights.

The computations in these networks are typically performed using general matrix multiplication, where the input data is represented as a vector or matrix, and matrix multiplications are used to propagate information through the network.
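This idea can be sketched in a few lines of NumPy. The layer sizes and values below are illustrative (not from the book): a dense layer is just a weight matrix applied to the input vector.

```python
import numpy as np

rng = np.random.default_rng(0)

x = rng.normal(size=4)        # input vector with 4 features
W = rng.normal(size=(3, 4))   # weight matrix: 3 output neurons,
                              # each connected to all 4 inputs
b = np.zeros(3)               # bias vector

# General matrix multiplication propagates the input through the layer:
# every output neuron mixes every input feature.
h = W @ x + b
print(h.shape)  # (3,)
```

Every entry of `W` is an independent, learnable connection, which is exactly what "fully connected" means.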

On the other hand, convolutional neural networks introduce the concept of convolution, which is a specialized operation that replaces general matrix multiplication in at least one of their layers. Specifically, convolution is used in the convolutional layers of CNNs.

Unlike fully connected neural networks where each neuron is connected to all neurons in the previous layer, CNNs employ local connectivity. In convolutional layers, each neuron is only connected to a small receptive field of the input data, defined by the size of the kernel. This local connectivity allows the network to capture local patterns and spatial relationships.
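A naive sketch makes the local connectivity concrete. The helper below is illustrative, not from the book; note that, as in most deep learning frameworks, it actually computes cross-correlation, which is conventionally called "convolution" in this context.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Naive 2D 'valid' convolution (cross-correlation) for illustration."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Only the (kh x kw) patch under the kernel contributes:
            # this patch is the output neuron's receptive field.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(25.0).reshape(5, 5)   # toy 5x5 "image"
kernel = np.ones((3, 3)) / 9.0          # simple 3x3 averaging kernel
result = conv2d_valid(image, kernel)
print(result.shape)  # (3, 3)
```

Unlike the dense layer, each output value here depends on only nine input values, and the same nine kernel weights are reused at every position (weight sharing).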

By utilizing convolution instead of general matrix multiplication, CNNs leverage the spatial structure of the input data, such as images or sequences, more effectively.

In summary, the statement highlights that convolutional networks, or CNNs, are a specific type of neural network that employ convolution operations instead of general matrix multiplication in at least one layer. This choice of convolution allows CNNs to effectively handle spatially structured data and extract relevant features from it.
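You can also see the relationship the other way round: convolution is a special, constrained case of matrix multiplication. The 1D example below (illustrative values, not from the book) shows that a "valid" convolution equals multiplication by a sparse, weight-sharing matrix whose rows are shifted copies of the kernel.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # 1D input signal
k = np.array([1.0, 0.0, -1.0])            # kernel of size 3

# Direct "valid" convolution (cross-correlation, as frameworks use it):
direct = np.array([np.dot(x[i:i + 3], k) for i in range(3)])

# The same operation as a matrix: each row holds the kernel, shifted
# by one position. Most entries are zero (local connectivity) and the
# nonzero entries repeat (weight sharing).
W = np.array([
    [1.0, 0.0, -1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, -1.0, 0.0],
    [0.0, 0.0, 1.0, 0.0, -1.0],
])
assert np.allclose(W @ x, direct)  # identical results
print(direct)  # [-2. -2. -2.]
```

A dense layer would let all 15 entries of `W` vary freely; the convolutional layer constrains them to 3 shared values, which is the whole point of the quote.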

Thus, this sentence is a great way to introduce a new idea (convolution) in the context of an older idea (matrix multiplication).

Proof of my purchase from Amazon :) I believe it was even more expensive at that time!


Leon P.

Solutions Architect @ Databricks

8 months ago

This 1 is great, definitely recommend

Venkat dharaneswar reddy

Currently pursuing my B.Tech in Artificial Intelligence and Data Science at Amrita Vishwa Vidyapeetham

8 months ago

Thanks for recommending this book; I was also searching for books that explain things from a mathematical perspective. I am really interested in reading books. Thank you

HENRY STEVENSON-PEREZ

Medical Doctor in Immunology-Medicine & Knowledge-Physics Researcher at SKMRI

8 months ago

Please consider reading this week’s Cagle Report, that defines the irreconcilable (and real-breaking) irreconcilable differences between AI-machines & Trust: https://www.dhirubhai.net/pulse/when-trust-dies-kurt-cagle-mocmc/?trackingId=AIRiv%2Fm7Rm6k1LTl4DxUvg%3D%3D Sincerely- Your SKMRI Knowledge-Physics Lab Colleagues SKMRI.org
