Understanding convolutional neural networks from a maths perspective as an exception to multilayer perceptrons
On 30 Nov 2016, just before I launched the Oxford course in 2017, I purchased a (relatively) expensive book called Deep Learning.
I have read it extensively.
Its one of the few books that approaches the subject of machine learning and deep learning from a maths perspective. I always recommend you buy it but there is an official version available for free here (Deep Learning Book)
Maths allows you to understand ideas in a simple and elegant way.
Here is one example from this book
It contains a sentence
Convolutional networks are simply neural networks that use convolution in place of general matrix multiplication in at least one of their layers.
I really like this and have used it in my teaching because you can use it to explain Convolutional neural networks as an exception to multi layer perceptrons,
We have already seen that you can explain multilayer perceptrons in terms of general matrix multiplication where we saw that in traditional fully connected neural networks, also known as dense networks, each neuron in a layer is connected to every neuron in the previous layer through weights.
领英推荐
The computations in these networks are typically performed using general matrix multiplication, where the input data is represented as a vector or matrix, and matrix multiplications are used to propagate information through the network.
On the other hand, convolutional neural networks introduce the concept of convolution, which is a specialized operation that replaces general matrix multiplication in at least one of their layers. Specifically, convolution is used in the convolutional layers of CNNs.
Unlike fully connected neural networks where each neuron is connected to all neurons in the previous layer, CNNs employ local connectivity. In convolutional layers, each neuron is only connected to a small receptive field of the input data, defined by the size of the kernel. This local connectivity allows the network to capture local patterns and spatial relationships.
By utilizing convolution instead of general matrix multiplication, CNNs leverage the spatial structure of the input data, such as images or sequences, more effectively.
In summary, the statement highlights that convolutional networks, or CNNs, are a specific type of neural network that employ convolution operations instead of general matrix multiplication in at least one layer. This choice of convolution allows CNNs to effectively handle spatially structured data and extract relevant features from it.
Thus, this sentence is a great way to introduce a new idea(convolution) in context of an older idea(matrix multiplication)
Proof of my purchase from amazon :) I believe it was even more expensive at that time!
Solutions Architect @ Databricks
8 个月This 1 is great, definitely recommend
Currently pursuing my b. tech in, Artificial intelligence and data science, in Amrita Vishwa Vidyapeetham
8 个月Thanks for recommending this book, I was also searching for the books which will explain from a mathematical perspective. I am really interested in reading books. Thank you
Medical Doctor in Immunology-Medicine & Knowledge-Physics Researcher at SKMRI
8 个月Please consider reading this week’s Cagle Report, that defines the irreconcilable (and real-breaking) irreconcilable differences between AI-machines & Trust: https://www.dhirubhai.net/pulse/when-trust-dies-kurt-cagle-mocmc/?trackingId=AIRiv%2Fm7Rm6k1LTl4DxUvg%3D%3D Sincerely- Your SKMRI Knowledge-Physics Lab Colleagues SKMRI.org