Implementation from Scratch: Forward and Back Propagation of a Pooling Layer
Maria Alejandra Coy Ulloa
You can find the code for the forward propagation here and for the backpropagation here.
A convolutional layer in a convolutional neural network extracts and maps the features of an input image. The feature map produced by each convolution is highly sensitive to the exact location of a feature in the input, and that is a problem. One way to manage it is to downsample the feature maps, making the resulting down-sampled feature maps more robust to small shifts of the input.
In this sense, pooling layers let us summarize the presence of those features in the input; that is, they reduce the spatial dimensions of the input volume for the following layers. They affect only the width and height, not the depth, and they have no learnable parameters. Two common pooling methods are average pooling and max pooling.
Usually, a pooling layer is added after the convolutional layer, specifically after a nonlinearity (e.g. ReLU) has been applied to the feature maps output by the convolution. For example, the layers in a model may look as follows:
Image => Convolutional Layer => Nonlinearity => Pooling Layer
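As a minimal sketch of that ordering, assuming a feature map conv_out already produced by some convolutional layer (the convolution itself is out of scope here, so a random placeholder is used), the nonlinearity and the pool_forward function defined later in this article compose like this:

import numpy as np

# Placeholder standing in for the output of a convolutional layer, shape (m, h, w, c).
conv_out = np.random.randn(4, 28, 28, 8)

# Nonlinearity (ReLU) applied element-wise to the feature maps.
activated = np.maximum(0, conv_out)

# Pooling summarizes each 2x2 window; pool_forward is defined in the next section.
pooled = pool_forward(activated, (2, 2), stride=(2, 2), mode='max')
print(pooled.shape)  # (4, 14, 14, 8)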
The pooling layer takes an input volume of size w1 × h1 × c1 and uses two hyperparameters: the filter size f and the stride s. The output volume has size w2 × h2 × c2, where w2 = (w1 − f) / s + 1, h2 = (h1 − f) / s + 1, and c2 = c1.
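For example, a 28 × 28 input pooled with a 2 × 2 filter and a stride of 2 gives w2 = (28 − 2) / 2 + 1 = 14, so the output is 14 × 14 with the same number of channels. A small sketch of that calculation (the helper name is just for illustration):

def pool_output_size(w1, h1, f, s):
    """Spatial dimensions after pooling with an f x f filter and stride s."""
    w2 = (w1 - f) // s + 1
    h2 = (h1 - f) // s + 1
    return w2, h2

print(pool_output_size(28, 28, 2, 2))  # (14, 14) -- channels are unchanged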
So we can infer that, when doing forward and backpropagation, these layers need to be treated as a separate kind of layer.
Forward Propagation
The max pool layer and the average pool layer work much like a convolution layer, but in this case we take the maximum value or the mean over each receptive field of the input (remembering, in the max case, where the maximum was for the backward pass) and produce a summarized output volume.
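As a tiny illustration before the full code, take a single 2 × 2 receptive field: max pooling keeps the largest value, average pooling keeps the mean.

import numpy as np

window = np.array([[1., 3.],
                   [2., 0.]])   # one 2x2 receptive field
print(np.max(window))   # 3.0  -> value kept by max pooling
print(np.mean(window))  # 1.5  -> value kept by average pooling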
#!/usr/bin/env python3
"""Convolutional Neural Networks"""
import numpy as np
import matplotlib.pyplot as plt


def pool_forward(A_prev, kernel_shape, stride=(1, 1), mode='max'):
    """Pooling forward prop for a batch of 3D images (e.g. RGB).
    Arg:
        A_prev: output of the previous layer, shape (m, h_prev, w_prev, c_prev)
        kernel_shape: pooling window dimensions, tuple (kh, kw)
        stride: tuple (sh, sw)
        mode: indicates if 'max' or 'avg'
    Return: output of the pooling layer
    """
    m, h_prev, w_prev, c_prev = A_prev.shape
    k_h, k_w = kernel_shape
    # output spatial dimensions: (input - kernel) / stride + 1
    out_h = int(((h_prev - k_h) / stride[0]) + 1)
    out_w = int(((w_prev - k_w) / stride[1]) + 1)
    output_conv = np.zeros((m, out_h, out_w, c_prev))
    m_A_prev = np.arange(0, m)
    for i in range(out_h):
        for j in range(out_w):
            # slice the receptive field for every image in the batch at once
            if mode == 'max':
                output_conv[m_A_prev, i, j] = np.max(
                    A_prev[m_A_prev,
                           i * stride[0]:k_h + (i * stride[0]),
                           j * stride[1]:k_w + (j * stride[1])],
                    axis=(1, 2))
            if mode == 'avg':
                output_conv[m_A_prev, i, j] = np.mean(
                    A_prev[m_A_prev,
                           i * stride[0]:k_h + (i * stride[0]),
                           j * stride[1]:k_w + (j * stride[1])],
                    axis=(1, 2))
    return output_conv


if __name__ == "__main__":
    np.random.seed(0)
    lib = np.load('../data/MNIST.npz')
    X_train = lib['X_train']
    m, h, w = X_train.shape
    X_train_a = X_train.reshape((-1, h, w, 1))
    X_train_b = 1 - X_train_a
    # build a 2-channel input: the image and its inverse
    X_train_c = np.concatenate((X_train_a, X_train_b), axis=3)
    print(X_train_c.shape)
    plt.imshow(X_train_c[0, :, :, 0])
    plt.show()
    plt.imshow(X_train_c[0, :, :, 1])
    plt.show()
    A = pool_forward(X_train_c, (2, 2), stride=(2, 2))
    print(A.shape)
    plt.imshow(A[0, :, :, 0])
    plt.show()
    plt.imshow(A[0, :, :, 1])
    plt.show()
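With the 28 × 28 MNIST images, the 2 × 2 kernel and stride of 2 used above halve the spatial dimensions, so the printed output shape is (m, 14, 14, 2), matching the formula from earlier. A quick shape sanity check, assuming pool_forward from the script above is in scope (the random batch is only a placeholder):

X_small = np.random.randn(5, 28, 28, 2)
A_small = pool_forward(X_small, (2, 2), stride=(2, 2), mode='avg')
print(A_small.shape)  # (5, 14, 14, 2)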
Backward Propagation
For the backward pass in a max pool layer, we route the gradient: we start with a zero matrix and add the gradient from above at the index of the maximum in each window. On the other hand, if we treat it as an average pool layer, we spread the gradient from above evenly over every cell of the window, dividing it by the window size.
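Reusing the small 2 × 2 window from the forward example, this is what that routing looks like for a single window with an incoming gradient of, say, 5 (just an illustrative number):

import numpy as np

window = np.array([[1., 3.],
                   [2., 0.]])
grad_from_above = 5.0

# Max pooling: only the position of the maximum receives the gradient.
mask = (window == np.max(window))
print(grad_from_above * mask)
# [[0. 5.]
#  [0. 0.]]

# Average pooling: the gradient is spread evenly over the window.
print(np.full_like(window, grad_from_above / window.size))
# [[1.25 1.25]
#  [1.25 1.25]]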
#!/usr/bin/env python3
"""Convolutional Neural Networks"""
import numpy as np


def pool_backward(dA, A_prev, kernel_shape, stride=(1, 1), mode='max'):
    """Pooling back prop for a batch of 3D images (e.g. RGB).
    Arg:
        dA: partial derivatives w.r.t. the pooling output, shape (m, h_new, w_new, c_new)
        A_prev: output of the previous layer, shape (m, h_prev, w_prev, c)
        kernel_shape: filter dimensions, tuple (kh, kw)
        stride: tuple (sh, sw)
        mode: 'max' or 'avg'
    Returns: partial derivatives w.r.t. the previous layer (dA_prev)
    """
    k_h, k_w = kernel_shape
    m, h_new, w_new, c_new = dA.shape
    m, h_x, w_x, c_prev = A_prev.shape
    s_h, s_w = stride
    dx = np.zeros_like(A_prev)
    for i in range(m):
        for h in range(h_new):
            for w in range(w_new):
                for f in range(c_new):
                    if mode == 'max':
                        # mask marks the position of the maximum in the window
                        tmp = A_prev[i, h * s_h:k_h + (h * s_h),
                                     w * s_w:k_w + (w * s_w), f]
                        mask = (tmp == np.max(tmp))
                        dx[i, h * s_h:(h * s_h) + k_h,
                           w * s_w:(w * s_w) + k_w, f] += dA[i, h, w, f] * mask
                    if mode == 'avg':
                        # spread the gradient evenly over the window
                        dx[i, h * s_h:(h * s_h) + k_h,
                           w * s_w:(w * s_w) + k_w, f] += dA[i, h, w, f] / k_h / k_w
    return dx


if __name__ == "__main__":
    np.random.seed(0)
    lib = np.load('../data/MNIST.npz')
    X_train = lib['X_train']
    _, h, w = X_train.shape
    X_train_a = X_train[:10].reshape((-1, h, w, 1))
    X_train_b = 1 - X_train_a
    X_train_c = np.concatenate((X_train_a, X_train_b), axis=3)
    dA = np.random.randn(10, h // 3, w // 3, 2)
    print(pool_backward(dA, X_train_c, (3, 3), stride=(3, 3)))
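A useful way to convince yourself the backward pass is correct is a finite-difference check: perturb one input entry, see how the sum of the pooled output changes, and compare that numerical slope with the corresponding entry returned by pool_backward. This is only a sketch, assuming pool_forward and pool_backward from above are importable in the same script; the tiny batch, the chosen index, and the use of 'avg' mode (which avoids ties at the maximum) are arbitrary choices:

import numpy as np

np.random.seed(1)
A_prev = np.random.randn(2, 6, 6, 3)
kernel, stride, mode = (2, 2), (2, 2), 'avg'

# Treat the loss as the sum of the pooled output, so dA is all ones.
out = pool_forward(A_prev, kernel, stride=stride, mode=mode)
dA = np.ones_like(out)
grad = pool_backward(dA, A_prev, kernel, stride=stride, mode=mode)

# Numerically estimate the gradient for one entry of A_prev.
eps = 1e-5
i, h, w, c = 0, 2, 3, 1
A_plus = A_prev.copy()
A_plus[i, h, w, c] += eps
A_minus = A_prev.copy()
A_minus[i, h, w, c] -= eps
num_grad = (pool_forward(A_plus, kernel, stride=stride, mode=mode).sum()
            - pool_forward(A_minus, kernel, stride=stride, mode=mode).sum()) / (2 * eps)

print(grad[i, h, w, c], num_grad)  # the two values should match closely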
I hope this article helps you understand the intuition behind the forward and backpropagation in a pooling layer. If you have any comments or fixes, please do not hesitate to contact me or send me an email.
You can find more projects and machine learning paper implementations on my GitHub.