Implementation from Scratch: Forward and Back Propagation of a Pooling Layer

You can find the full code for the forward propagation and the backpropagation below.

A convolutional layer in a convolutional neural network extracts and maps the features of an input image. The output feature map of each convolution is highly sensitive to the location of the feature in the input, and that is a problem. One way to manage it is to downsample the feature maps, making the resulting down-sampled feature maps more robust to small shifts of the input.

In this sense, pooling layers give us a way to summarize the presence of those features in the input; in other words, they reduce the spatial dimensions of the input volume for the next layers, affecting only the width and height but not the depth. A pooling layer has no learnable parameters. Two common pooling methods are average pooling and max pooling.

Usually, a pooling layer is added after the convolutional layer, specifically after a nonlinearity (e.g. ReLU) has been applied to the feature maps output by the convolutional layer. For example, the layers in a model may look as follows:

Image => Convolutional Layer => Nonlinearity => Pooling Layer
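
As a quick illustration of what pooling does, consider a 4×4 input reduced to 2×2 by a 2×2 window with stride 2 (a minimal NumPy sketch; the full implementation follows below):

    import numpy as np

    x = np.array([[1., 2., 5., 6.],
                  [3., 4., 7., 8.],
                  [9., 8., 3., 2.],
                  [7., 6., 1., 0.]])

    # Split the 4x4 input into four non-overlapping 2x2 windows,
    # then collapse each window to a single value.
    windows = x.reshape(2, 2, 2, 2).transpose(0, 2, 1, 3)
    print(windows.max(axis=(2, 3)))   # max pooling -> [[4. 8.] [9. 3.]]
    print(windows.mean(axis=(2, 3)))  # avg pooling -> [[2.5 6.5] [7.5 1.5]]

Note that the width and height are halved while the depth (here a single channel) is untouched, exactly as described above.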


The pooling layer takes an input volume of size w1 × h1 × c1 and uses two hyperparameters: the filter size f and the stride s. The output volume has size w2 × h2 × c2, where w2 = (w1 − f) / s + 1, h2 = (h1 − f) / s + 1, and c2 = c1.
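
For example (a quick check of the formula, assuming no padding), a 28×28 MNIST image pooled with a 2×2 filter and stride 2 comes out as 14×14:

    w1, h1 = 28, 28          # input width and height
    f, s = 2, 2              # filter size and stride
    w2 = (w1 - f) // s + 1   # 14
    h2 = (h1 - f) // s + 1   # 14
    print(w2, h2)            # the channel count c is unchanged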


From this we can infer that, during the forward and backward passes, pooling layers need to be treated as a layer of their own.

Forward Propagation

A max pool or average pool layer is similar to a convolution layer, but instead of applying a filter we select the maximum value or the mean over each receptive field of the input, saving the indices of the maxima (needed later for backpropagation), and produce a summarized output volume.

    #!/usr/bin/env python3
    """Convolutional Neural Networks"""
    import numpy as np
    import matplotlib.pyplot as plt


    def pool_forward(A_prev, kernel_shape, stride=(1, 1), mode='max'):
        """Forward propagation over a pooling layer.

        Arg:
           A_prev: output of the previous layer, shape (m, h_prev, w_prev, c_prev)
           kernel_shape: pooling filter dimensions, tuple (kh, kw)
           stride: tuple (sh, sw)
           mode: 'max' for max pooling, 'avg' for average pooling

        Return: output of the pooling layer
        """
        m, h_prev, w_prev, c_prev = A_prev.shape
        k_h, k_w = kernel_shape
        s_h, s_w = stride

        out_h = (h_prev - k_h) // s_h + 1
        out_w = (w_prev - k_w) // s_w + 1
        output = np.zeros((m, out_h, out_w, c_prev))

        # Slide the window over the input and summarize each receptive
        # field with its maximum or its mean, per channel, for the
        # whole batch at once.
        for i in range(out_h):
            for j in range(out_w):
                window = A_prev[:,
                                i * s_h:i * s_h + k_h,
                                j * s_w:j * s_w + k_w]
                if mode == 'max':
                    output[:, i, j] = np.max(window, axis=(1, 2))
                elif mode == 'avg':
                    output[:, i, j] = np.mean(window, axis=(1, 2))

        return output

    if __name__ == "__main__":
        np.random.seed(0)
        lib = np.load('../data/MNIST.npz')
        X_train = lib['X_train']
        m, h, w = X_train.shape
        X_train_a = X_train.reshape((-1, h, w, 1))
        X_train_b = 1 - X_train_a
        X_train_c = np.concatenate((X_train_a, X_train_b), axis=3)

        print(X_train_c.shape)
        plt.imshow(X_train_c[0, :, :, 0])
        plt.show()
        plt.imshow(X_train_c[0, :, :, 1])
        plt.show()

        A = pool_forward(X_train_c, (2, 2), stride=(2, 2))
        print(A.shape)
        plt.imshow(A[0, :, :, 0])
        plt.show()
        plt.imshow(A[0, :, :, 1])
        plt.show()
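
As a quick sanity check (a minimal sketch, assuming pool_forward is defined as above), pooling the small hand-checked input from the beginning of the article reproduces the values computed by eye:

    # one sample, one channel: the 4x4 example from earlier
    x = np.array([[1., 2., 5., 6.],
                  [3., 4., 7., 8.],
                  [9., 8., 3., 2.],
                  [7., 6., 1., 0.]]).reshape(1, 4, 4, 1)

    print(pool_forward(x, (2, 2), stride=(2, 2), mode='max')[0, :, :, 0])
    # [[4. 8.] [9. 3.]]
    print(pool_forward(x, (2, 2), stride=(2, 2), mode='avg')[0, :, :, 0])
    # [[2.5 6.5] [7.5 1.5]]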

Backward Propagation

For the backward pass in a max pool layer we only pass on the gradient: we start with a zero matrix and place the gradient from above at the index of the maximum in each window. On the other hand, if we treat it as an average pool layer, we fill every cell of the window with an equal share of the gradient from above (the gradient divided by the window size).
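
As a concrete illustration of that routing (a minimal sketch with made-up numbers), a gradient of 5 arriving at one 2×2 max-pool window lands only on the position of the maximum, while average pooling spreads it evenly:

    import numpy as np

    window = np.array([[1., 3.],
                       [2., 0.]])          # receptive field seen in the forward pass
    mask = (window == np.max(window))      # True only at the max position
    print(mask * 5.0)                      # max pool: [[0. 5.] [0. 0.]]
    print(np.full((2, 2), 5.0 / 4))        # avg pool: 1.25 in every cell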

    #!/usr/bin/env python3
    """Convolutional Neural Networks"""
    import numpy as np


    def pool_backward(dA, A_prev, kernel_shape, stride=(1, 1), mode='max'):
        """Backward propagation over a pooling layer.

        Arg:
           dA: partial derivatives w.r.t. the pooling output (m, h_new, w_new, c_new)
           A_prev: output of the previous layer (m, h_prev, w_prev, c)
           kernel_shape: pooling filter dimensions, tuple (kh, kw)
           stride: tuple (sh, sw)
           mode: 'max' or 'avg'

        Returns: partial derivatives w.r.t. the previous layer (dA_prev)
        """
        k_h, k_w = kernel_shape
        m, h_new, w_new, c_new = dA.shape
        s_h, s_w = stride

        dx = np.zeros_like(A_prev)

        for i in range(m):
            for h in range(h_new):
                for w in range(w_new):
                    for f in range(c_new):
                        if mode == 'max':
                            # Route the gradient only to the position of
                            # the maximum inside the pooling window.
                            tmp = A_prev[i, h * s_h:h * s_h + k_h,
                                         w * s_w:w * s_w + k_w, f]
                            mask = (tmp == np.max(tmp))
                            dx[i, h * s_h:h * s_h + k_h,
                               w * s_w:w * s_w + k_w, f] += dA[i, h, w, f] * mask
                        elif mode == 'avg':
                            # Spread the gradient evenly over the window.
                            dx[i, h * s_h:h * s_h + k_h,
                               w * s_w:w * s_w + k_w, f] += dA[i, h, w, f] / (k_h * k_w)

        return dx



    if __name__ == "__main__":
        np.random.seed(0)
        lib = np.load('../data/MNIST.npz')
        X_train = lib['X_train']
        _, h, w = X_train.shape
        X_train_a = X_train[:10].reshape((-1, h, w, 1))
        X_train_b = 1 - X_train_a
        X_train_c = np.concatenate((X_train_a, X_train_b), axis=3)

        dA = np.random.randn(10, h // 3, w // 3, 2)
        print(pool_backward(dA, X_train_c, (3, 3), stride=(3, 3)))
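
A useful way to convince yourself the backward pass is correct (a minimal sketch, assuming both pool_forward and pool_backward above are available in the same script) is a numerical gradient check: perturb one input cell and compare the change in a scalar loss against the analytic gradient. Average pooling is the easier mode to check because its gradient does not depend on which element happens to be the maximum:

    np.random.seed(1)
    x = np.random.randn(1, 4, 4, 1)
    eps = 1e-5

    # scalar loss = sum of the pooled output, so dA is all ones
    dA = np.ones((1, 2, 2, 1))
    dx = pool_backward(dA, x, (2, 2), stride=(2, 2), mode='avg')

    # numerical derivative of the loss w.r.t. one input cell
    x_p = x.copy(); x_p[0, 1, 1, 0] += eps
    x_m = x.copy(); x_m[0, 1, 1, 0] -= eps
    num = (pool_forward(x_p, (2, 2), stride=(2, 2), mode='avg').sum()
           - pool_forward(x_m, (2, 2), stride=(2, 2), mode='avg').sum()) / (2 * eps)
    print(dx[0, 1, 1, 0], num)   # both should be 0.25 = 1 / (2 * 2)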

I hope this article helps you understand the intuition behind forward and backward propagation in a pooling layer. If you have any comments or fixes, please do not hesitate to contact me or send me an email.

You can find more projects and machine learning paper implementations on my GitHub.

