Image Processing: Convolution filters and Calculation of image gradients
Sample image showing kernel operation

Convolution filters are a fundamental building block in image processing and computer vision. They are used to extract specific features from an image by applying a small matrix of numbers, called the kernel or filter, to the image. The result of the convolution is a new image, where each pixel is a weighted sum of the pixels in a small neighbourhood around the same location in the original image. The weights are determined by the values in the kernel.

A convolution filter can be thought of as a sliding window that moves across the image, applying the kernel to small regions of the image. The convolution operation is performed by element-wise multiplying the kernel with the region of the image that is currently under the window, and then summing the results. This process is repeated for every location in the image, resulting in a new image where each pixel is the sum of the products of the kernel values and the corresponding pixel values in the original image.
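The sliding-window description above can be sketched in a few lines of NumPy. This is a minimal illustration, not how OpenCV implements it: the function name apply_kernel, the toy image, and the box-blur kernel are all made up for this example, and only positions where the kernel fits entirely inside the image are computed. Like cv2.filter2D, it does not flip the kernel (strictly speaking this is a correlation), which makes no difference for symmetric kernels.

```python
import numpy as np


def apply_kernel(image, kernel):
    """Slide `kernel` over a 2D `image` and return the weighted sums.

    Only windows that fit entirely inside the image are evaluated,
    so the output is smaller than the input.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    output = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            window = image[i:i + kh, j:j + kw]      # region under the kernel
            output[i, j] = np.sum(window * kernel)  # multiply element-wise, then sum
    return output


# Hypothetical 4x4 toy image and a 3x3 averaging (box blur) kernel
toy_image = np.arange(16, dtype=float).reshape(4, 4)
box_kernel = np.ones((3, 3)) / 9.0
blurred = apply_kernel(toy_image, box_kernel)
print(blurred)
```

Each output value here is simply the mean of the nine pixels under the window, which is the "weighted sum" with all weights equal to 1/9.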

Here is an example of a convolution filter in Python using the popular image processing library OpenCV:

import numpy as np
import cv2


# Load the image
image = cv2.imread("good_boy.jpeg")


# Define the kernel
kernel = np.array([[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]])


# Apply the convolution filter
result = cv2.filter2D(image, -1, kernel)


# Display the original and filtered images
cv2.imshow("Original Image", image)
cv2.imshow("Filtered Image", result)
cv2.waitKey(0)
cv2.destroyAllWindows()


        

In the above example, the convolution filter is applied to the image using the cv2.filter2D() function. The first argument is the image, the second argument is the depth of the output image (-1 means the same depth as the input image), and the third argument is the kernel.

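The kernel defined in the example is a simple edge detection filter: an 8-neighbour discrete Laplacian (with its sign flipped so that the centre weight is positive), which detects edges by approximating the second derivative of the image. It is sometimes confused with the "Laplacian of Gaussian" (LoG) filter, but LoG first smooths the image with a Gaussian, which this kernel does not do.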
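To see why this kernel responds to edges and not to flat regions, it can be applied by hand to two small patches. The patches below are invented for illustration. Because the kernel weights sum to zero, a region of constant intensity produces no response, while a step in intensity does.

```python
import numpy as np

# Same kernel as in the OpenCV example above
laplacian_kernel = np.array([[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]])

# Hypothetical 3x3 patches: one flat, one with a vertical step on the right
flat_patch = np.full((3, 3), 100)
edge_patch = np.array([[100, 100, 200],
                       [100, 100, 200],
                       [100, 100, 200]])

# Response at the centre pixel: element-wise product, then sum
flat_response = int(np.sum(laplacian_kernel * flat_patch))
edge_response = int(np.sum(laplacian_kernel * edge_patch))
print(flat_response, edge_response)
```

The response can be negative as well as positive, depending on which side of the edge the centre pixel sits.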

Below we take a look at the result of applying this filter :

Before applying the edge detection filter
After applying the edge detection filter

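There are many types of filters, such as edge detection filters (Sobel, Scharr, Canny), blur filters (Gaussian, Median, Bilateral), and sharpen filters (Laplacian, Unsharp Mask), each with a specific purpose. Most of them are defined by a fixed kernel. Canny is a multi-stage algorithm built on top of gradient filters, and the median and bilateral filters are non-linear, so these cannot be written as a single convolution kernel.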

Convolution filters are a powerful technique for extracting features from images and are widely used in various applications such as image enhancement, object detection, and image recognition.

As described above we can broadly divide the main filters into three categories :

  • Edge detection filters (Sobel, Scharr, Canny)
  • Blur filters (Gaussian, Median, Bilateral)
  • Sharpen filters (Laplacian, Unsharp Mask)

In this article we will mainly focus on edge detection.

Theory of edge detection :

1. An edge in an image is represented by a sudden change in the intensity value of the pixels. This change in intensity can be described by derivatives. The larger the gradient, the sharper the change in intensity at that point in the image.

2. In a 1-dimensional image, the edge can be represented by a jump in the intensity value.

3. The jump in the intensity value can be observed more carefully by calculating the first derivative.

4. The edge detection method works on the principle of identifying pixels in the image with a higher gradient value than their neighbours.
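The four points above can be illustrated with a 1-dimensional "image". The signal values below are made up: a dark region followed by a bright region. The first derivative is approximated by the difference between neighbouring samples, and the edge is where that difference is largest in absolute value.

```python
import numpy as np

# Hypothetical 1D image: intensity jumps from 10 to 200 between index 3 and 4
signal = np.array([10, 10, 10, 10, 200, 200, 200, 200], dtype=float)

# Discrete first derivative: signal[k + 1] - signal[k]
first_derivative = np.diff(signal)

# The edge is located where the derivative magnitude is largest
edge_index = int(np.argmax(np.abs(first_derivative)))
print(first_derivative, edge_index)
```

The derivative is zero everywhere except at the jump, so the position of the edge stands out far more clearly than in the raw intensity values.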

In this article we will be discussing Sobel and Scharr filters :

Sobel filters : Sobel filters are used to detect edges in an image. They are based on the Sobel operator, a discrete differentiation operator that approximates the gradient of the image intensity. The Sobel operator uses two 3x3 kernels: one estimates the derivative in the horizontal (x) direction and so responds most strongly to vertical edges, and the other estimates the derivative in the vertical (y) direction and so responds most strongly to horizontal edges. The kernels are typically defined as follows:

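# Sobel kernel for the horizontal (x) derivative; responds to vertical edges
kernel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])


# Sobel kernel for the vertical (y) derivative; responds to horizontal edges
kernel_y = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]])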
        

To apply the Sobel filter to an image, the image is convolved with each of these kernels. A naive implementation is a nested for loop that slides the kernel over the image, multiplies the kernel values with the corresponding pixel values, and sums the products. Each convolution yields a 2D array holding one gradient component (gx or gy) for every pixel; the two components are then combined into the gradient magnitude, sqrt(gx^2 + gy^2).
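Before looking at the full program, here is a small sketch of what happens at a single pixel. The 3x3 patch is invented for illustration, and computing the direction with arctan2 is an addition of this sketch rather than something the article's code does.

```python
import numpy as np

kernel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])
kernel_y = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]])

# Hypothetical 3x3 neighbourhood: dark on the left, bright on the right
patch = np.array([[0, 0, 10],
                  [0, 0, 10],
                  [0, 0, 10]], dtype=float)

gx = np.sum(kernel_x * patch)  # derivative estimate in the x direction
gy = np.sum(kernel_y * patch)  # derivative estimate in the y direction

magnitude = np.hypot(gx, gy)                # sqrt(gx**2 + gy**2)
direction = np.degrees(np.arctan2(gy, gx))  # angle of the gradient in degrees
print(gx, gy, magnitude, direction)
```

The intensity only changes from left to right, so all of the gradient ends up in gx and gy is zero.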

Here is some sample Python code that demonstrates how to use the Sobel filter to detect edges in an image:

from matplotlib.image import imread
import matplotlib.pyplot as plt
import numpy as np


# #---------------------------------------------------------------------------------------------------------------------
# PART I - Transforming an image from color to grayscale
# #---------------------------------------------------------------------------------------------------------------------


# Here we import the image file as an array of shape (nx, ny, nz)
image_file = '/Users/abhishake/Desktop/good_boy.jpeg'
input_image = imread(image_file)  # this is the array representation of the input image
[nx, ny, nz] = np.shape(input_image)  # nx: height, ny: width, nz: colors (RGB)


# Extracting each one of the RGB components
r_img, g_img, b_img = input_image[:, :, 0], input_image[:, :, 1], input_image[:, :, 2]


# The following operation will take weights and parameters to convert the color image to grayscale
gamma = 1.400  # a parameter
r_const, g_const, b_const = 0.2126, 0.7152, 0.0722  # weights for the RGB components respectively
grayscale_image = r_const * r_img ** gamma + g_const * g_img ** gamma + b_const * b_img ** gamma


# This command will display the grayscale image alongside the original image
fig1 = plt.figure(1)
ax1, ax2 = fig1.add_subplot(121), fig1.add_subplot(122)
ax1.imshow(input_image)
ax2.imshow(grayscale_image, cmap=plt.get_cmap('gray'))
fig1.show()
plt.show()

# #---------------------------------------------------------------------------------------------------------------------
# PART II - Applying the Sobel operator
# #---------------------------------------------------------------------------------------------------------------------


"""
The kernels Gx and Gy can be thought of as a differential operation in the "input_image" array in the directions x and y 
respectively. These kernels are represented by the following matrices:
      _               _                   _                _
     |                 |                 |                  |
     | 1.0   0.0  -1.0 |                 |  1.0   2.0   1.0 |
Gx = | 2.0   0.0  -2.0 |    and     Gy = |  0.0   0.0   0.0 |
     | 1.0   0.0  -1.0 |                 | -1.0  -2.0  -1.0 |
     |_               _|                 |_                _|
"""


# Here we define the matrices associated with the Sobel filter
Gx = np.array([[1.0, 0.0, -1.0], [2.0, 0.0, -2.0], [1.0, 0.0, -1.0]])
Gy = np.array([[1.0, 2.0, 1.0], [0.0, 0.0, 0.0], [-1.0, -2.0, -1.0]])
[rows, columns] = np.shape(grayscale_image)  # we need to know the shape of the input grayscale image
sobel_filtered_image = np.zeros(shape=(rows, columns))  # initialization of the output image array (all elements are 0)


# Now we "sweep" the image in both x and y directions and compute the output
for i in range(rows - 2):
    for j in range(columns - 2):
        gx = np.sum(np.multiply(Gx, grayscale_image[i:i + 3, j:j + 3]))  # x direction
        gy = np.sum(np.multiply(Gy, grayscale_image[i:i + 3, j:j + 3]))  # y direction
        sobel_filtered_image[i + 1, j + 1] = np.sqrt(gx ** 2 + gy ** 2)  # calculate the "hypotenuse"


# Display the original image and the Sobel filtered image
fig2 = plt.figure(2)
ax1, ax2 = fig2.add_subplot(121), fig2.add_subplot(122)
ax1.imshow(input_image)
ax2.imshow(sobel_filtered_image, cmap=plt.get_cmap('gray'))
fig2.show()
plt.show()
        

This Python code performs two main tasks:

1. Transforming an image from color to grayscale:

  • It imports an image file and converts it into an array representation.
  • Extracts the Red, Green, and Blue (RGB) components of the image.
  • Applies weights and a parameter to the RGB components to convert the color image to grayscale.
  • Displays the original image and the grayscale image side by side.

2. Applying the Sobel operator:

  • Defines two matrices, Gx and Gy, associated with the Sobel filter.
  • Initializes an output image array with all elements set to 0.
  • "Sweeps" the grayscale image in both the x and y directions and applies the Sobel operator to compute the output.
  • Displays the original image and the Sobel filtered image side by side.
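The nested loop above is easy to follow but slow on large images, and it leaves the outermost rows and columns at 0. As a hedged alternative, the same computation can be sketched without explicit loops. The function name sobel_magnitude and the demo image are made up for this example, sliding_window_view needs NumPy 1.20 or newer, and replicating the border pixels with np.pad(mode="edge") is only one of several possible ways to handle the image border.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def sobel_magnitude(gray):
    """Return the Sobel gradient magnitude of a 2D array, same shape as input."""
    Gx = np.array([[1.0, 0.0, -1.0], [2.0, 0.0, -2.0], [1.0, 0.0, -1.0]])
    Gy = np.array([[1.0, 2.0, 1.0], [0.0, 0.0, 0.0], [-1.0, -2.0, -1.0]])
    # Assumption: replicate the border pixels so every pixel has a full 3x3 window
    padded = np.pad(gray.astype(float), 1, mode="edge")
    # windows[i, j] is the 3x3 neighbourhood centred on pixel (i, j)
    windows = sliding_window_view(padded, (3, 3))
    gx = np.sum(windows * Gx, axis=(2, 3))
    gy = np.sum(windows * Gy, axis=(2, 3))
    return np.sqrt(gx ** 2 + gy ** 2)


# Hypothetical test image: left half dark, right half bright
demo = np.zeros((5, 6))
demo[:, 3:] = 1.0
edges = sobel_magnitude(demo)
print(edges)
```

Only the two columns on either side of the step light up; everything else, including the padded border, stays at zero.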

Convert image to grayscale
Apply Sobel filter


Scharr filters : The Scharr operator, like the Sobel operator, identifies and highlights edges in an image using the first derivative. It is commonly used to compute gradients along the x-axis (dx = 1, dy = 0) and the y-axis (dx = 0, dy = 1). Its results are quite similar to those of the Sobel operator.

The Scharr operator is a refinement of the Sobel operator, and the two detect edges on the same principle. It uses larger weight coefficients (3, 10, 3 in place of Sobel's 1, 2, 1), which amplifies the differences between neighbouring pixel values.
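The effect of the larger weights can be checked on a single patch. The sketch below uses the standard 3x3 x-direction kernels for both operators and an invented step-edge patch; it only compares raw responses and says nothing about accuracy.

```python
import numpy as np

# Standard 3x3 kernels for the x derivative
sobel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])
scharr_x = np.array([[-3, 0, 3], [-10, 0, 10], [-3, 0, 3]])

# Hypothetical patch with a unit step from left to right
step_patch = np.array([[0, 0, 1],
                       [0, 0, 1],
                       [0, 0, 1]])

sobel_response = int(np.sum(sobel_x * step_patch))
scharr_response = int(np.sum(scharr_x * step_patch))
print(sobel_response, scharr_response)
```

For the same edge the Scharr kernel produces a response four times larger than the Sobel kernel, which is why Scharr outputs are often rescaled before display.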

The Scharr gradient for an image can be calculated in different directions depending on the value of the dx and dy parameters:

1. X-direction Scharr derivative: The Scharr derivative is computed in the X-direction by setting the value for the x derivative as 1 and the value for y derivative as 0. Therefore, dx will be equal to 1 and dy will be equal to 0.

2. Y-direction Scharr derivative: The Scharr derivative is computed in the Y-direction by setting the value for the x derivative as 0 and the value for y derivative as 1. Therefore, dx will be equal to 0 and dy will be equal to 1.

3. X and Y direction Scharr derivative: The Scharr derivative cannot be computed for both X and Y directions in a single call; cv2.Scharr expects exactly one of dx and dy to be 1. To get the full gradient, compute the X and Y derivatives separately and combine them.

# Importing OpenCV
import cv2
# Importing numpy
import numpy as np
# Importing matplotlib.pyplot
import matplotlib.pyplot as plt
# Reading the image
img = cv2.imread(r'/Users/abhishake/Desktop/Sudoku_Puzzle_by_L2G-20050714_standardized_layout.svg.png')
# cv2.imread returns the channels in BGR order, so convert from BGR
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Displaying the original image
plt.imshow(img, cmap='gray')
plt.show()

Grayscale Image

The X gradients :

# X gradient Scharr operator
fieldx = cv2.Scharr(img, cv2.CV_32F, 1, 0) / 15.36
# Displaying output image
plt.imshow(fieldx, cmap='gray')
plt.show()        
X gradients

The Y gradients :

# Y gradient Scharr operator
fieldy = cv2.Scharr(img, cv2.CV_32F, 0, 1) / 15.36
# Displaying output image
plt.imshow(fieldy, cmap='gray')
plt.show()        
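Since each cv2.Scharr call covers only one direction, the X and Y results are usually combined into a single gradient magnitude afterwards. The sketch below shows that step on small stand-in arrays (fieldx_demo and fieldy_demo are invented values, not real filter output); with the code above, the same call would be applied to fieldx and fieldy.

```python
import numpy as np

# Stand-ins for the fieldx and fieldy arrays produced by cv2.Scharr
fieldx_demo = np.array([[3.0, 0.0], [-6.0, 0.0]], dtype=np.float32)
fieldy_demo = np.array([[4.0, 0.0], [8.0, 5.0]], dtype=np.float32)

# Per-pixel gradient magnitude: sqrt(fieldx**2 + fieldy**2)
magnitude = np.hypot(fieldx_demo, fieldy_demo)
print(magnitude)
```

The sign of each component is lost in the magnitude, so edges show up as bright regardless of whether the intensity rises or falls across them.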

Conclusion

In this article, we looked at what convolution filters and image gradients are and why they matter in image processing. We covered the theory of edge detection and the formulation of the Sobel and Scharr operators used to compute the gradient of an image. We also implemented both operators to gain a better understanding of how they work.

Cristina Bernardes Monteiro

Physicist, Research Scientist @ Physics Department - University of Coimbra

1 year ago

How do we get the values for the matrix elements of the outer rows and outer columns? How is the filter applied to those elements and what are the weights for each filter element in those cases?
