
Image Processing: Convolution filters and Calculation of image gradients

Abhishake Yadav

Using data analysis to make decisions, an analytical approach to business leadership

Published: January 18, 2023

Convolution filters are a fundamental building block in image processing and computer vision. They are used to extract specific features from an image by applying a small matrix of numbers, called the kernel or filter, to the image. The result of the convolution is a new image, where each pixel is a weighted sum of the pixels in the original image. The weights are determined by the values in the kernel.

A convolution filter can be thought of as a sliding window that moves across the image, applying the kernel to small regions of the image. The convolution operation is performed by element-wise multiplying the kernel with the region of the image that is currently under the window, and then summing the results. This process is repeated for every location in the image, resulting in a new image where each pixel is the sum of the products of the kernel values and the corresponding pixel values in the original image.
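The sliding-window description above can be sketched in a few lines of NumPy. This is a minimal illustration, not a library implementation: the helper name `apply_kernel` and the 4x4 test array are assumptions made for the example, and no padding is applied, so the output is smaller than the input. As in the description, the kernel is applied without flipping it, which is strictly a cross-correlation; for symmetric kernels the two operations give the same result.

```python
import numpy as np


def apply_kernel(image, kernel):
    """Slide `kernel` over `image` and return the weighted sums (no padding)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    output = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            window = image[i:i + kh, j:j + kw]      # region under the window
            output[i, j] = np.sum(window * kernel)  # element-wise multiply, then sum
    return output


# Hypothetical 4x4 "image" with values 0..15 and a 3x3 averaging kernel
image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.ones((3, 3)) / 9.0
smoothed = apply_kernel(image, kernel)
print(smoothed)
```

Each output value here is the mean of a 3x3 neighbourhood, so a 4x4 input yields a 2x2 output.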

Here is an example of a convolution filter in Python using the popular image processing library OpenCV:

import numpy as np
import cv2


# Load the image
image = cv2.imread("good_boy.jpeg")


# Define the kernel
kernel = np.array([[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]])


# Apply the convolution filter
result = cv2.filter2D(image, -1, kernel)


# Display the original and filtered images
cv2.imshow("Original Image", image)
cv2.imshow("Filtered Image", result)
cv2.waitKey(0)
cv2.destroyAllWindows()

In the above example, the convolution filter is applied to the image using the cv2.filter2D() function. The first argument is the image, the second argument is the depth of the output image (-1 means the same depth as the input image), and the third argument is the kernel.

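The kernel defined in the example is a simple edge detection filter: a discrete Laplacian kernel, which detects edges by approximating the second derivative of the image. It is related to, but not the same as, the "Laplacian of Gaussian" (LoG) filter, which smooths the image with a Gaussian before taking the Laplacian.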
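One way to see why this kernel highlights edges is to note that its weights sum to zero, so a region of uniform intensity produces no response, while a pixel that differs from its neighbours produces a large one. The two 3x3 patches below are made-up values chosen only to illustrate this.

```python
import numpy as np

kernel = np.array([[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]])

# Hypothetical patches: one uniform, one with a brighter centre pixel
flat_patch = np.full((3, 3), 50)
spot_patch = np.full((3, 3), 50)
spot_patch[1, 1] = 60

flat_response = int(np.sum(kernel * flat_patch))  # uniform region
spot_response = int(np.sum(kernel * spot_patch))  # centre differs from neighbours
print(int(kernel.sum()), flat_response, spot_response)  # prints: 0 0 80
```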

Below we take a look at the result of applying this filter:

[Image: before applying the edge detection filter]

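There are many types of image filters, such as edge detection filters (Sobel, Scharr, Canny), blur filters (Gaussian, Median, Bilateral), and sharpen filters (Laplacian, Unsharp Mask), each with a specific purpose. Not all of them are a single convolution with a fixed kernel: Canny is a multi-stage algorithm built on gradient filters, and the median and bilateral filters are non-linear.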

Convolution filters are a powerful technique for extracting features from images and are widely used in various applications such as image enhancement, object detection, and image recognition.

As described above, we can broadly divide the main filters into three categories:

Edge detection filters (Sobel, Scharr, Canny)
Blur filters (Gaussian, Median, Bilateral)
Sharpen filters (Laplacian, Unsharp Mask)

In this article we will mainly focus on edge detection.

Theory of edge detection:

1. An edge in an image is a sudden change in pixel intensity. This change can be described by derivatives: the larger the magnitude of the gradient, the sharper the change in intensity.

2. In a 1-dimensional image, the edge can be represented by a jump in the intensity value.

3. The jump in the intensity value can be observed more carefully by calculating the first derivative.

4. The edge detection method works on the principle of identifying pixels in the image with a higher gradient value than their neighbours.
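The four points above can be illustrated on a one-dimensional "image". In this sketch the signal values are invented for the example, and `np.diff` stands in for the first derivative as a simple forward difference.

```python
import numpy as np

# Hypothetical 1-D intensity profile with a single jump from 10 to 80
signal = np.array([10, 10, 10, 10, 80, 80, 80, 80], dtype=float)

# Forward difference: derivative[k] = signal[k + 1] - signal[k]
derivative = np.diff(signal)

# The edge sits where the absolute derivative is largest
edge_index = int(np.argmax(np.abs(derivative)))
print(derivative)  # zero everywhere except at the jump
print(edge_index)  # 3, i.e. the jump lies between samples 3 and 4
```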

In this article we will discuss the Sobel and Scharr filters:

1. Sobel filters: Sobel filters are used to detect edges in an image. They are based on the Sobel operator, a discrete differentiation operator that approximates the gradient of the image intensity. The Sobel operator uses two 3x3 kernels: one approximates the derivative in the horizontal (x) direction and therefore responds most strongly to vertical edges, and the other approximates the derivative in the vertical (y) direction and responds most strongly to horizontal edges. The kernels are typically defined as follows:

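# Sobel kernel for the horizontal (x) derivative; responds to vertical edges
kernel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])


# Sobel kernel for the vertical (y) derivative; responds to horizontal edges
kernel_y = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]])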

To apply the Sobel filter to an image, the image is convolved with each of these kernels. In its simplest form, the convolution is implemented as a nested for loop that slides the kernel over the image, multiplies the kernel values with the corresponding pixel values, and sums the products. Each kernel produces a 2D array holding one component of the gradient (gx or gy) at every pixel; the gradient magnitude at a pixel is then obtained by combining the two components as sqrt(gx^2 + gy^2).
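As a small worked example of the two kernels, the sketch below evaluates them at a single pixel of a synthetic 5x5 image that is dark on the left and bright on the right. The image and the chosen pixel are assumptions made for illustration.

```python
import numpy as np

kernel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])
kernel_y = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]])

# Hypothetical test image: columns 0-2 are 0, columns 3-4 are 100 (a vertical edge)
image = np.zeros((5, 5))
image[:, 3:] = 100.0

# 3x3 neighbourhood centred on pixel (row 2, column 2), right next to the edge
window = image[1:4, 1:4]
gx = float(np.sum(kernel_x * window))  # change along x
gy = float(np.sum(kernel_y * window))  # change along y
magnitude = float(np.hypot(gx, gy))
print(gx, gy, magnitude)  # prints: 400.0 0.0 400.0
```

Only the x kernel responds, because the intensity changes from left to right and is constant from top to bottom.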

Here is some sample Python code that demonstrates how to use the Sobel filter to detect edges in an image:

from matplotlib.image import imread
import matplotlib.pyplot as plt
import numpy as np


# #---------------------------------------------------------------------------------------------------------------------
# PART I - Transforming an image from color to grayscale
# #---------------------------------------------------------------------------------------------------------------------


# Here we import the image file as an array of shape (nx, ny, nz)
image_file = '/Users/abhishake/Desktop/good_boy.jpeg'
input_image = imread(image_file)  # this is the array representation of the input image
[nx, ny, nz] = np.shape(input_image)  # nx: height, ny: width, nz: colors (RGB)


# Extracting each one of the RGB components
r_img, g_img, b_img = input_image[:, :, 0], input_image[:, :, 1], input_image[:, :, 2]


# The following operation will take weights and parameters to convert the color image to grayscale
gamma = 1.400  # a parameter
r_const, g_const, b_const = 0.2126, 0.7152, 0.0722  # weights for the RGB components respectively
grayscale_image = r_const * r_img ** gamma + g_const * g_img ** gamma + b_const * b_img ** gamma


# This command will display the grayscale image alongside the original image
fig1 = plt.figure(1)
ax1, ax2 = fig1.add_subplot(121), fig1.add_subplot(122)
ax1.imshow(input_image)
ax2.imshow(grayscale_image, cmap=plt.get_cmap('gray'))
fig1.show()
plt.show()

# #---------------------------------------------------------------------------------------------------------------------
# PART II - Applying the Sobel operator
# #---------------------------------------------------------------------------------------------------------------------


"""
The kernels Gx and Gy can be thought of as a differential operation in the "input_image" array in the directions x and y 
respectively. These kernels are represented by the following matrices:
      _               _                   _                _
     |                 |                 |                  |
     | 1.0   0.0  -1.0 |                 |  1.0   2.0   1.0 |
Gx = | 2.0   0.0  -2.0 |    and     Gy = |  0.0   0.0   0.0 |
     | 1.0   0.0  -1.0 |                 | -1.0  -2.0  -1.0 |
     |_               _|                 |_                _|
"""


# Here we define the matrices associated with the Sobel filter
Gx = np.array([[1.0, 0.0, -1.0], [2.0, 0.0, -2.0], [1.0, 0.0, -1.0]])
Gy = np.array([[1.0, 2.0, 1.0], [0.0, 0.0, 0.0], [-1.0, -2.0, -1.0]])
[rows, columns] = np.shape(grayscale_image)  # we need to know the shape of the input grayscale image
sobel_filtered_image = np.zeros(shape=(rows, columns))  # initialization of the output image array (all elements are 0)


# Now we "sweep" the image in both x and y directions and compute the output
for i in range(rows - 2):
    for j in range(columns - 2):
        gx = np.sum(np.multiply(Gx, grayscale_image[i:i + 3, j:j + 3]))  # x direction
        gy = np.sum(np.multiply(Gy, grayscale_image[i:i + 3, j:j + 3]))  # y direction
        sobel_filtered_image[i + 1, j + 1] = np.sqrt(gx ** 2 + gy ** 2)  # calculate the "hypotenuse"


# Display the original image and the Sobel filtered image
fig2 = plt.figure(2)
ax1, ax2 = fig2.add_subplot(121), fig2.add_subplot(122)
ax1.imshow(input_image)
ax2.imshow(sobel_filtered_image, cmap=plt.get_cmap('gray'))
fig2.show()
plt.show()

This Python code performs two main tasks:

1. Transforming an image from color to grayscale:

It imports an image file and converts it into an array representation.
Extracts the Red, Green, and Blue (RGB) components of the image.
Applies weights and a parameter to the RGB components to convert the color image to grayscale.
Displays the original image and the grayscale image side by side.

2. Applying the Sobel operator:

Defines two matrices, Gx and Gy, associated with the Sobel filter.
Initializes an output image array with all elements set to 0.
"Sweeps" the grayscale image in both the x and y directions and applies the Sobel operator to compute the output.
Displays the original image and the Sobel-filtered image side by side.
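Note that the loop in the listing only writes to interior pixels, so the outermost rows and columns of the output stay at zero. One common way to obtain values there as well is to pad the image before filtering. The sketch below is one possible variant under that assumption, not the method used in the listing: the function name `sobel_magnitude` and the choice of replicating the border pixels (`mode="edge"`) are illustrative.

```python
import numpy as np


def sobel_magnitude(gray):
    """Sobel gradient magnitude with the same shape as `gray`.

    The border is handled by replicating the outermost pixels (an assumption;
    other padding modes such as reflection are equally valid choices).
    """
    Gx = np.array([[1.0, 0.0, -1.0], [2.0, 0.0, -2.0], [1.0, 0.0, -1.0]])
    Gy = np.array([[1.0, 2.0, 1.0], [0.0, 0.0, 0.0], [-1.0, -2.0, -1.0]])
    padded = np.pad(gray.astype(float), 1, mode="edge")
    rows, columns = gray.shape
    output = np.zeros((rows, columns))
    for i in range(rows):
        for j in range(columns):
            window = padded[i:i + 3, j:j + 3]  # neighbourhood centred on (i, j)
            gx = np.sum(Gx * window)
            gy = np.sum(Gy * window)
            output[i, j] = np.hypot(gx, gy)
    return output


# Hypothetical 4x6 image with a vertical step between columns 2 and 3
step = np.zeros((4, 6))
step[:, 3:] = 10.0
edges = sobel_magnitude(step)
print(edges)
```

With this padding the edge is detected in every row, including the first and last, and the output has the same shape as the input.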

2. Scharr Filters: The Scharr operator is used as a method to identify and highlight gradient edges or features of an image using the 1st derivative. It is commonly used to identify gradients along the x-axis (dx = 1, dy = 0) and y-axis (dx = 0, dy = 1). The performance of the Scharr operator is quite similar to the Sobel operator.

The Scharr operator is a refinement of the Sobel operator, and the two detect edges on the same principle. It increases the difference between pixel values by using larger weight coefficients in the kernel.

The Scharr gradient for an image can be calculated in different directions depending on the value of the dx and dy parameters:

1. X-direction Scharr derivative: The Scharr derivative is computed in the X-direction by setting the value for the x derivative as 1 and the value for y derivative as 0. Therefore, dx will be equal to 1 and dy will be equal to 0.

2. Y-direction Scharr derivative: The Scharr derivative is computed in the Y-direction by setting the value for the x derivative as 0 and the value for y derivative as 1. Therefore, dx will be equal to 0 and dy will be equal to 1.

3. X and Y direction Scharr derivative: The Scharr derivative cannot be computed for both X and Y directions in a single cv2.Scharr call; the two directions are computed separately.

# Import OpenCV, NumPy, and matplotlib
import cv2
import numpy as np
import matplotlib.pyplot as plt

# Read the image (cv2.imread returns the channels in BGR order)
img = cv2.imread(r'/Users/abhishake/Desktop/Sudoku_Puzzle_by_L2G-20050714_standardized_layout.svg.png')
# Convert to grayscale; COLOR_BGR2GRAY matches the BGR order returned by cv2.imread
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Display the original image
plt.imshow(img, cmap='gray')
plt.show()

The X gradients:

# X gradient Scharr operator
fieldx = cv2.Scharr(img, cv2.CV_32F, 1, 0) / 15.36
# Displaying output image
plt.imshow(fieldx, cmap='gray')
plt.show()

The Y gradients:

# Y gradient Scharr operator
fieldy = cv2.Scharr(img, cv2.CV_32F, 0, 1) / 15.36
# Displaying output image
plt.imshow(fieldy, cmap='gray')
plt.show()
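Since the X and Y gradients are computed separately, they are usually combined afterwards into a single gradient magnitude. The sketch below shows the idea in plain NumPy with the standard 3x3 Scharr kernels written out explicitly, so that it runs without an image file. The helper `filter_valid` and the synthetic test image are assumptions for illustration, and the output is not scaled the way the cv2 listings above scale theirs.

```python
import numpy as np

# Standard 3x3 Scharr kernels for the x and y derivatives
scharr_x = np.array([[-3, 0, 3], [-10, 0, 10], [-3, 0, 3]], dtype=float)
scharr_y = scharr_x.T


def filter_valid(image, kernel):
    """Apply a 3x3 kernel at every interior position (no padding)."""
    rows, columns = image.shape
    output = np.zeros((rows - 2, columns - 2))
    for i in range(rows - 2):
        for j in range(columns - 2):
            output[i, j] = np.sum(kernel * image[i:i + 3, j:j + 3])
    return output


# Hypothetical 5x5 image: columns 0-2 are 0, columns 3-4 are 1 (a vertical edge)
image = np.zeros((5, 5))
image[:, 3:] = 1.0

grad_x = filter_valid(image, scharr_x)
grad_y = filter_valid(image, scharr_y)
magnitude = np.hypot(grad_x, grad_y)  # sqrt(grad_x**2 + grad_y**2) per pixel
print(magnitude)
```

For this image only the x gradient is non-zero, so the magnitude equals the absolute x response (16 next to the edge, 0 elsewhere).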

Conclusion

In this article, we looked at what convolution filters and image gradients are and why they matter in image processing. We covered the theory of edge detection and the formulation of the Sobel and Scharr operators used to compute the gradient of an image, and we implemented both to see how they behave in practice.

Cristina Bernardes Monteiro

Physicist, Research Scientist @ Physics Department - University of Coimbra

1 year ago

How do we get the values for the matrix elements of the outer rows and outer columns? How is the filter applied to those elements and what are the weights for each filter element in those cases?
