First-principles approach to feature/blob detection using SIFT
Krishna Yogi Kolluru
Data Scientist | ML Architect | GenAI | Sagemaker | Speaker | ex-Microsoft | IIT - NUS Alumni | AWS Certified ML / Data Engineer
Image detection, aka feature detection, is hard for computers. Very, very hard; one could even say impossibly hard, because a computer can only reach a probabilistic outcome, while humans somehow have an intuitive understanding of pictures (even though, if you dig deeper, a lot of that intuition involves heavy neural computation too).
To be fair, it's humans who label images and it's the computer that has to learn from those labels, so computers are always playing catch-up!
With this disclaimer in place, let's talk about SIFT (Scale-Invariant Feature Transform), a magical technique that can extract features from images, compare them, align them correctly if needed, and even stitch images together.
The reason it feels magical is that computers only speak binary (0 or 1) and simple maths (addition, subtraction, multiplication, division), and images, when stored on a computer, are simply numbers too. So computers somehow need to make sense of the numbers in these images and pick out features that are relevant to us humans (because those features are irrelevant to the computer itself :) ).
This is an area of research that took some of the best research scientists in the world more than two decades.
The goal is to compare similar features between two similar pictures that are not identical.
There could be a variation in scale, there could be rotation, there could be a difference in lighting, and so on.
When two images are fed to the SIFT algorithm, it first identifies what are called blobs, or areas of interest, at different scales (working across scales is what takes care of scale invariance). The second step is to assign each blob an orientation, i.e. a sense of direction/alignment, along with a scale factor. Only then can we compare these blobs/areas of interest across the two pictures and see which ones match; then, and only then, are we done with the feature matching / image comparison.
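To make that pipeline concrete before we dig into the math, here is a minimal sketch using OpenCV's built-in SIFT (cv2.SIFT_create, available in opencv-python 4.4+). The image file names are placeholders, and this is just one way to wire the steps together, not the from-first-principles construction discussed below.

```python
# Minimal end-to-end sketch: detect SIFT keypoints in two images and match them.
# Assumes opencv-python >= 4.4 (SIFT lives in the main cv2 module) and
# placeholder file names img1.jpg / img2.jpg.
import cv2

img1 = cv2.imread("img1.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("img2.jpg", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()

# Steps 1 and 2: find blobs/keypoints (with scale and orientation) plus descriptors.
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Step 3: compare descriptors; Lowe's ratio test keeps only distinctive matches.
matcher = cv2.BFMatcher()
good = []
for m, n in matcher.knnMatch(des1, des2, k=2):
    if m.distance < 0.75 * n.distance:
        good.append(m)

print(f"{len(good)} good matches out of {len(kp1)} / {len(kp2)} keypoints")
```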
Now let's review the math behind each of these steps.
First step: identifying blobs at various scales.
For this step, we smooth the entire image with a 2-D Gaussian filter and extract normalized Laplacian-of-Gaussian (LoG) responses across different sigmas (sigma being the standard deviation of the Gaussian). It so happens that these responses are very useful for identifying blobs at different scales: some blobs show up at 1 sigma, others at n sigma, and so on.
Gaussian filters have an interesting property: for the same blob appearing in two images at different scales, the sigma at which the normalized response peaks is proportional to the scale, which can be used to identify the same blob in both pictures. In this respect, Gaussian filters feel like a godsend :)
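To see that peak-at-the-right-sigma behaviour in action, here is a small sketch using SciPy's gaussian_laplace on a synthetic disc. The image size, sigma grid, and the radius-10 disc are arbitrary choices for illustration, not part of the original algorithm description.

```python
# Sketch: scale-normalized Laplacian-of-Gaussian (LoG) responses at a pixel,
# computed over a range of sigmas. The sigma where the response peaks tracks
# the blob's size, which is what makes scale identification possible.
import numpy as np
from scipy import ndimage

# Synthetic image: a bright disc of radius 10 on a dark background.
yy, xx = np.mgrid[0:128, 0:128]
img = ((xx - 64) ** 2 + (yy - 64) ** 2 < 10 ** 2).astype(float)

sigmas = np.arange(1, 20, 0.5)
responses = []
for sigma in sigmas:
    # Multiplying by sigma**2 "normalizes" the LoG so responses at different
    # scales are comparable; without it, larger sigmas always respond weaker.
    log = sigma ** 2 * ndimage.gaussian_laplace(img, sigma=sigma)
    responses.append(abs(log[64, 64]))  # response at the blob centre

best = sigmas[int(np.argmax(responses))]
print(f"Peak at sigma ≈ {best:.1f} (theory: radius / sqrt(2) ≈ {10 / np.sqrt(2):.1f})")
```

If you resize the image by a factor s, the peak moves to roughly s times the original sigma, which is exactly the proportionality mentioned above.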
The next step is identifying orientation:
To identify orientation (one picture could be tilted relative to the other), we employ edge detection techniques like Canny edge detection at each pixel of the picture. After identifying the edges and their gradient directions at the pixel level, we plot a histogram of directions (roughly 8 direction bins vs. pixel count). The direction with the largest count is taken as the orientation of the blob.
Note: even though this technique might not truly identify the orientation of a blob, it is good enough for comparing the orientations of two different pictures, assuming they depict the same content.
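Here is a rough sketch of that orientation vote, using plain NumPy gradients rather than a full Canny pass (the gradient step is the part we actually need here). The test patch, the 8-bin count, and the magnitude weighting are assumptions made for illustration.

```python
# Sketch of the orientation step: per-pixel gradient directions, binned into
# 8 coarse directions, with the dominant bin taken as the patch's orientation.
import numpy as np

def dominant_orientation(patch, n_bins=8):
    # Per-pixel gradients via simple finite differences.
    gy, gx = np.gradient(patch.astype(float))
    angles = np.arctan2(gy, gx)        # in (-pi, pi]
    magnitudes = np.hypot(gx, gy)

    # Histogram of directions, weighted by gradient strength so that flat
    # areas don't dominate the vote.
    hist, edges = np.histogram(angles, bins=n_bins, range=(-np.pi, np.pi),
                               weights=magnitudes)
    peak = np.argmax(hist)
    return 0.5 * (edges[peak] + edges[peak + 1]), hist  # bin centre + histogram

# Example: a linear ramp has a single gradient direction everywhere.
# True gradient direction is arctan(1/3) ≈ 18°, landing in the bin centred at 22.5°.
yy, xx = np.mgrid[0:32, 0:32]
theta, hist = dominant_orientation(3 * xx + yy)
print(f"Dominant direction bin ≈ {np.degrees(theta):.1f} degrees")
```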
The final step is 'similar blob identification' (after correcting for scale and orientation).
For correct blob identification (in an automatic fashion, without human intervention of course), we need to extract some sort of unique signature for each blob. It turns out we can reuse the histogram of directions (that we created previously) as that signature.
This time, however, we plot the direction histogram of the full blob (all 4 quadrants of it) in a single chart and treat the entire histogram as the blob's signature. Do note that this signature can only be compared after the two pictures have been aligned correctly (which is what the previous step gives us).
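A sketch of that signature-and-compare idea, under the same assumptions as the previous snippet: per-quadrant direction histograms concatenated into one vector and compared with a simple normalized distance. The patches, bin count, and L2 normalization are illustrative choices, not the exact SIFT descriptor.

```python
# Sketch: describe an aligned blob by the concatenated direction histograms of
# its 4 quadrants, then compare two blobs by the distance between signatures.
import numpy as np

def direction_histogram(patch, n_bins=8):
    gy, gx = np.gradient(patch.astype(float))
    angles = np.arctan2(gy, gx)
    hist, _ = np.histogram(angles, bins=n_bins, range=(-np.pi, np.pi),
                           weights=np.hypot(gx, gy))
    return hist

def blob_signature(patch, n_bins=8):
    h, w = patch.shape
    quadrants = [patch[:h // 2, :w // 2], patch[:h // 2, w // 2:],
                 patch[h // 2:, :w // 2], patch[h // 2:, w // 2:]]
    # One histogram per quadrant, concatenated into a single vector, then
    # L2-normalized so overall contrast/lighting matters less.
    sig = np.concatenate([direction_histogram(q, n_bins) for q in quadrants])
    return sig / (np.linalg.norm(sig) + 1e-12)

def signature_distance(blob_a, blob_b):
    return np.linalg.norm(blob_signature(blob_a) - blob_signature(blob_b))

# Example: same structure at different brightness matches; a rotated-gradient
# patch does not.
yy, xx = np.mgrid[0:32, 0:32]
horizontal_ramp = xx.astype(float)   # gradients point right
vertical_ramp = yy.astype(float)     # gradients point down
print(signature_distance(horizontal_ramp, 0.7 * horizontal_ramp + 5))  # ~0.0
print(signature_distance(horizontal_ramp, vertical_ramp))              # ~1.4
```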
So the SIFT algorithm has delivered what we intended: it helped us identify interesting features, then determine their orientation, and finally match the corresponding blobs across different pictures.
Thanks for reading!