Rotation Invariance in Neural Nets
A recent CVPR paper, "Strike (With) a Pose", elegantly and forcefully demonstrates the importance of invariance in deep learning models. Column 1 shows the classification accuracy of a recent CNN-based ImageNet classifier (Inception). One can see it does a great job. However, it performs very badly on images that look closely related to our eyes but are rotated. This is primarily due to the lack of rotation invariance (and other invariances) in modern CNNs. As we know from signal processing, convolution gives us translation invariance (strictly speaking, translation equivariance, which pooling turns into approximate invariance). Scale invariance (think of how far the object is from the camera) is partially handled by down-sampling.
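The translation/rotation asymmetry can be seen directly with a toy convolution. The sketch below (pure NumPy, illustrative only) checks that shifting the input shifts the output in lockstep, while rotating the input gives no corresponding guarantee:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Naive 'valid'-mode 2-D correlation, just to make the sliding window explicit."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

rng = np.random.default_rng(0)
img = rng.random((8, 8))
kernel = rng.random((3, 3))

# Translation equivariance: shifting the input shifts the output the same way,
# so the interior of the shifted output matches the original output shifted by 1.
out = conv2d_valid(img, kernel)
out_shifted = conv2d_valid(np.roll(img, 1, axis=1), kernel)
assert np.allclose(out_shifted[:, 1:], out[:, :-1])

# No analogous guarantee for rotation: rotating the input does not simply
# rotate the output (that would require rotating the kernel as well).
out_rot = conv2d_valid(np.rot90(img), kernel)
print(np.allclose(np.rot90(out), out_rot))
```

This is exactly why a learned filter bank responds consistently to a shifted object but not to a rotated one.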
The shocking fact reported by this study is how badly even a very modern CNN handles these transformations:
"We found a change in rotation as small as 8.02° can cause an object to be misclassified. Along the spatial dimensions, a translation resulting in the object moving as few as 2 pixels horizontally or 4.5 px vertically also caused the DNN to misclassify. Lastly, along the z-axis, a change in "size" (i.e., the area of the object's bounding box) of only 5.4% can cause an object to be misclassified."
Part of the issue lies in the data-augmentation pipeline used to train these models. Typically random cropping and mirroring are used, and perhaps some add randomized intensity sampling. Essentially none of them attempts any scaling or rotation. One might ask "why not?". The simple, sad fact is that they do not help if all you want is to win the ImageNet competition!
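For contrast, here is what an augmentation step with rotation and scale jitter could look like. This is a minimal NumPy-only sketch, not any particular framework's pipeline: the function names and parameter ranges are illustrative, 90-degree rotations stand in for arbitrary-angle rotation, and rescaling is done with nearest-neighbour indexing plus crop/pad:

```python
import numpy as np

def random_resize(img, scale, out_size):
    """Nearest-neighbour rescale, then crop or zero-pad back to out_size (toy version)."""
    h, w = img.shape
    nh, nw = max(1, int(h * scale)), max(1, int(w * scale))
    rows = np.arange(nh) * h // nh          # map output rows back to input rows
    cols = np.arange(nw) * w // nw
    resized = img[rows][:, cols]
    out = np.zeros((out_size, out_size), dtype=img.dtype)
    ch, cw = min(out_size, nh), min(out_size, nw)
    out[:ch, :cw] = resized[:ch, :cw]
    return out

def augment(img, rng):
    """Illustrative pipeline: the standard mirror, plus the rotation and scale
    jitter the post argues is usually missing."""
    if rng.random() < 0.5:
        img = img[:, ::-1]                   # horizontal mirror (standard)
    img = np.rot90(img, k=rng.integers(0, 4))  # rotation jitter (90-degree stand-in)
    img = random_resize(img, scale=rng.uniform(0.8, 1.2), out_size=img.shape[0])
    return img

rng = np.random.default_rng(0)
img = rng.random((32, 32))
aug = augment(img, rng)
print(aug.shape)  # (32, 32)
```

A real pipeline would use a library's interpolating rotation/resize ops, but the point is the same: the extra two lines of jitter are cheap; they just don't raise the leaderboard score.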
Some conscientious researchers are trying to tackle these problems. However, they are a distinct minority:
- "Learning rotation invariance in deep hierarchies using circular symmetric filters", Kohli et al., ICASSP
- "TI-POOLING: transformation-invariant pooling for feature learning in Convolutional Neural Networks", Laptev et al., arXiv:1604.06318, 2016
- "Oriented Response Networks", Zhou et al., CVPR 2017
We should support them because they are fighting a lonely war.
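To give a flavour of this line of work, the core idea of TI-pooling can be sketched in a few lines: run every transformed copy of the input through the same feature extractor and take an element-wise max, so the result is invariant to the chosen transform set by construction. Here a single linear-plus-ReLU map stands in for the paper's CNN; everything else is illustrative:

```python
import numpy as np

def features(img, w):
    """Stand-in feature extractor: one linear map + ReLU (not the paper's CNN)."""
    return np.maximum(0.0, w @ img.ravel())

def ti_pool_features(img, w, transforms):
    """TI-pooling idea (Laptev et al.): extract features from each transformed
    copy of the input and max-pool element-wise across copies."""
    return np.max([features(t(img), w) for t in transforms], axis=0)

# Transform set: the four 90-degree rotations.
rotations = [lambda x, k=k: np.rot90(x, k) for k in range(4)]

rng = np.random.default_rng(0)
img = rng.random((8, 8))
w = rng.random((16, 64))

f1 = ti_pool_features(img, w, rotations)
f2 = ti_pool_features(np.rot90(img), w, rotations)
assert np.allclose(f1, f2)  # invariant to 90-degree rotations by construction
```

Rotating the input only permutes which copy wins the max, so the pooled features are unchanged; the trade-off is one forward pass per transform.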
Comments:

If we had truly, perfectly explainable AI, this would be much more obvious, and we could move on to different network model types that don't have these serious issues.
More courses need to cover data augmentation, but it's not really sexy. It's a bit scary that many of these models could be deployed by people who are not aware that such an issue could arise. Thanks for sharing this :)
See also failures in quantum chemistry around locally spherical grids, where assumed levels of rotational invariance are massively violated: https://cen.acs.org/physical-chemistry/computational-chemistry/Density-functional-theory-error-discovered/97/web/2019/07
It is not the guideline as such. It is because the rules (objective) of ImageNet are such that the score will not be improved by those augmentations, which points out the artificial nature of this sort of competition. This also applies to most Kaggle contests.