Teaching computers how to see like humans with Convolutional Neural Networks and Deep Learning

Google’s vast index contained more than a trillion images some years ago, and that number can only keep growing given the ever-increasing use of the internet and of pictures to convey a message. And that was just the images indexed by Google; a significant portion of the images on the internet aren’t indexed at all. The same holds for videos: YouTube alone hosts over a billion videos, a massive amount of data in video form.

Videos and images are not found only on the internet. Security cameras record their surroundings around the clock, and pictures and videos are constantly being shot for personal and business use. These, too, account for a hefty amount of data.

This vast amount of data is far beyond the scope of manual human review; it has to be processed through automation. Automation, machine learning, and computer vision are techniques that give us a real chance to decipher this data and make sense of it. Digital image processing is the foundation here: it uses computational methods to analyze and distinguish the content of images and videos.

Digital image processing makes this mass of images and videos far more amenable to analysis. Law enforcement agencies are a widespread user: with the help of image processing, they compare a suspect’s facial features against their own databases and against video feeds coming in from around the country. The fingerprint software used by the police and other agencies also relies on image processing.

Neural Networks:

Computers and machines do not perceive and interpret images and videos the way humans and animals do. Our minds feature insight learning, which lets us make sense of visual input far faster and more accurately than a computer can. We can tell whether a given object is a cat, a dog, or a chair regardless of its color or variety. A machine takes longer to produce this result, because it first builds a digital interpretation of the image and then compares it with the images and objects stored in its database.

An advanced branch of machine learning has given rise to neural networks, a method now widely used in image processing. It lends true learning capabilities to the machine, bringing it closer to the workings of the human brain. The brain receives information from the sensory organs, analyzes it, and generates a response. Similarly, a neural network has an input layer, one or more hidden layers, and an output layer. Information is received at the input layer, the hidden layers analyze it and add value to it, and the output layer produces the suggested action.
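That input → hidden → output flow can be sketched in a few lines of NumPy. The weights below are random placeholders for illustration only; a real network learns them from data:

```python
import numpy as np

def sigmoid(x):
    # squashes any value into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)

x = np.array([0.5, -0.2, 0.1])   # input layer: 3 features
W1 = rng.normal(size=(3, 4))     # weights: input -> hidden (4 units)
W2 = rng.normal(size=(4, 2))     # weights: hidden -> output (2 classes)

hidden = sigmoid(x @ W1)         # hidden layer transforms the input
output = sigmoid(hidden @ W2)    # output layer produces one score per class

print(output.shape)              # (2,)
```

In a trained network, the larger of the two output scores would indicate the suggested action or label.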

Convolutional Neural Networks:

A convolutional neural network is a modified version of the plain neural network. Intuitively, it should gather as many inputs as possible, meaning we would capture most if not all of the pixels in an image and process them for further analysis. Makes sense, right?

Well, not really. Believe it or not, even with the large computers and multi-core CPUs and GPUs at our disposal these days, this is not practical because of constraints on time and processing power.

Instead, a convolutional neural network decreases the sample size used in the analysis. An image usually contains hundreds of thousands or even millions of pixels, and with current computing capacity it is not viable to analyze each and every one. It is also a fact that pixels close to each other tend to be very similar, with the similarity fading as the distance increases. So the sample size is reduced by grouping pixels: one representative pixel is selected from each group and collected with the other representatives. These pixels can then be analyzed with ease, since they are few in number and can be computed efficiently.
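Picking one representative pixel per group is essentially what a pooling layer does. Here is a minimal NumPy sketch using 2×2 max pooling, with an invented 4×4 image; each non-overlapping 2×2 block of pixels is replaced by its largest value:

```python
import numpy as np

def max_pool_2x2(img):
    # reshape the image into non-overlapping 2x2 blocks,
    # then keep only the maximum value of each block
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

img = np.array([[1, 3, 2, 0],
                [4, 2, 1, 1],
                [5, 6, 7, 8],
                [0, 1, 2, 9]])

pooled = max_pool_2x2(img)
print(pooled)
# [[4 2]
#  [6 9]]
```

A 4×4 image becomes 2×2: a quarter of the data, yet each region still has a representative.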

Steps:

  1. The image is divided into smaller parts in the form of tiles.
  2. These tiles are fed into a neural network, which converts them into numerical arrays.
  3. The arrays represent each area of the picture numerically. They carry three dimensions: the color channel, the height, and the width. A fourth dimension, time, is added if the input is in fact a video.
  4. The multidimensional arrays are then passed through a downsampling function, which removes unnecessary and redundant information so that only the required amount is analyzed.
  5. This reduced data is then forwarded to a conventional neural network.
  6. The computations complete quickly, and the desired output is generated in the form of labels.
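Steps 1 through 4 can be sketched in plain NumPy. The image and filter values below are invented for illustration; a real network learns its filters from data:

```python
import numpy as np

def convolve2d(img, kernel):
    """Slide a small filter over the image (steps 1-3: tiles -> numeric arrays)."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # each output value summarizes one small tile of the image
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def downsample(feat):
    """Step 4: keep the max of each 2x2 block, discarding redundant detail."""
    h, w = feat.shape
    trimmed = feat[:h - h % 2, :w - w % 2]          # drop odd edge rows/cols
    return trimmed.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

img = np.arange(36, dtype=float).reshape(6, 6)      # stand-in for a grayscale image
edge_filter = np.array([[1.0, -1.0],
                        [1.0, -1.0]])               # crude vertical-edge detector

features = convolve2d(img, edge_filter)             # convolution layer: 6x6 -> 5x5
reduced = downsample(features)                      # downsampling layer: 5x5 -> 2x2
print(reduced.shape)                                # (2, 2)
```

The reduced output (step 5) would then feed a conventional neural network like the one sketched earlier, which emits labels (step 6).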

The first three steps are also termed convolution, which is where the name of this neural network comes from. A real system contains many convolution and downsampling layers, all working together to reduce the sample to a manageable size.

Why is it better?

A convolutional neural network beats conventional image processing methods for a variety of reasons. Conventional methods include converting images to grayscale and then comparing pixels, comparing pixels one by one, Scale-Invariant Feature Transform (SIFT), Binary Robust Independent Elementary Features (BRIEF), and Speeded-Up Robust Features (SURF). These methods are either dying out entirely or are inefficient compared to convolutional neural networks, which use less time and fewer resources. SIFT and SURF are somewhat comparable to a convolutional neural network, but they face problems with accuracy: they rely on differences of Gaussians across scales to detect an object, and in simplifying the computation they often discard data that is essential for identifying it. Convolutional neural networks also simplify the image, but they make sure that no important detail is lost in the process.

Convolutional neural networks are now in use around the world. Neural networks are inspired by and derived from the human brain, so when used for image processing they give the computer something like human vision. It is for this reason that almost all face detection software now relies on convolutional neural networks for accuracy. They also use so few resources that even smartphones employ them for face detection.

This technology is relatively new, yet it already has clear advantages over traditional image recognition. It is being used for law enforcement, security, health care, and much more. Only time will tell how much more we can benefit from it.

About

Manuj Aggarwal is an entrepreneur, investor and a technology enthusiast who likes startups, business ideas, and high-tech anything. He enjoys working on hard problems and getting his hands dirty with cutting-edge technologies. In a career spanning two decades, he has been a business owner, technical architect, CTO, coder, startup consultant, and more.
