Computer Vision & Data Annotation - An easy way of understanding the relevance in the real world

Rafael Oliveira

Principal Solutions Architect at iMerit Technology | Product Manager | Autonomous Mobility | Computer Vision | GenAI

Published: November 7, 2021

Here you will find a brief explanation of computer vision, some use cases we already experience in real life, and some of the data annotation techniques that support its advance. I want to highlight upfront that I'm not covering any computer vision algorithms in this post. My main goal is to share knowledge with people who have little or no understanding of CV and data annotation.

What is computer vision?

Computer vision is a field of Artificial Intelligence that enables computers to understand images, videos, and other digital assets. By understanding that visual content, computers can make decisions based on a set of predetermined rules.

Computer vision has been and will likely be one of the hottest fields in AI for the next few years. Many of the recent advances in healthcare, security, retail, agriculture, and many other fields are tied to how computers are gaining a higher level of understanding of digital images and videos.

That is cool, right? How does it translate to the real world?

Some good, simple examples of computer vision that we experience today, or will experience in a few years, are:

Autonomous vehicles

Autonomous driving, one of the fields with the heaviest investment in recent years, is a huge user of computer vision techniques. When you see a Tesla, or in the future many other autonomous vehicles, driving around, what you are seeing is, among other beautiful technologies, a sharp computer vision system detecting, classifying, and processing thousands of images in real time and sending the results to the main system for a decision and action.

If you want to dig deep into autonomous vehicles technologies and how computer vision impacts them, I recommend checking here

Motion Controlled Games

Have you ever played Kinect, Nintendo Wii, or any other motion-controlled game?

The camera on those devices tracks movement in 3D space, and the computer vision system processes it, recognizing human joints, positioning you in the equivalent place in the game, and replicating every move in real time.

Healthcare

Advances in computer vision have helped doctors analyze and diagnose conditions. Medical software that processes and classifies static imagery has had a particular impact on radiology, pathology, and other specialties.

Among many other things, computer vision helps medical systems classify images, for example by identifying whether the image in question shows cancer or a false positive.

You can read more details about computer vision in healthcare here

Many other cases are out there; most enterprises are using, or will be using, computer vision in the coming years. A lot of progress has been made in the past years, a lot is yet to come, and the impact on the real world will be huge!

Data Annotation

OK, we saw a few use cases of computer vision, but how does it happen? Well, there are a lot of techniques involved, and in general those systems are anything but simple. That said, I won’t go through the machine learning algorithms, frameworks, and other technical pieces. Let's take the first step and understand how these systems get the inputs they need to start learning and “seeing” things as they are supposed to.

That is where data annotation comes into play. What is it?

Well, remember all the computer vision systems we spoke about before? They all need a good amount of data containing the patterns they need to identify, along with the correct and incorrect answers for the situations they will face, like the cancer-or-not example in a medical image.

Data annotation is the process of collecting raw data and identifying and labeling the objects in it, to give it meaning and make it recognizable to a machine learning algorithm. In simpler words, it provides these algorithms with situations similar to the ones they will face in real life, together with the correct answer for each of them. Nowadays you will find automated data annotation methods, as well as companies such as iMerit that provide, among other services, a data annotation workforce with expertise in many industries.

Let’s take a look at the most common techniques and some examples.

Image classification

This technique consists of analyzing a set of images and assigning each one to a category. The set of categories is predetermined and changes according to the system you are building.

Imagine you are developing a system that reads images of animals and, first of all, identifies which animal is in each picture. Your categories would be cat, dog, horse, lion, etc. Sounds very simple, right? Yes, but it also has its challenges.

How would you classify a picture that has no animals, or more than one, or that is so pixelated you can’t identify the animal, or that shows only half an animal? Those are all edge cases that the annotator faces, and they need to be defined up front so that the right data is provided. Usually whoever is developing the system will be able to help and decide what action to take in each case.
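To make the edge cases above concrete, here is a minimal sketch of how a classification label set might be written down. The label names, file names, and the `needs_review` helper are all hypothetical; a real project would use whatever categories and rules the system's developers agree on.

```python
from dataclasses import dataclass

# Hypothetical label set: the animal categories plus explicit edge-case labels.
ANIMAL_LABELS = {"cat", "dog", "horse", "lion"}
EDGE_CASE_LABELS = {"no_animal", "multiple_animals", "unreadable", "partial_animal"}
ALLOWED_LABELS = ANIMAL_LABELS | EDGE_CASE_LABELS


@dataclass
class ClassificationAnnotation:
    image_file: str  # hypothetical file name
    label: str       # exactly one label per image

    def __post_init__(self):
        # Reject anything outside the agreed label set so the dataset stays consistent.
        if self.label not in ALLOWED_LABELS:
            raise ValueError(f"unknown label: {self.label!r}")


def needs_review(annotation):
    """Return True when the image was tagged with one of the edge-case labels."""
    return annotation.label in EDGE_CASE_LABELS


annotations = [
    ClassificationAnnotation("img_001.jpg", "cat"),
    ClassificationAnnotation("img_002.jpg", "multiple_animals"),
    ClassificationAnnotation("img_003.jpg", "unreadable"),
]

# Collect the images that hit an edge case and may need a decision from the developers.
flagged = [a.image_file for a in annotations if needs_review(a)]
print(flagged)
```

Giving each edge case its own label, instead of forcing the annotator to guess an animal, is one possible way to keep ambiguous images visible to whoever is building the system.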

Object detection

As the name suggests, this technique identifies specific, and often multiple, objects in an image, video, etc. Once an object is identified, the annotator draws a boundary around it, usually a bounding box (as in the example in the image below) or a polygon. Finally, that area is labeled accordingly.

Face recognition and video surveillance systems heavily rely on object detection.
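As a rough illustration of what an object detection annotation can look like as data, the sketch below stores a labeled bounding box and derives a box from a polygon. The `BoundingBox` class, the `box_from_polygon` helper, and the pixel coordinates are assumptions made for this example, not the format of any particular annotation tool.

```python
from dataclasses import dataclass


@dataclass
class BoundingBox:
    label: str
    x_min: float  # left edge, in pixels
    y_min: float  # top edge, in pixels
    x_max: float  # right edge, in pixels
    y_max: float  # bottom edge, in pixels

    def area(self):
        """Area of the box in square pixels."""
        return (self.x_max - self.x_min) * (self.y_max - self.y_min)


def box_from_polygon(label, points):
    """Smallest axis-aligned box enclosing a polygon given as (x, y) points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return BoundingBox(label, min(xs), min(ys), max(xs), max(ys))


# One object annotated directly with a box, another annotated with a polygon.
car = BoundingBox("car", 40, 60, 200, 150)
person = box_from_polygon("person", [(10, 20), (30, 15), (35, 80), (12, 85)])

print(car.area())
print(person)
```

A polygon follows the outline of an object more closely than a box, which is why the enclosing box computed here is always at least as large as the polygon itself.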

Semantic segmentation

This is probably one of the hottest techniques, especially because of its wide use in autonomous vehicles and healthcare.

Like object detection, this technique delimits objects in a scene. However, instead of relying on a square, rectangle, or polygon, it classifies each pixel. It is easy to see that this technique requires much more time and effort from an annotator to define the boundaries.

With semantic segmentation, all pixels that belong to the same category receive the same pixel value during annotation. As you can see in the example below, data annotation tools use different colors to make the annotation process easier to carry out and understand.

Annotating with semantic segmentation is complex because a scene will often contain hundreds of objects and categories, and some systems require very high precision when defining the boundaries of each object.
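A semantic segmentation annotation can be pictured as a grid with one class value per pixel. The sketch below uses a tiny made-up 4x6 "image"; the class IDs, names, and colors are assumptions chosen only for illustration.

```python
from collections import Counter

# Hypothetical class IDs and the display colors (RGB) a tool might use for them.
CLASS_NAMES = {0: "background", 1: "road", 2: "car"}
CLASS_COLORS = {0: (0, 0, 0), 1: (128, 64, 128), 2: (0, 0, 142)}

# Each cell holds the class ID of that pixel. Both cars share the value 2.
semantic_mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 2, 2, 0, 2, 2],
    [1, 2, 2, 1, 2, 2],
    [1, 1, 1, 1, 1, 1],
]


def pixels_per_class(mask):
    """Count how many pixels were assigned to each category."""
    counts = Counter(pixel for row in mask for pixel in row)
    return {CLASS_NAMES[class_id]: n for class_id, n in counts.items()}


def colorize(mask):
    """Replace each class ID with its display color, as an annotation tool would."""
    return [[CLASS_COLORS[pixel] for pixel in row] for row in mask]


counts = pixels_per_class(semantic_mask)
print(counts)
print(colorize(semantic_mask)[1][1])
```

Because every car pixel carries the same value, this mask tells you where cars are but not how many there are.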

Instance segmentation

This method is often confused with semantic segmentation, because here too the boundaries and objects are mapped at the pixel level. The difference is the value you assign to each pixel.

Remember I said that in semantic segmentation every pixel of the same category receives the same value, and we saw the example with multiple cars sharing the same color and label? Here, every single object receives a different value, even if the objects belong to the same category. This technique is useful if, for example, your system needs to count the number of objects in a scene or image.
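The counting use case can be sketched with a small made-up mask in which each object has its own ID. The mask values, the `instance_category` mapping, and the `count_objects` helper are hypothetical and only meant to show the idea.

```python
from collections import Counter

# Each cell holds the ID of the object that pixel belongs to (0 = no object).
# The two cars belong to the same category but receive different values.
instance_mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 2, 2],
    [0, 1, 1, 0, 2, 2],
    [0, 0, 0, 0, 0, 0],
]

# Hypothetical mapping from instance ID to its category.
instance_category = {1: "car", 2: "car"}


def count_objects(mask, categories):
    """Count distinct objects per category from an instance mask."""
    instance_ids = {pixel for row in mask for pixel in row if pixel != 0}
    return Counter(categories[instance_id] for instance_id in instance_ids)


object_counts = count_objects(instance_mask, instance_category)
print(object_counts["car"])
```

With only one shared value per category, the same count would require extra work to separate touching objects; distinct per-object values make it a simple lookup.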

Remember, I listed only a few techniques to give you some understanding; many other methods exist, such as object tracking, pattern detection, LiDAR annotation, and others.

Here we finish. I hope this gives you a good overview of computer vision, data annotation, and some of the best-known techniques.

About the author

I’m a Solutions Architect at iMerit, helping clients in different industries develop their solutions using data annotation techniques. I hold a Bachelor’s degree in Computer Science from the Universidade Estadual Paulista (UNESP) and a master’s degree in IT Management from Fundacao Getulio Vargas (FGV), both in Sao Paulo, Brazil, and an MBA from Hult International Business School, San Francisco.

https://www.dhirubhai.net/in/rcdoliveira/

My interview with the AI Time Journal - How to build a successful career in Data Science

About iMerit

iMerit unites the technologies, processes, and people that, together, deliver accurate and nuanced annotations that companies depend on to get to production. Our data annotation fuels your journey from exploratory R&D to proof of concept to mission-critical, production-ready solutions. We recognize that your data training process is iterative and evolving and we are always as agile and flexible as you need us to be.

iMerit ML DataOps Summit
