How to do Object Tracking of ROI in OpenCV

Object tracking is fundamental in computer vision, with applications in fields such as security, surveillance, robotics, and self-driving cars. The goal is to locate a particular object in a video sequence and follow it over time, which is challenging because of the object's motion, occlusions, and changes in its appearance.

Object Tracking Algorithms

1. BOOSTING (Boosting Tracker):

- Principle: It's based on the AdaBoost algorithm, which selects a set of weak classifiers and combines them to form a strong classifier. Each weak classifier focuses on a particular feature.

- Accuracy: It's not very robust to occlusion or fast motion and might lose track in such scenarios.

- Performance: It's computationally less expensive compared to some other algorithms, but it may not perform as well in complex scenes.
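The weighted-vote idea behind AdaBoost can be sketched in a few lines of plain Python. This is a toy illustration only, not OpenCV's implementation; the stump values and feature vectors below are made up:

```python
# Toy sketch of the AdaBoost idea behind the Boosting tracker: several weak
# "decision stump" classifiers, each comparing one feature to a threshold,
# vote with learned weights; the sign of the weighted sum is the decision.
def strong_classify(features, stumps):
    # stumps: list of (feature_index, threshold, weight) tuples
    score = sum(w if features[i] > t else -w for i, t, w in stumps)
    return 1 if score > 0 else -1

# Hypothetical stumps and feature vector, for illustration only.
stumps = [(0, 0.5, 0.8), (1, 0.2, 0.5), (0, 0.9, 0.3)]
print(strong_classify([0.95, 0.3], stumps))  # all three stumps vote +1: 1
```

In the real tracker, each weak classifier is trained online on one image feature; the ensemble weighting is what makes the combined ("strong") classifier useful despite each stump being barely better than chance.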

2. MIL (Multiple Instance Learning):

- Principle: MIL frames tracking as learning from bags of instances: each positive bag contains several image patches around the estimated object location, at least one of which is assumed to show the object. Learning from bags rather than single patches makes the tracker tolerant of slightly inaccurate training labels.

- Accuracy: MIL is more robust than Boosting but may still struggle with occlusion and rapid motion.

- Performance: It's more computationally intensive compared to Boosting but generally more accurate.
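The bag-labelling rule at the heart of MIL can be shown with a trivial sketch (the scores below are made-up numbers, not real classifier outputs):

```python
# Toy sketch of the Multiple Instance Learning labelling rule: a bag holds
# several candidate patches (instances); the bag is positive if at least
# one instance score exceeds the threshold.
def bag_label(instance_scores, threshold=0.5):
    return 1 if max(instance_scores) > threshold else -1

print(bag_label([0.1, 0.9, 0.3]))  # one confident instance -> positive bag: 1
print(bag_label([0.1, 0.2, 0.3]))  # no confident instance -> negative bag: -1
```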

3. KCF (Kernelized Correlation Filters):

- Principle: KCF employs a fast method to find correlation between the object template and the candidate patches in the frame. It operates in the frequency domain, making it faster.

- Accuracy: KCF is accurate for tracking translation, but the basic formulation does not handle scale changes or full occlusion well.

- Performance: It's faster compared to Boosting and MIL, making it suitable for real-time applications.
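The frequency-domain trick that makes correlation-filter trackers fast can be illustrated with NumPy. This is a toy sketch of plain cross-correlation via the FFT, not the full kernelized formulation KCF uses:

```python
import numpy as np

# Correlating a template with a patch via the FFT costs O(n log n)
# instead of O(n^2) for an explicit sliding-window search.
def fft_correlate(patch, template):
    F = np.fft.fft2(patch)
    H = np.fft.fft2(template, s=patch.shape)  # zero-pad to patch size
    # Multiplying by the conjugate spectrum = circular cross-correlation.
    return np.real(np.fft.ifft2(F * np.conj(H)))

rng = np.random.default_rng(0)
patch = rng.standard_normal((64, 64))
template = patch[20:30, 40:50].copy()   # the "object" lives at row 20, col 40
resp = fft_correlate(patch, template)
peak = np.unravel_index(np.argmax(resp), resp.shape)
print(peak)  # the peak of the response map marks the object: (20, 40)
```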

4. TLD (Tracking, Learning and Detection):

- Principle: TLD combines object tracking with a detector, which helps in recovering from tracking failures. It employs a sliding window detector and online learning.

- Accuracy: TLD can handle occlusion and object appearance changes better than some other trackers.

- Performance: It's relatively slower compared to some other algorithms due to the involvement of a detector.

5. MEDIANFLOW:

- Principle: It tracks a set of points from frame to frame and estimates the object's motion from the median of their displacements, which suppresses outlier points. A forward-backward consistency check lets it detect its own tracking failures.

- Accuracy: It's accurate for slow, predictable motion and is notable for reliably reporting its own failures, but it copes poorly with fast motion and large appearance changes.

- Performance: It's reasonably fast and suitable for real-time applications.
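The median-of-displacements idea can be sketched in pure Python (the flow vectors below are invented to show how the median rejects one bad point):

```python
import statistics

# Toy sketch of the MedianFlow idea: track many points between frames,
# then take the median displacement as the object's motion, which
# suppresses points that were tracked incorrectly (outliers).
def median_displacement(flows):
    dxs = [dx for dx, dy in flows]
    dys = [dy for dx, dy in flows]
    return statistics.median(dxs), statistics.median(dys)

print(median_displacement([(2, 1), (2, 1), (50, -40), (2, 1), (3, 1)]))
# the (50, -40) outlier is ignored: (2, 1)
```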

6. MOSSE (Minimum Output Sum of Squared Error):

- Principle: MOSSE is a simple and efficient tracker based on correlation filters. It uses the circulant structure of the training samples to speed up the computation.

- Accuracy: MOSSE trades accuracy for speed: it works well when appearance changes are mild, but it is less accurate than KCF or CSRT and does not handle scale changes.

- Performance: It's one of the fastest trackers and can perform well in real-time scenarios.
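The closed-form filter at the core of MOSSE can be sketched with NumPy. This is a simplified single-frame version (assumptions: no preprocessing window, no running average over frames, which the real tracker uses):

```python
import numpy as np

# Toy sketch of the MOSSE filter: solve, in the frequency domain, for the
# filter whose response to the training patch is a Gaussian centred on
# the target. eps regularizes the division.
def mosse_filter(patch, sigma=2.0, eps=1e-5):
    h, w = patch.shape
    ys, xs = np.mgrid[0:h, 0:w]
    g = np.exp(-((xs - w // 2) ** 2 + (ys - h // 2) ** 2) / (2 * sigma ** 2))
    G = np.fft.fft2(g)          # desired response spectrum
    F = np.fft.fft2(patch)      # training patch spectrum
    # Closed-form least-squares solution (conjugate of the filter).
    return (G * np.conj(F)) / (F * np.conj(F) + eps)

rng = np.random.default_rng(1)
patch = rng.standard_normal((32, 32))
H = mosse_filter(patch)
# Applying the filter to its own training patch reproduces a response
# peaked at the patch centre.
resp = np.real(np.fft.ifft2(np.fft.fft2(patch) * H))
peak = np.unravel_index(np.argmax(resp), resp.shape)
print(peak)  # (16, 16)
```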

7. CSRT (Channel and Spatial Reliability Tracker):

- Principle: CSRT combines the strengths of correlation filters and spatial reliability to improve tracking accuracy. It utilizes both appearance and motion information for tracking.

- Accuracy: CSRT is highly accurate and robust to occlusion, scale, and rotation changes.

- Performance: It's computationally more intensive compared to some other trackers but provides superior accuracy, making it suitable for applications where accuracy is crucial.

Comparison:

- In terms of accuracy, CSRT generally performs best, with KCF a strong second; MEDIANFLOW and TLD follow, each with the caveats noted above.

- For real-time applications where computational resources are limited, MOSSE and KCF are good choices due to their speed.

- Boosting and MIL are generally less accurate compared to other algorithms and may struggle with challenging scenarios.

Choosing the right algorithm depends on the specific requirements of the application, including accuracy, computational resources, and the nature of the objects being tracked.
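The trade-offs above can be condensed into a starting-point heuristic. This helper is illustrative only (not part of OpenCV), and the rules are a rough summary of the comparison, not a benchmark:

```python
# Suggest a tracker name based on the two main constraints discussed above.
def suggest_tracker(realtime_required, occlusion_expected):
    if realtime_required and not occlusion_expected:
        return 'MOSSE'  # fastest; fine when the scene is easy
    if realtime_required:
        return 'KCF'    # good speed/accuracy balance
    return 'CSRT'       # most robust to occlusion, scale, and rotation

print(suggest_tracker(realtime_required=True, occlusion_expected=False))  # MOSSE
```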

Code:

import cv2
import sys
from random import randint        

  • This part imports the necessary modules: cv2 is the OpenCV library used for computer vision tasks, sys provides system-related functions such as sys.exit(), and randint from the random module generates random integers (used later to pick a display color).

tracker_types = ['BOOSTING', 'MIL', 'KCF', 'TLD', 'MEDIANFLOW', 'MOSSE', 'CSRT']
tracker_type = tracker_types[6]  # change the index to select a different algorithm
print(tracker_type)

if tracker_type == 'BOOSTING':
    tracker = cv2.legacy.TrackerBoosting_create()
elif tracker_type == 'MIL':
    tracker = cv2.legacy.TrackerMIL_create()
elif tracker_type == 'KCF':
    tracker = cv2.legacy.TrackerKCF_create()
elif tracker_type == 'TLD':
    tracker = cv2.legacy.TrackerTLD_create()
elif tracker_type == 'MEDIANFLOW':
    tracker = cv2.legacy.TrackerMedianFlow_create()
elif tracker_type == 'MOSSE':
    tracker = cv2.legacy.TrackerMOSSE_create()
elif tracker_type == 'CSRT':
    tracker = cv2.legacy.TrackerCSRT_create()
    
print(tracker)        

  • Here, a list tracker_types is defined containing the tracker types available in OpenCV. tracker_type is set to the 7th item (index 6) in the list, which corresponds to the 'CSRT' tracker, and the selected type is printed.
  • This block then creates the selected tracker: depending on tracker_type, the corresponding factory function is called from the cv2.legacy module. Note that these legacy trackers generally require the opencv-contrib-python package; a plain opencv-python install may not provide all of them.
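The if/elif chain above can also be written as a dictionary lookup, which is easier to extend. A sketch, assuming the opencv-contrib-python package (which provides cv2.legacy) is installed when a tracker is actually created:

```python
# Map tracker names to the names of their factory functions in cv2.legacy.
TRACKER_FACTORIES = {
    'BOOSTING':   'TrackerBoosting_create',
    'MIL':        'TrackerMIL_create',
    'KCF':        'TrackerKCF_create',
    'TLD':        'TrackerTLD_create',
    'MEDIANFLOW': 'TrackerMedianFlow_create',
    'MOSSE':      'TrackerMOSSE_create',
    'CSRT':       'TrackerCSRT_create',
}

def create_tracker(name):
    # Imported here so the mapping itself can be inspected without OpenCV.
    import cv2  # requires opencv-contrib-python for cv2.legacy
    return getattr(cv2.legacy, TRACKER_FACTORIES[name])()
```

With this in place, `tracker = create_tracker(tracker_type)` replaces the whole chain, and an unknown name raises a KeyError instead of silently leaving tracker undefined.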

video = cv2.VideoCapture('race.mp4')
if not video.isOpened():
    print('Error while loading the video!')
    sys.exit()

ok, frame = video.read()
if not ok:
    print('Error while reading the frame!')
    sys.exit()
print(ok)        

  • Here, a video file named 'race.mp4' is opened for reading using cv2.VideoCapture(). If the video file fails to open, an error message is printed and the program exits. Then, the first frame of the video is read and checked for success. If reading the frame fails, an error message is printed and the program exits.

bbox = cv2.selectROI(frame) # region of interest
print(bbox)

ok = tracker.init(frame, bbox)
print(ok)

colors = (randint(0, 255), randint(0, 255), randint(0, 255))  # random color; OpenCV uses BGR channel order
print(colors)


while True:
    ok, frame = video.read()
    #print(ok)
    if not ok:
        break

    ok, bbox = tracker.update(frame)
    #print(ok, bbox)
    if ok:
        (x, y, w, h) = [int(v) for v in bbox]
        #print(x, y, w, h)
        cv2.rectangle(frame, (x, y), (x + w, y + h), colors, 2)
    else:
        cv2.putText(frame, 'Tracking failure!', (100,80), cv2.FONT_HERSHEY_SIMPLEX, .75, (0,0,255))

    cv2.putText(frame, tracker_type, (100, 20), cv2.FONT_HERSHEY_SIMPLEX, .75, (0, 0, 255))

    cv2.imshow('Tracking', frame)
    if cv2.waitKey(1) & 0xFF == 27:  # exit when the Esc key is pressed
        break

video.release()
cv2.destroyAllWindows()

  • This part selects a region of interest (ROI) from the first frame of the video using cv2.selectROI(). The coordinates and size of the bounding box around the selected ROI are stored in bbox. Then, the tracker is initialized with the first frame and the bounding box using tracker.init(). The result of initialization is stored in ok. Random color values are generated for visualizing the tracking.
  • This part of the code starts a loop in which each iteration reads the next frame from the video; when no frame can be read, the loop ends. The tracker's update() method is then called with the current frame to estimate the object's new position. If the update succeeds, a rectangle is drawn around the tracked object using the bounding-box coordinates; otherwise a tracking-failure message is displayed. The tracker type is also drawn on the frame. Finally, the annotated frame is shown, and the loop continues until the user presses the Esc key (ASCII value 27), after which the video is released and the windows are closed.
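The bounding-box arithmetic inside the loop, which turns OpenCV's (x, y, width, height) box into the two corner points cv2.rectangle expects, can be isolated as a small helper:

```python
# Convert an (x, y, width, height) bounding box, as returned by
# cv2.selectROI and tracker.update, into the top-left and bottom-right
# corner points that cv2.rectangle expects. Coordinates are truncated to
# integers because drawing functions require pixel positions.
def bbox_to_corners(bbox):
    x, y, w, h = (int(v) for v in bbox)
    return (x, y), (x + w, y + h)

print(bbox_to_corners((10.4, 20.9, 100.0, 50.0)))  # ((10, 20), (110, 70))
```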

Tracking output

Full Code: https://github.com/TejasShastrakar/Computer_Vision.git


