How to do Object Tracking of ROI in OpenCV

Object tracking is fundamental in computer vision, with applications in fields such as security, surveillance, robotics, and self-driving cars. The goal is to locate a particular object in a video sequence and follow it over time, which is challenging because of the object's motion, occlusions, and changes in its appearance.

Object Tracking Algorithms

1. BOOSTING (Boosting Tracker):

- Principle: It's based on the AdaBoost algorithm, which selects a set of weak classifiers and combines them to form a strong classifier. Each weak classifier focuses on a particular feature.

- Accuracy: It's not very robust to occlusion or fast motion and might lose track in such scenarios.

- Performance: It's computationally less expensive compared to some other algorithms, but it may not perform as well in complex scenes.
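The weighted-vote idea behind AdaBoost can be sketched in a few lines of plain Python. This is a toy illustration only, not OpenCV's implementation; the stump values and feature vectors below are made up:

```python
# Toy sketch of the AdaBoost idea behind the Boosting tracker: several weak
# "decision stump" classifiers, each comparing one feature to a threshold,
# vote with learned weights; the sign of the weighted sum is the decision.
def strong_classify(features, stumps):
    # stumps: list of (feature_index, threshold, weight) tuples
    score = sum(w if features[i] > t else -w for i, t, w in stumps)
    return 1 if score > 0 else -1

# Hypothetical stumps and feature vector, for illustration only.
stumps = [(0, 0.5, 0.8), (1, 0.2, 0.5), (0, 0.9, 0.3)]
print(strong_classify([0.95, 0.3], stumps))  # all three stumps vote +1: 1
```

In the real tracker, each weak classifier is trained online on one image feature; the ensemble weighting is what makes the combined ("strong") classifier useful despite each stump being barely better than chance.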

2. MIL (Multiple Instance Learning):

- Principle: MIL frames tracking as learning from bags of instances: each positive bag contains several image patches around the estimated object location, at least one of which is assumed to show the object. Learning from bags rather than single patches makes the tracker tolerant of slightly inaccurate training labels.

- Accuracy: MIL is more robust than Boosting but may still struggle with occlusion and rapid motion.

- Performance: It's more computationally intensive compared to Boosting but generally more accurate.
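The bag-labelling rule at the heart of MIL can be shown with a trivial sketch (the scores below are made-up numbers, not real classifier outputs):

```python
# Toy sketch of the Multiple Instance Learning labelling rule: a bag holds
# several candidate patches (instances); the bag is positive if at least
# one instance score exceeds the threshold.
def bag_label(instance_scores, threshold=0.5):
    return 1 if max(instance_scores) > threshold else -1

print(bag_label([0.1, 0.9, 0.3]))  # one confident instance -> positive bag: 1
print(bag_label([0.1, 0.2, 0.3]))  # no confident instance -> negative bag: -1
```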

3. KCF (Kernelized Correlation Filters):

- Principle: KCF employs a fast method to find correlation between the object template and the candidate patches in the frame. It operates in the frequency domain, making it faster.

- Accuracy: KCF is accurate for tracking translation, but the basic formulation does not handle scale changes or full occlusion well.

- Performance: It's faster compared to Boosting and MIL, making it suitable for real-time applications.
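The frequency-domain trick that makes correlation-filter trackers fast can be illustrated with NumPy. This is a toy sketch of plain cross-correlation via the FFT, not the full kernelized formulation KCF uses:

```python
import numpy as np

# Correlating a template with a patch via the FFT costs O(n log n)
# instead of O(n^2) for an explicit sliding-window search.
def fft_correlate(patch, template):
    F = np.fft.fft2(patch)
    H = np.fft.fft2(template, s=patch.shape)  # zero-pad to patch size
    # Multiplying by the conjugate spectrum = circular cross-correlation.
    return np.real(np.fft.ifft2(F * np.conj(H)))

rng = np.random.default_rng(0)
patch = rng.standard_normal((64, 64))
template = patch[20:30, 40:50].copy()   # the "object" lives at row 20, col 40
resp = fft_correlate(patch, template)
peak = np.unravel_index(np.argmax(resp), resp.shape)
print(peak)  # the peak of the response map marks the object: (20, 40)
```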

4. TLD (Tracking, Learning and Detection):

- Principle: TLD combines object tracking with a detector, which helps in recovering from tracking failures. It employs a sliding window detector and online learning.

- Accuracy: TLD can handle occlusion and object appearance changes better than some other trackers.

- Performance: It's relatively slower compared to some other algorithms due to the involvement of a detector.

5. MEDIANFLOW:

- Principle: It tracks a set of points from frame to frame and estimates the object's motion from the median of their displacements, which suppresses outlier points. A forward-backward consistency check lets it detect its own tracking failures.

- Accuracy: It's accurate for slow, predictable motion and is notable for reliably reporting its own failures, but it copes poorly with fast motion and large appearance changes.

- Performance: It's reasonably fast and suitable for real-time applications.
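The median-of-displacements idea can be sketched in pure Python (the flow vectors below are invented to show how the median rejects one bad point):

```python
import statistics

# Toy sketch of the MedianFlow idea: track many points between frames,
# then take the median displacement as the object's motion, which
# suppresses points that were tracked incorrectly (outliers).
def median_displacement(flows):
    dxs = [dx for dx, dy in flows]
    dys = [dy for dx, dy in flows]
    return statistics.median(dxs), statistics.median(dys)

print(median_displacement([(2, 1), (2, 1), (50, -40), (2, 1), (3, 1)]))
# the (50, -40) outlier is ignored: (2, 1)
```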

6. MOSSE (Minimum Output Sum of Squared Error):

- Principle: MOSSE is a simple and efficient tracker based on correlation filters. It uses the circulant structure of the training samples to speed up the computation.

- Accuracy: MOSSE trades accuracy for speed: it works well when appearance changes are mild, but it is less accurate than KCF or CSRT and does not handle scale changes.

- Performance: It's one of the fastest trackers and can perform well in real-time scenarios.
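The closed-form filter at the core of MOSSE can be sketched with NumPy. This is a simplified single-frame version (assumptions: no preprocessing window, no running average over frames, which the real tracker uses):

```python
import numpy as np

# Toy sketch of the MOSSE filter: solve, in the frequency domain, for the
# filter whose response to the training patch is a Gaussian centred on
# the target. eps regularizes the division.
def mosse_filter(patch, sigma=2.0, eps=1e-5):
    h, w = patch.shape
    ys, xs = np.mgrid[0:h, 0:w]
    g = np.exp(-((xs - w // 2) ** 2 + (ys - h // 2) ** 2) / (2 * sigma ** 2))
    G = np.fft.fft2(g)          # desired response spectrum
    F = np.fft.fft2(patch)      # training patch spectrum
    # Closed-form least-squares solution (conjugate of the filter).
    return (G * np.conj(F)) / (F * np.conj(F) + eps)

rng = np.random.default_rng(1)
patch = rng.standard_normal((32, 32))
H = mosse_filter(patch)
# Applying the filter to its own training patch reproduces a response
# peaked at the patch centre.
resp = np.real(np.fft.ifft2(np.fft.fft2(patch) * H))
peak = np.unravel_index(np.argmax(resp), resp.shape)
print(peak)  # (16, 16)
```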

7. CSRT (Channel and Spatial Reliability Tracker):

- Principle: CSRT combines the strengths of correlation filters and spatial reliability to improve tracking accuracy. It utilizes both appearance and motion information for tracking.

- Accuracy: CSRT is highly accurate and robust to occlusion, scale, and rotation changes.

- Performance: It's computationally more intensive compared to some other trackers but provides superior accuracy, making it suitable for applications where accuracy is crucial.

Comparison:

- In terms of accuracy, CSRT generally performs best, with KCF a strong second; MEDIANFLOW and TLD follow, each with the caveats noted above.

- For real-time applications where computational resources are limited, MOSSE and KCF are good choices due to their speed.

- Boosting and MIL are generally less accurate compared to other algorithms and may struggle with challenging scenarios.

Choosing the right algorithm depends on the specific requirements of the application, including accuracy, computational resources, and the nature of the objects being tracked.
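The trade-offs above can be condensed into a starting-point heuristic. This helper is illustrative only (not part of OpenCV), and the rules are a rough summary of the comparison, not a benchmark:

```python
# Suggest a tracker name based on the two main constraints discussed above.
def suggest_tracker(realtime_required, occlusion_expected):
    if realtime_required and not occlusion_expected:
        return 'MOSSE'  # fastest; fine when the scene is easy
    if realtime_required:
        return 'KCF'    # good speed/accuracy balance
    return 'CSRT'       # most robust to occlusion, scale, and rotation

print(suggest_tracker(realtime_required=True, occlusion_expected=False))  # MOSSE
```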

Code:

import cv2
import sys
from random import randint        

  • This part imports the necessary modules: cv2 is the OpenCV library used for computer vision tasks, sys provides system-related functions such as sys.exit(), and randint from the random module generates random integers (used later to pick a display color).

tracker_types = ['BOOSTING', 'MIL', 'KCF', 'TLD', 'MEDIANFLOW', 'MOSSE', 'CSRT']
tracker_type = tracker_types[6]  # change the index to select a different algorithm
print(tracker_type)

if tracker_type == 'BOOSTING':
    tracker = cv2.legacy.TrackerBoosting_create()
elif tracker_type == 'MIL':
    tracker = cv2.legacy.TrackerMIL_create()
elif tracker_type == 'KCF':
    tracker = cv2.legacy.TrackerKCF_create()
elif tracker_type == 'TLD':
    tracker = cv2.legacy.TrackerTLD_create()
elif tracker_type == 'MEDIANFLOW':
    tracker = cv2.legacy.TrackerMedianFlow_create()
elif tracker_type == 'MOSSE':
    tracker = cv2.legacy.TrackerMOSSE_create()
elif tracker_type == 'CSRT':
    tracker = cv2.legacy.TrackerCSRT_create()
    
print(tracker)        

  • Here, a list tracker_types is defined containing the tracker types available in OpenCV. tracker_type is set to the 7th item (index 6) in the list, which corresponds to the 'CSRT' tracker, and the selected type is printed.
  • This block then creates the selected tracker: depending on tracker_type, the corresponding factory function is called from the cv2.legacy module. Note that these legacy trackers generally require the opencv-contrib-python package; a plain opencv-python install may not provide all of them.
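The if/elif chain above can also be written as a dictionary lookup, which is easier to extend. A sketch, assuming the opencv-contrib-python package (which provides cv2.legacy) is installed when a tracker is actually created:

```python
# Map tracker names to the names of their factory functions in cv2.legacy.
TRACKER_FACTORIES = {
    'BOOSTING':   'TrackerBoosting_create',
    'MIL':        'TrackerMIL_create',
    'KCF':        'TrackerKCF_create',
    'TLD':        'TrackerTLD_create',
    'MEDIANFLOW': 'TrackerMedianFlow_create',
    'MOSSE':      'TrackerMOSSE_create',
    'CSRT':       'TrackerCSRT_create',
}

def create_tracker(name):
    # Imported here so the mapping itself can be inspected without OpenCV.
    import cv2  # requires opencv-contrib-python for cv2.legacy
    return getattr(cv2.legacy, TRACKER_FACTORIES[name])()
```

With this in place, `tracker = create_tracker(tracker_type)` replaces the whole chain, and an unknown name raises a KeyError instead of silently leaving tracker undefined.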

video = cv2.VideoCapture('race.mp4')
if not video.isOpened():
    print('Error while loading the video!')
    sys.exit()

ok, frame = video.read()
if not ok:
    print('Error while reading the frame!')
    sys.exit()
print(ok)        

  • Here, a video file named 'race.mp4' is opened for reading using cv2.VideoCapture(). If the video file fails to open, an error message is printed and the program exits. Then, the first frame of the video is read and checked for success. If reading the frame fails, an error message is printed and the program exits.

bbox = cv2.selectROI(frame) # region of interest
print(bbox)

ok = tracker.init(frame, bbox)
print(ok)

colors = (randint(0, 255), randint(0, 255), randint(0, 255))  # random color; OpenCV uses BGR channel order
print(colors)


while True:
    ok, frame = video.read()
    #print(ok)
    if not ok:
        break

    ok, bbox = tracker.update(frame)
    #print(ok, bbox)
    if ok:
        (x, y, w, h) = [int(v) for v in bbox]
        #print(x, y, w, h)
        cv2.rectangle(frame, (x, y), (x + w, y + h), colors, 2)
    else:
        cv2.putText(frame, 'Tracking failure!', (100,80), cv2.FONT_HERSHEY_SIMPLEX, .75, (0,0,255))

    cv2.putText(frame, tracker_type, (100, 20), cv2.FONT_HERSHEY_SIMPLEX, .75, (0, 0, 255))

    cv2.imshow('Tracking', frame)
    if cv2.waitKey(1) & 0xFF == 27:  # exit when the Esc key is pressed
        break

video.release()
cv2.destroyAllWindows()

  • This part selects a region of interest (ROI) from the first frame of the video using cv2.selectROI(). The coordinates and size of the bounding box around the selected ROI are stored in bbox. Then, the tracker is initialized with the first frame and the bounding box using tracker.init(). The result of initialization is stored in ok. Random color values are generated for visualizing the tracking.
  • This part of the code starts a loop in which each iteration reads the next frame from the video; when no frame can be read, the loop ends. The tracker's update() method is then called with the current frame to estimate the object's new position. If the update succeeds, a rectangle is drawn around the tracked object using the bounding-box coordinates; otherwise a tracking-failure message is displayed. The tracker type is also drawn on the frame. Finally, the annotated frame is shown, and the loop continues until the user presses the Esc key (ASCII value 27), after which the video is released and the windows are closed.
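The bounding-box arithmetic inside the loop, which turns OpenCV's (x, y, width, height) box into the two corner points cv2.rectangle expects, can be isolated as a small helper:

```python
# Convert an (x, y, width, height) bounding box, as returned by
# cv2.selectROI and tracker.update, into the top-left and bottom-right
# corner points that cv2.rectangle expects. Coordinates are truncated to
# integers because drawing functions require pixel positions.
def bbox_to_corners(bbox):
    x, y, w, h = (int(v) for v in bbox)
    return (x, y), (x + w, y + h)

print(bbox_to_corners((10.4, 20.9, 100.0, 50.0)))  # ((10, 20), (110, 70))
```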

Tracking output

Full Code: https://github.com/TejasShastrakar/Computer_Vision.git


