Understanding Grounding DINO's Thresholds: A Deeper Dive

Elven Kee

Chartered Engineer with 20 years of engineering experience in Automation Control & AI, specialising in Marine Integrated Control and Safety Systems. A dedicated & passionate technical trainer and mentor

Published: October 1, 2024

Grounding DINO (GD) and YOLOv8 are both powerful object detection models, but they filter their predictions in different ways. GD's thresholds are a common source of confusion. To explain them, let's first look at two key parameters in GD, box_threshold and text_threshold:

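box_threshold: The minimum confidence score a predicted bounding box needs in order to be kept. In GD, a box's score is its highest similarity to any token of the text prompt.
text_threshold: The minimum similarity between a kept box and a prompt token for that token to become part of the box's label. It decides which words of the prompt name the box; it does not detect text regions in the image.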
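To make the two thresholds concrete, here is a minimal pure-Python sketch of the filtering step as I understand it from GD's reference inference code. It is not the library's implementation: the function name filter_gd_predictions, the toy scores, and the three-word prompt are all made up for illustration, and the values 0.35 and 0.25 are only commonly seen demo settings.

```python
def filter_gd_predictions(logits, tokens, box_threshold=0.35, text_threshold=0.25):
    """Keep boxes and build their labels from box-to-token similarity scores.

    logits: one row per predicted box, one similarity score (0..1) per prompt token.
    tokens: the prompt tokens, in the same order as the columns of logits.
    Returns a list of (box_index, box_score, label) tuples.
    """
    results = []
    for idx, row in enumerate(logits):
        box_score = max(row)  # a box is scored by its best-matching token
        if box_score <= box_threshold:
            continue  # box_threshold decides whether the box survives at all
        # text_threshold decides which prompt words end up in the label
        words = [tok for tok, score in zip(tokens, row) if score > text_threshold]
        results.append((idx, box_score, " ".join(words)))
    return results


# Hypothetical prompt tokens and scores for three predicted boxes
tokens = ["black", "dog", "ball"]
logits = [
    [0.30, 0.62, 0.05],  # strong match on "dog", weaker match on "black"
    [0.10, 0.20, 0.15],  # no token is convincing, so the box is dropped
    [0.04, 0.08, 0.41],  # matches "ball"
]

detections = filter_gd_predictions(logits, tokens)
for box_index, box_score, label in detections:
    print(box_index, box_score, repr(label))

# A stricter text_threshold keeps the same boxes but trims their labels
strict = filter_gd_predictions(logits, tokens, text_threshold=0.5)
```

Notice that raising text_threshold above box_threshold can leave a box with an empty label: the box passes the first test, but no single word passes the second. That is one reason text_threshold is usually set at or below box_threshold.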

Now let's relate these to the two similar parameters widely used in YOLOv8.

Confidence: In YOLOv8, the conf parameter filters out predictions whose confidence score falls below it. It is the closest analogue to GD's box_threshold.
IOU: IOU (Intersection over Union) measures the overlap between two bounding boxes. During evaluation it compares a predicted box with a ground truth box. At inference time, YOLOv8's iou parameter is the threshold for non-maximum suppression (NMS), which removes lower-scoring predictions that overlap a higher-scoring one too much.
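The interplay of confidence and IOU in a YOLO-style pipeline can be sketched in a few lines of plain Python. This is a simplified, class-agnostic illustration and not the Ultralytics code: the function names and the example boxes are hypothetical, and the defaults 0.25 and 0.7 are used here only as typical prediction settings.

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def conf_then_nms(boxes, scores, conf=0.25, iou_thres=0.7):
    """Drop low-confidence boxes, then suppress overlapping duplicates.

    Returns the indices of the surviving boxes, highest score first.
    """
    # Step 1: confidence filter
    candidates = [i for i, s in enumerate(scores) if s > conf]
    # Step 2: greedy NMS, visiting boxes from highest to lowest score
    candidates.sort(key=lambda i: scores[i], reverse=True)
    kept = []
    for i in candidates:
        if all(iou(boxes[i], boxes[j]) <= iou_thres for j in kept):
            kept.append(i)
    return kept


# Hypothetical predictions: two near-identical boxes, one separate box, one weak box
boxes = [(10, 10, 110, 110), (12, 12, 112, 112), (200, 200, 300, 300), (0, 0, 50, 50)]
scores = [0.90, 0.80, 0.60, 0.10]

kept = conf_then_nms(boxes, scores)
print(kept)
```

Here the weak box is removed by the confidence filter, and the second box is removed by NMS because it overlaps the first box almost completely. Raising iou_thres makes NMS more tolerant of overlap, so more near-duplicate boxes survive.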

Key differences:

GD's two thresholds both act on box-to-text similarity scores: box_threshold decides which boxes are kept, and text_threshold decides which prompt words label them.
YOLOv8's confidence is used in conjunction with the IOU threshold: confidence drops weak predictions, and IOU-based NMS then removes duplicate boxes for the same object.
While both models use confidence-based filtering, the specific implementation and the role of other metrics (like IOU) can differ.

Here is the summary for better understanding.

In essence, GD's box_threshold is a direct confidence cut-off for predictions, much like the confidence parameter in YOLOv8. The difference lies in the second parameter: GD's text_threshold controls which prompt words are attached to each box, while YOLOv8's IOU threshold controls how overlapping boxes are suppressed by NMS.
