Evaluating an object detection model boils down to one thing: determining whether each detection is valid.
Deciding whether a detection is valid requires understanding the Intersection over Union (IoU) metric.
This article covers the following:
- Basics of IoU: what is IoU?
- How to compute IoU (theoretically and in Python code) for a single pair of detection and ground truth bounding boxes.
- Computing IoU for multiple sets of predicted and ground truth bounding boxes.
- How to interpret an IoU value.
IoU is a core metric for the evaluation of object detection models. It measures the accuracy of the object detector by evaluating the degree of overlap between the detection box and the ground truth box.
- A ground truth box, or label, is an annotated box showing where the object is (the annotation is often done by hand, and the ground truth box is considered the object’s actual position).
- The detection box, or predicted bounding box, is the prediction from the object detector.
Formally, IoU is the area of the intersection between the ground truth box (gt) and the predicted box (pd) divided by the area of their union.
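Written as a formula, the definition looks like the sketch below. The function name area is introduced here only for illustration; the second form uses the standard identity that the union area equals the sum of the two box areas minus their overlap, which is usually the easier quantity to compute.

```latex
\mathrm{IoU}(gt, pd)
  = \frac{\operatorname{area}(gt \cap pd)}{\operatorname{area}(gt \cup pd)}
  = \frac{\operatorname{area}(gt \cap pd)}
         {\operatorname{area}(gt) + \operatorname{area}(pd) - \operatorname{area}(gt \cap pd)}
```

The value ranges from 0 (no overlap) to 1 (the two boxes coincide exactly).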
Let’s start with a simple example: computing IoU for one detection and one ground truth box.
To do that, we will need the top-left (x1, y1) and bottom-right (x2, y2) coordinates of the two boxes.
In the Figure below (right), we have two bounding boxes:
- Predicted bounding box (p-box): (px1, py1, px2, py2) = (859, 31, 1002, 176)
- Ground truth bounding box (t-box): (tx1, ty1, tx2, ty2) = (860, 68, 976, 184)
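As a minimal sketch, the IoU of these two boxes could be computed as shown below. The function name iou and the variable names are illustrative assumptions, and the code treats coordinates as continuous values, so a box's width is x2 - x1. Some implementations add 1 to widths and heights to count pixels inclusively, which gives slightly different numbers.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2),
    where (x1, y1) is the top-left and (x2, y2) the bottom-right corner."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Corners of the intersection rectangle.
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)

    # Clamp at zero so non-overlapping boxes get zero intersection area.
    inter_w = max(0, ix2 - ix1)
    inter_h = max(0, iy2 - iy1)
    inter_area = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Union = sum of both areas minus the overlap counted twice.
    union_area = area_a + area_b - inter_area
    return inter_area / union_area if union_area > 0 else 0.0


p_box = (859, 31, 1002, 176)  # predicted box from the example
t_box = (860, 68, 976, 184)   # ground truth box from the example

print(round(iou(p_box, t_box), 4))  # 0.5783
```

Under this convention the intersection is 116 × 108 = 12528, the two box areas are 20735 and 13456, and the union is 21663, which gives an IoU of about 0.578.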