This competition is evaluated on the F2 Score at different intersection over union (IoU) thresholds. The IoU of a proposed set of object pixels A and a set of true object pixels B is calculated as:

$$\mathrm{IoU}(A, B) = \frac{|A \cap B|}{|A \cup B|}$$
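The IoU definition can be sketched in a few lines of Python. This is only an illustration: the function name `iou`, the representation of each object as a set of (row, col) pixel coordinates, and the choice to return 0 for two empty objects are assumptions, not part of the competition description.

```python
def iou(pred_pixels, true_pixels):
    """Intersection over union of two sets of (row, col) pixel coordinates."""
    pred, true = set(pred_pixels), set(true_pixels)
    union = pred | true
    if not union:
        # Assumption: two empty objects score 0 instead of raising an error.
        return 0.0
    return len(pred & true) / len(union)


# Two 2x2 squares that share one column of pixels.
a = {(0, 0), (0, 1), (1, 0), (1, 1)}
b = {(0, 1), (1, 1), (0, 2), (1, 2)}
print(iou(a, b))  # 2 shared pixels divided by 6 pixels in the union
```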
The metric sweeps over a range of IoU thresholds, at each point calculating an F2 Score. The threshold values range from 0.5 to 0.95 with a step size of 0.05: (0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95). In other words, at a threshold of 0.5, a predicted object is considered a "hit" if its intersection over union with a ground truth object is greater than 0.5.
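As a rough sketch of the threshold sweep, the snippet below builds the ten thresholds and checks at which of them a given IoU counts as a hit. The names `THRESHOLDS`, `is_hit`, and `hits_per_threshold` are hypothetical helpers introduced here for illustration.

```python
# Thresholds 0.50, 0.55, ..., 0.95, built from integers to avoid float drift.
THRESHOLDS = [t / 100 for t in range(50, 100, 5)]


def is_hit(iou_value, threshold):
    """A prediction is a hit only if its IoU is strictly greater than the threshold."""
    return iou_value > threshold


def hits_per_threshold(iou_value):
    """Map each threshold to True/False for a single IoU value."""
    return {t: is_hit(iou_value, t) for t in THRESHOLDS}


result = hits_per_threshold(0.72)
# Thresholds at which an IoU of 0.72 still counts as a hit.
print([t for t, hit in result.items() if hit])
```

A prediction with an IoU of 0.72 is therefore a hit at the five thresholds from 0.5 to 0.7 and a miss at the remaining five.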
At each threshold value t, the F2 Score is calculated from the number of true positives (TP), false negatives (FN), and false positives (FP) that result from comparing the predicted objects to all ground truth objects. The following equation is the general Fβ Score, which is equivalent to the F2 Score when β is set to 2:

$$F_\beta(t) = \frac{(1 + \beta^2) \cdot TP(t)}{(1 + \beta^2) \cdot TP(t) + \beta^2 \cdot FN(t) + FP(t)}$$
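To make the formula concrete, here is a minimal sketch that computes the Fβ Score from the three counts. The function name `f_beta` is hypothetical, and returning 0 when all counts are zero is an assumption; the description above does not say how that case is handled.

```python
def f_beta(tp, fp, fn, beta=2.0):
    """F-beta score from true positive, false positive, and false negative counts."""
    b2 = beta ** 2
    denom = (1 + b2) * tp + b2 * fn + fp
    if denom == 0:
        # Assumption: no predictions and no ground truth objects scores 0.
        return 0.0
    return (1 + b2) * tp / denom


print(f_beta(tp=3, fp=1, fn=1))  # 15 / (15 + 4 + 1) = 0.75
print(f_beta(tp=3, fp=0, fn=1))  # one missed object
print(f_beta(tp=3, fp=1, fn=0))  # one spurious object
```

With β = 2, a false negative is weighted four times as heavily as a false positive, so missing an object lowers the score more than predicting a spurious one.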
A true positive is counted when a single predicted object matches a ground truth object with an IoU above the threshold. A false positive indicates a predicted object had no associated ground truth object. A false negative indicates a ground truth object had no associated predicted object. The average F2 Score of a single image is then calculated as the mean of the above F2 Score values at each IoU threshold:

$$\frac{1}{|thresholds|} \sum_{t} F_2(t)$$
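The per-image calculation can be sketched as follows. This is not the official scoring code: the helper names, the set-of-pixels representation, and the greedy one-to-one matching are all assumptions made for illustration. When objects do not overlap each other, an IoU above 0.5 can hold for at most one pairing per object, so the greedy choice does not change the counts in that case.

```python
def iou(a, b):
    """Intersection over union of two pixel sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def counts_at(preds, truths, threshold):
    """Return (tp, fp, fn) for one image at one IoU threshold."""
    matched = set()  # indices of ground truth objects already matched
    tp = 0
    for p in preds:
        for j, g in enumerate(truths):
            if j not in matched and iou(p, g) > threshold:
                matched.add(j)
                tp += 1
                break
    return tp, len(preds) - tp, len(truths) - tp


def f2(tp, fp, fn):
    """F-beta with beta = 2; assumption: returns 0 when all counts are zero."""
    denom = 5 * tp + 4 * fn + fp
    return 5 * tp / denom if denom else 0.0


def image_score(preds, truths):
    """Mean F2 over the thresholds 0.50, 0.55, ..., 0.95."""
    thresholds = [t / 100 for t in range(50, 100, 5)]
    return sum(f2(*counts_at(preds, truths, t)) for t in thresholds) / len(thresholds)


# One ground truth object of 10 pixels, one prediction covering 8 of them (IoU 0.8).
truth = {(0, c) for c in range(10)}
pred = {(0, c) for c in range(8)}
print(image_score([pred], [truth]))
```

In this example the prediction is a hit at the six thresholds from 0.5 to 0.75 (F2 of 1) and a miss from 0.8 upward (F2 of 0), so the image score is 6/10.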
Lastly, the score returned by the competition metric is the mean taken over the individual average F2 Scores of each image in the test dataset.
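The final aggregation is a plain mean, sketched below with a hypothetical `competition_score` helper. The per-image values in the example are made up for illustration only.

```python
def competition_score(per_image_scores):
    """Mean of the per-image average F2 scores across the test set."""
    if not per_image_scores:
        raise ValueError("at least one image score is required")
    return sum(per_image_scores) / len(per_image_scores)


# Illustrative per-image scores, not real competition results.
print(competition_score([0.6, 1.0, 0.2]))
```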
Reference: https://www.kaggle.com/c/airbus-ship-detection#evaluation



