Finally figured out P (Precision), R (Recall), AP (Average Precision), and mAP (mean Average Precision) in multi-label classification!

Latest recommended article published on 2025-04-21 15:33:44

starleeisamyth

Article tags: experience sharing

Article link: https://blog.youkuaiyun.com/qq_41731507/article/details/131426637

These are my notes on how I understand P (Precision), R (Recall), AP (Average Precision), and mAP (mean Average Precision) in multi-label classification tasks.

1. Confusion matrix

Start with the confusion matrix. In a multi-label task, each label has its own:

	Ground truth = 1	Ground truth = 0
Prediction = 1	True Positive (TP)	False Positive (FP)
Prediction = 0	False Negative (FN)	True Negative (TN)

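Precision and Recall share the same numerator: the number of correctly predicted positive samples. (Splitting predictions into positive and negative requires a threshold, usually set to 0.5. As an aside, the AP@[0.5, 0.05, 0.95] often seen in object detection means that the IoU (intersection over union) threshold is increased from 0.5 to 0.95 in steps of 0.05.)
P (Precision) is the share of TP among all samples that look like positives, so the denominator is the number of samples predicted as a given class: $Precision=\frac{\text{True Positives}}{\text{All Predicted Positives}}=\frac{TP}{TP+FP}$,
while R (Recall) is the share of TP among all actual positives, so the denominator is the number of Ground Truth samples of that class: $Recall=\frac{\text{True Positives}}{\text{All Ground Truths}}=\frac{TP}{TP+FN}$.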
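The two formulas can be checked with a few lines of NumPy. This is only a minimal sketch for a single label: the function name `precision_recall`, the `>=` comparison against the threshold, the zero-division fallback to 0.0, and the sample data are all assumptions made for illustration.

```python
import numpy as np

def precision_recall(y_true, y_score, threshold=0.5):
    """Precision and recall for one label at a fixed score threshold."""
    y_true = np.asarray(y_true).astype(bool)
    # Assumption: a score equal to the threshold counts as a positive prediction.
    y_pred = np.asarray(y_score, dtype=float) >= threshold

    tp = int(np.sum(y_pred & y_true))    # predicted 1, ground truth 1
    fp = int(np.sum(y_pred & ~y_true))   # predicted 1, ground truth 0
    fn = int(np.sum(~y_pred & y_true))   # predicted 0, ground truth 1

    # Assumption: return 0.0 when a denominator is zero.
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return precision, recall

# Hypothetical ground truth and scores for one label over six images.
y_true = [1, 0, 1, 1, 0, 0]
y_score = [0.9, 0.8, 0.6, 0.3, 0.2, 0.1]
p, r = precision_recall(y_true, y_score)
print(p, r)
```

With these made-up values, three images are predicted positive (two of them correctly) and one true positive is missed, so both Precision and Recall come out to 2/3.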

2. PR curve (Precision Recall Curve)

As the name suggests, this is the curve formed by Precision (vertical axis) against Recall (horizontal axis).
(Figure: a PR curve. Image source: the article "AP，mAP计算详解（代码全解）".)
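Concretely, take a multi-label pipe defect classification task and consider six training-set images for the "Crack" class. First sort the predicted scores in descending order. NumPy's numpy.argsort(a, axis=-1, kind=None, order=None) does the job, but note that it returns indices in ascending order, so either negate the scores or reverse the result. Then walk down the ranking from the highest score to the lowest and accumulate the TP count, since TP is the numerator of both Precision and Recall. Recall has a fixed denominator (the number of Ground Truth positives), so it never decreases and rises each time a new TP appears. Precision has both its numerator and its denominator changing, so it fluctuates up and down with an overall downward trend. See the table below.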
(Table: cumulative TP count, Precision, and Recall at each rank for the six "Crack" images.)
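The ranking procedure can be sketched as follows. The six scores and labels are invented for illustration and are not the values from the original table; `pr_points` is a hypothetical helper name, and the sketch assumes there is at least one positive sample.

```python
import numpy as np

def pr_points(y_true, y_score):
    """Precision and recall at every rank, walking down the score ranking."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score, dtype=float)

    # argsort is ascending, so sort the negated scores to get descending order.
    order = np.argsort(-y_score)
    hits = y_true[order]                  # 1 where the ranked sample is a true positive

    cum_tp = np.cumsum(hits)              # cumulative TP count
    ranks = np.arange(1, len(hits) + 1)   # number of samples predicted positive so far

    precision = cum_tp / ranks            # TP / (TP + FP)
    recall = cum_tp / hits.sum()          # TP / all ground truths (assumed > 0)
    return precision, recall

# Hypothetical "Crack" labels and scores for six images.
y_true = [1, 1, 0, 1, 0, 0]
y_score = [0.3, 0.9, 0.1, 0.6, 0.8, 0.2]
precision, recall = pr_points(y_true, y_score)
print(np.round(precision, 3))
print(np.round(recall, 3))
```

On this made-up data Precision goes 1, 0.5, 0.667, 0.75, 0.6, 0.5 while Recall goes 0.333, 0.333, 0.667, 1, 1, 1, which shows the zigzag in Precision and the non-decreasing Recall described above.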

3. AP (Average Precision)

Following the Wikipedia definition, recording every (Precision, Recall) pair lets us plot P as a function of R, $p(r)$, which is the PR curve. Average Precision is the mean value of $p(r)$ over the domain $r\in[0,1]$:
$\begin{aligned} AveP= \int_0^1 p(r) \mathrm{d} r \end{aligned}$
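On the plot this is the area enclosed by the curve and the coordinate axes. The domain runs from 0 to 1 because Recall only takes values from 0 to 1. Since the curve is built from a handful of scattered points and looks like a sawtooth, computing an exact integral overcomplicates the problem. The simpler route, the same one used before calculus matured, is approximation.
The two mainstream methods are described below: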

For example, the PASCAL Visual Object Classes challenge (a benchmark for computer vision object detection) until 2010 computed the average precision by averaging the precision over a set of evenly spaced recall levels {0, 0.1, 0.2, … 1.0}:

The first method cuts the range from 0 to 1 into rectangles at equal intervals of 0.1:
$AveP=\frac{1}{11}\sum_{r\in\{0,0.1,\ldots,1.0\}} P_{interp}(r)$

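Here $P_{interp}(r)$ is the maximum Precision over recall values in $[r, 1]$, that is, the largest Precision found from the current recall level onward, which is used as the height of the rectangle. AP is then the mean of these 11 values.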
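A minimal sketch of this 11-point scheme is given below. `ap_11_point` is a hypothetical name, the input arrays are made-up precision and recall values at each rank, and using 0 as the interpolated precision when no point reaches a recall level is an assumption.

```python
import numpy as np

def ap_11_point(precision, recall):
    """11-point interpolated AP: mean of max precision at recall >= t."""
    precision = np.asarray(precision, dtype=float)
    recall = np.asarray(recall, dtype=float)

    total = 0.0
    for t in np.linspace(0.0, 1.0, 11):   # recall levels 0, 0.1, ..., 1.0
        mask = recall >= t
        # Assumption: interpolated precision is 0 if no point reaches recall t.
        total += precision[mask].max() if mask.any() else 0.0
    return total / 11

# Hypothetical per-rank precision and recall for six ranked samples.
precision = [1.0, 0.5, 2 / 3, 0.75, 0.6, 0.5]
recall = [1 / 3, 1 / 3, 2 / 3, 1.0, 1.0, 1.0]
ap11 = ap_11_point(precision, recall)
print(round(ap11, 4))
```

For these values the interpolated precision is 1 at the four recall levels up to 0.3 and 0.75 at the remaining seven, giving (4 + 7 × 0.75) / 11.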

(Figure: the PR curve approximated by the 11 rectangles of the first method.)

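The problem with the first method is that these rectangles do not approximate the curve well, which is why the second method was introduced.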

However, from VOC 2010, the computation of AP changed.
Compute a version of the measured precision-recall curve with precision monotonically decreasing, by setting the precision for recall r to the maximum precision obtained for any recall $\tilde{r} \geq r$. Then compute the AP as the area under this curve by numerical integration.
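One common way to realize this description is sketched below: pad the curve with sentinel points, take a right-to-left running maximum so precision becomes monotonically decreasing, then sum the rectangle areas wherever recall changes. The name `ap_all_points`, the sentinel values, and the sample arrays are assumptions for illustration, not code taken from the VOC devkit.

```python
import numpy as np

def ap_all_points(precision, recall):
    """AP as the area under the monotonically decreasing PR envelope."""
    # Sentinel points so the curve spans recall 0 to 1 (an assumed convention).
    mrec = np.concatenate(([0.0], np.asarray(recall, dtype=float), [1.0]))
    mpre = np.concatenate(([0.0], np.asarray(precision, dtype=float), [0.0]))

    # Precision at recall r becomes the max precision at any recall >= r:
    # a running maximum taken from right to left.
    mpre = np.maximum.accumulate(mpre[::-1])[::-1]

    # Sum rectangle areas only where recall actually changes.
    idx = np.where(mrec[1:] != mrec[:-1])[0]
    return float(np.sum((mrec[idx + 1] - mrec[idx]) * mpre[idx + 1]))

# Hypothetical per-rank precision and recall for six ranked samples.
precision = [1.0, 0.5, 2 / 3, 0.75, 0.6, 0.5]
recall = [1 / 3, 1 / 3, 2 / 3, 1.0, 1.0, 1.0]
ap_area = ap_all_points(precision, recall)
print(round(ap_area, 4))
```

On these made-up values the envelope has height 1 up to recall 1/3 and height 0.75 from there to recall 1, so the area is 1/3 + 0.5.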