Computing Metrics for Multi-class Classification
In machine learning, the most commonly used classification metrics are Accuracy, Precision, Recall, and F1-Score.
Binary classification metrics
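For a binary classification problem, each sample falls into one of four cases, depending on the combination of its true class and the class predicted by the learner: true positive (TP), false positive (FP), true negative (TN), and false negative (FN). These four counts form the confusion matrix of the classification results:

| | Predicted positive | Predicted negative |
| --- | --- | --- |
| Actual positive | TP | FN |
| Actual negative | FP | TN |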
Here, precision is

$$\text{Precision} = \frac{TP}{TP+FP}$$

and recall is

$$\text{Recall} = \frac{TP}{TP+FN}$$
$$\text{Accuracy} = \frac{TP+TN}{TP+TN+FP+FN}$$

$$\text{F1-Score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision}+\text{Recall}}$$
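As a minimal sketch of the four binary formulas above, the following Python function computes them directly from the confusion-matrix counts. The function name `binary_metrics`, the example counts, and the choice to return 0.0 when a denominator is zero are all assumptions made for illustration, not part of the original text.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall and F1 from the four counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    # Assumption of this sketch: an empty denominator yields 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts, chosen only to exercise the formulas.
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
print(metrics)
```

With these hypothetical counts, precision is 40/50 and recall is 40/45, so F1 lies between the two values.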
Multi-class classification metrics
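A multi-class confusion matrix has one row and one column per class. For the three-class example used in the calculations that follow, with rows as true classes and columns as predicted classes, it is:

| | Predicted class 1 | Predicted class 2 | Predicted class 3 |
| --- | --- | --- | --- |
| True class 1 | 10 | 20 | 30 |
| True class 2 | 10 | 30 | 40 |
| True class 3 | 10 | 40 | 50 |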
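In a multi-class confusion matrix, correctly classified samples lie on the main diagonal, from top-left to bottom-right. Accuracy is defined as the ratio of correctly classified samples (the diagonal) to the total number of samples. Accuracy describes classification performance over all samples, whereas precision and recall must be computed separately for each class.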
For example, for class 1, where $TP_1 + FP_1$ is the number of samples predicted as class 1 and $TP_1 + FN_1$ is the number of samples that truly belong to class 1, precision and recall are:

$$\text{precision}_{1} = \frac{TP_1}{TP_1+FP_1} = \frac{10}{10+10+10} = \frac{1}{3}$$

$$\text{recall}_{1} = \frac{TP_1}{TP_1+FN_1} = \frac{10}{10+20+30} = \frac{1}{6}$$
The other classes are computed in the same way:

$$\text{precision}_{2} = \frac{TP_2}{TP_2+FP_2} = \frac{30}{20+30+40} = \frac{1}{3}$$

$$\text{recall}_{2} = \frac{TP_2}{TP_2+FN_2} = \frac{30}{10+30+40} = \frac{3}{8}$$

$$\text{precision}_{3} = \frac{TP_3}{TP_3+FP_3} = \frac{50}{30+40+50} = \frac{5}{12}$$

$$\text{recall}_{3} = \frac{TP_3}{TP_3+FN_3} = \frac{50}{10+40+50} = \frac{1}{2}$$
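The per-class calculations above can be sketched in a few lines of Python. The matrix `CONFUSION` below is reconstructed from the numbers in the worked example, under the assumption that rows are true classes and columns are predicted classes; the function name is hypothetical. `fractions.Fraction` is used so the results can be compared exactly with the fractions in the text.

```python
from fractions import Fraction

# Reconstructed from the worked example (assumption: rows = true class,
# columns = predicted class).
CONFUSION = [
    [10, 20, 30],
    [10, 30, 40],
    [10, 40, 50],
]


def per_class_precision_recall(matrix):
    """Return one (precision, recall) pair per class as exact fractions."""
    n = len(matrix)
    results = []
    for k in range(n):
        tp = matrix[k][k]
        predicted_k = sum(matrix[i][k] for i in range(n))  # column sum = TP + FP
        actual_k = sum(matrix[k])                          # row sum = TP + FN
        results.append((Fraction(tp, predicted_k), Fraction(tp, actual_k)))
    return results


for k, (p, r) in enumerate(per_class_precision_recall(CONFUSION), start=1):
    print(f"class {k}: precision = {p}, recall = {r}")
```

Note that this sketch raises `ZeroDivisionError` if a class is never predicted or never occurs; a real implementation would need to decide how to handle that case.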
To evaluate the overall performance of the system, there are several approaches.

First, the overall accuracy, where $TP_i + FN_i$ is the number of samples whose true class is $i$:

$$\text{Accuracy} = \frac{TP_{1}+TP_{2}+TP_{3}}{\sum_{i=1}^{3} (TP_{i}+FN_{i})} = \frac{10+30+50}{60+80+100} = \frac{3}{8} = 0.3750$$
1. Macro-average
Macro-averaging computes the metric for each class and then takes the unweighted mean, so every class carries the same weight. Because classes are treated equally regardless of how many samples they contain, the result can be strongly influenced by rare classes.

$$\text{Macro-Precision} = \frac{\text{precision}_{1} + \text{precision}_{2} + \text{precision}_{3}}{3} = \frac{13}{36} = 0.3611$$

$$\text{Macro-Recall} = \frac{\text{recall}_{1} + \text{recall}_{2} + \text{recall}_{3}}{3} = \frac{25}{72} = 0.3472$$
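To make the macro-average arithmetic concrete, here is a small Python sketch that averages the per-class values from the text. The helper name `macro_average` is hypothetical; the input fractions are the per-class precision and recall values computed earlier.

```python
from fractions import Fraction


def macro_average(per_class_values):
    """Unweighted mean of per-class metric values."""
    return sum(per_class_values, Fraction(0)) / len(per_class_values)


# Per-class values taken from the worked example in the text.
precisions = [Fraction(1, 3), Fraction(1, 3), Fraction(5, 12)]
recalls = [Fraction(1, 6), Fraction(3, 8), Fraction(1, 2)]

macro_precision = macro_average(precisions)
macro_recall = macro_average(recalls)

print(macro_precision, round(float(macro_precision), 4))  # 13/36 0.3611
print(macro_recall, round(float(macro_recall), 4))        # 25/72 0.3472
```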
2. Micro-average
Micro-averaging first sums the TP, FP, and FN counts over all classes, then applies the binary formulas to the totals.

$$\text{Micro-Precision} = \frac{TP_{1} + TP_{2} + TP_{3}}{TP_{1} + TP_{2} + TP_{3} + FP_{1} + FP_{2} + FP_{3}} = \frac{10+30+50}{30+90+120} = \frac{3}{8} = 0.3750$$

$$\text{Micro-Recall} = \frac{TP_{1} + TP_{2} + TP_{3}}{TP_{1} + TP_{2} + TP_{3} + FN_{1} + FN_{2} + FN_{3}} = \frac{10+30+50}{60+80+100} = \frac{3}{8} = 0.3750$$
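Notice that the overall Accuracy, Micro-Precision, and Micro-Recall are identical: each is the ratio of the diagonal count to the total sample count. This holds whenever every sample has exactly one true class and one predicted class, because each misclassified sample counts once as an FP (for its predicted class) and once as an FN (for its true class), so the FP and FN totals are equal.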
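The equality can be checked numerically with a short Python sketch. As before, `CONFUSION` is reconstructed from the numbers in the worked example (assuming rows are true classes and columns are predicted classes), and the function name is hypothetical.

```python
from fractions import Fraction

# Reconstructed from the worked example (assumption: rows = true class,
# columns = predicted class).
CONFUSION = [
    [10, 20, 30],
    [10, 30, 40],
    [10, 40, 50],
]


def micro_precision_recall(matrix):
    """Pool TP, FP and FN over all classes, then apply the binary formulas."""
    n = len(matrix)
    tp = sum(matrix[k][k] for k in range(n))
    # FP for class k: samples predicted as k whose true class is not k.
    fp = sum(matrix[i][k] for k in range(n) for i in range(n) if i != k)
    # FN for class k: samples of true class k predicted as something else.
    fn = sum(matrix[k][j] for k in range(n) for j in range(n) if j != k)
    return Fraction(tp, tp + fp), Fraction(tp, tp + fn)


micro_p, micro_r = micro_precision_recall(CONFUSION)
diagonal = sum(CONFUSION[k][k] for k in range(len(CONFUSION)))
accuracy = Fraction(diagonal, sum(sum(row) for row in CONFUSION))

print(micro_p, micro_r, accuracy)  # 3/8 3/8 3/8
```

Both pooled sums run over the same off-diagonal cells, which is why the two denominators coincide with the total sample count.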