[Research] | 101-level explanation | Confusion matrix, Accuracy, Precision, Recall and F1 score

Original article, last modified 2023-03-06 18:08:02

Licensed under CC 4.0 BY-SA

First published 2023-03-01 21:17:14

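This article explains the key metrics for evaluating the performance of a classification algorithm: the confusion matrix, precision, recall, accuracy and AUC. The confusion matrix summarises how well the model predicts each class. Precision is the proportion of positive predictions that are correct, while recall is the proportion of all actual positives that are correctly predicted. AUC is the area under the ROC curve and is used to evaluate classifiers that return a probability score. The F1 score is the harmonic mean of precision and recall, combining the two into a single number.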

Interpretation of Performance Measures

The following metrics are commonly used to assess a classification algorithm:
- Confusion Matrix
- Precision
- Recall
- Accuracy
- Area under the ROC curve (AUC)
- F1 score

1. Confusion Matrix

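A confusion matrix is a table that summarises how well the classification model predicts examples belonging to the various classes.
- One axis is the label that the model predicted,
- The other axis is the actual label.
For binary classification the table has four cells: true positives (TP, actual positive and predicted positive), false positives (FP, actual negative but predicted positive), false negatives (FN, actual positive but predicted negative) and true negatives (TN, actual negative and predicted negative).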
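As a minimal sketch of how the four cells of a binary confusion matrix (TP, FP, FN, TN) can be counted, the Python snippet below compares predicted labels with actual labels. The function name `confusion_counts` and the label lists are hypothetical, invented purely for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, FN, TN) for binary labels."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1  # predicted positive, actually positive
            else:
                fp += 1  # predicted positive, actually negative
        else:
            if actual == positive:
                fn += 1  # predicted negative, actually positive
            else:
                tn += 1  # predicted negative, actually negative
    return tp, fp, fn, tn


# Hypothetical labels, invented for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print(f"TP={tp} FP={fp} FN={fn} TN={tn}")  # TP=3 FP=1 FN=1 TN=3
```

Laid out as a table, these counts would place TP and FN in the row for actual positives and FP and TN in the row for actual negatives.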

2. Precision

Precision is the proportion of positive predictions that are actually correct, so it indicates how far a positive prediction can be trusted.
Precision for binary classification:
$\frac{TP}{TP + FP}$
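The formula can be sketched in a few lines of Python. The counts below are hypothetical, and returning 0.0 when the model makes no positive predictions is a convention assumed here, since the ratio is undefined in that case.

```python
def precision(tp, fp):
    """TP / (TP + FP), the share of positive predictions that are correct."""
    predicted_positive = tp + fp
    if predicted_positive == 0:
        return 0.0  # assumed convention: the ratio is undefined here
    return tp / predicted_positive


# Hypothetical counts: 3 true positives and 1 false positive.
p = precision(tp=3, fp=1)
print(p)  # 0.75
```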

3. Recall

Recall is the proportion of all actual positive examples that the model correctly predicts as positive, so it indicates how many positives the model misses.
Recall for binary classification:
$\frac{TP}{TP+FN}$
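A short Python sketch of this formula follows, again with hypothetical counts and an assumed convention of returning 0.0 when there are no actual positives. The second call illustrates why recall is usually read together with precision: a model that labels every example as positive misses nothing and so reaches a recall of 1.0.

```python
def recall(tp, fn):
    """TP / (TP + FN), the share of actual positives that were found."""
    actual_positive = tp + fn
    if actual_positive == 0:
        return 0.0  # assumed convention: the ratio is undefined here
    return tp / actual_positive


# Hypothetical counts: 3 positives found, 2 positives missed.
r = recall(tp=3, fn=2)
print(r)  # 0.6

# Hypothetical model that predicts "positive" for everything: no misses.
r_all_positive = recall(tp=5, fn=0)
print(r_all_positive)  # 1.0
```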

4. Accuracy

Accuracy is the proportion of all predictions, positive and negative, that are correct. Unlike precision and recall, it also takes correctly predicted negatives (TN) into account.
$\frac{TP+TN}{TP+FP+TN+FN}$
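Because accuracy counts true negatives, it can look high on imbalanced data even when no positive is found. The sketch below uses hypothetical counts (95 negatives, 5 positives, and a model that predicts negative for everything) to illustrate this.

```python
def accuracy(tp, fp, tn, fn):
    """(TP + TN) / (TP + FP + TN + FN), the share of correct predictions."""
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("accuracy is undefined without any examples")
    return (tp + tn) / total


# Hypothetical imbalanced data set: the model always predicts "negative".
tp, fp, tn, fn = 0, 0, 95, 5

acc = accuracy(tp, fp, tn, fn)
rec = tp / (tp + fn)  # recall, TP / (TP + FN)

print(acc)  # 0.95
print(rec)  # 0.0
```

In this made-up case the accuracy is 0.95 although the model finds none of the positives.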

5. Area under the ROC curve (AUC)

The ROC curve can only be used to assess classifiers that return a confidence score (or a probability) with each prediction. For example, logistic regression, neural networks and decision trees (and ensemble models based on decision trees) can be assessed using ROC curves.
The ROC curve plots the true positive rate (TPR) against the false positive rate (FPR) as the decision threshold on the score is varied. The two rates are given by:
$TPR = \frac{TP}{TP+FN}$
$FPR = \frac{FP}{FP+TN}$
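The higher the area under the ROC curve (AUC), the better the classifier. An AUC of 1.0 corresponds to a perfect ranking of positives above negatives, and an AUC of 0.5 to random guessing.
[Figure: example ROC curves and the area under them.]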
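One way to make the construction concrete is to sweep the decision threshold over the scores, record an (FPR, TPR) point at each threshold, and integrate with the trapezoidal rule. The sketch below does this in plain Python. The function names `roc_points` and `auc` and the scores are hypothetical, and a library implementation would normally be preferred in practice.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points obtained by sweeping the decision threshold."""
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative example")
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    # Use each distinct score as a threshold, from highest to lowest.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the piecewise-linear curve, using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical labels and classifier scores, invented for illustration.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

auc_value = auc(roc_points(y_true, scores))
print(round(auc_value, 3))  # 0.889
```

The result, 8/9, matches the fraction of (positive, negative) pairs in which the positive example receives the higher score: 8 of the 9 pairs here.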

6. F1 score

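The F1 score is the harmonic mean of precision and recall. It is high only when both precision and recall are high, because the harmonic mean is pulled towards the smaller of the two values.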
$F1\;score = \frac{2}{\frac{1}{precision}+\frac{1}{recall}} = \frac{2\times precision\times recall}{precision+recall}$
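To illustrate how the formula behaves when precision and recall differ widely, the sketch below compares the F1 score with the plain arithmetic mean. The precision and recall values are hypothetical, and returning 0.0 when both are zero is an assumed convention.

```python
def f1_score(precision, recall):
    """2 * precision * recall / (precision + recall)."""
    if precision + recall == 0:
        return 0.0  # assumed convention: the ratio is undefined here
    return 2 * precision * recall / (precision + recall)


# Hypothetical model: every positive prediction is right, but most positives are missed.
precision, recall = 1.0, 0.1

f1 = f1_score(precision, recall)
arithmetic_mean = (precision + recall) / 2

print(round(f1, 3))               # 0.182
print(round(arithmetic_mean, 3))  # 0.55
```

With these made-up values the F1 score stays close to the poor recall, whereas the arithmetic mean would suggest a much better model.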