- AUC
The key to computing AUC is counting, among all positive-negative sample pairs, the pairs in which the positive sample's predicted score is higher than the negative sample's. In the table below, suppose the recall model returns top-k = 4 items A, B, C, D, and among the true labels B and D are positives (label 1 means the user clicked the item, 0 means no click). Let N be the total number of samples and m the number of positives; AUC is then computed from the positives' ranks as

AUC = (sum(rank_i) - m(m+1)/2) / (m * (N - m))

where rank_i is the rank of the i-th positive counted from the bottom, i.e. the lowest-scored sample has rank 1 (equivalently, rank_i = N - position + 1 for the descending positions shown in the table):
| Sample | Label | Predicted score | Rank |
| --- | --- | --- | --- |
| A | 0 | 0.8 | 1 |
| B | 1 | 0.7 | 2 |
| C | 0 | 0.6 | 3 |
| D | 1 | 0.5 | 4 |
This can be implemented in code as follows:
```python
# AUC for a top-k recall list
def calculate_auc(recall_items: list, true_items: list):
    N = len(recall_items)
    if N == 0:
        return 0
    # hit_item = set(recall_items) & set(true_items)  # would ignore repeated clicks
    hit_item = [item for item in true_items if item in recall_items]
    m = len(hit_item)
    if m == 0 or m == N:  # no positives or no negatives: AUC is undefined
        return 0
    # rank counted from the bottom: the lowest-scored item gets rank 1
    rank_i = [N - recall_items.index(i) for i in hit_item]
    return (sum(rank_i) - (m + 1) * m / 2) / (m * (N - m))
```
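As a sanity check, AUC can also be computed directly from its pairwise definition on the table above (a minimal sketch; the scores and labels are copied from the table):

```python
# Brute-force AUC: fraction of positive-negative pairs where the
# positive item's predicted score beats the negative item's.
scores = {"A": 0.8, "B": 0.7, "C": 0.6, "D": 0.5}
labels = {"A": 0, "B": 1, "C": 0, "D": 1}
pos = [i for i in labels if labels[i] == 1]
neg = [i for i in labels if labels[i] == 0]
wins = sum(1 for p in pos for n in neg if scores[p] > scores[n])
auc = wins / (len(pos) * len(neg))
print(auc)  # 0.25
```

Only the pair (B, C) has the positive scored higher, so AUC = 1/4 = 0.25, which matches the rank formula.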
- HR
This one is straightforward: HR measures what fraction of the user's clicked items appear in the recalled top-k list.
```python
# HR for a top-k recall list
def calulate_HR(recall_items: list, true_items: list):
    N = len(recall_items)
    M = len(true_items)
    if N == 0 or M == 0:
        return 0
    hit_num = 0
    for item in true_items:
        if item in recall_items:
            hit_num += 1
    return hit_num / M
```
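On the worked example from the table, both clicked items appear in the top-4 list, so HR is 1.0:

```python
# Both clicked items (B, D) are present in the recalled top-4 list.
recall_items = ["A", "B", "C", "D"]
true_items = ["B", "D"]
hit_num = sum(1 for item in true_items if item in recall_items)
hr = hit_num / len(true_items)
print(hr)  # 1.0
```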
- Precision
Precision: of the K recalled items, n were clicked, so Precision = n / K.
```python
# Precision for a top-k recall list
def calulate_Precision(recall_items: list, true_items: list):
    N = len(recall_items)
    M = len(true_items)
    if N == 0 or M == 0:
        return 0
    hit_items = set(recall_items) & set(true_items)  # ignores repeated clicks
    return len(hit_items) / N
```
- Recall
Recall: of the M items the user clicked, k appear in the recall model's recommendation list, so Recall = k / M.
```python
# Recall for a top-k recall list
def calulate_Recall(recall_items: list, true_items: list):
    N = len(recall_items)
    M = len(true_items)
    if N == 0 or M == 0:
        return 0
    hit_items = [item for item in recall_items if item in true_items]
    return len(hit_items) / M
```
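For the same table, the user clicked M = 2 items and both are in the recalled list:

```python
recall_items = ["A", "B", "C", "D"]
true_items = ["B", "D"]               # user clicked M = 2 items
hit_items = [item for item in recall_items if item in true_items]
recall = len(hit_items) / len(true_items)
print(recall)  # 1.0
```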
- F1
F1 = 2 * Precision * Recall / (Precision + Recall)
```python
# F1 for a top-k recall list
def calulate_F1(recall_items, true_items):
    Recall = calulate_Recall(recall_items, true_items)
    Precision = calulate_Precision(recall_items, true_items)
    if Recall == 0 and Precision == 0:  # avoid division by zero
        return 0
    return 2 * Precision * Recall / (Recall + Precision)
```
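Plugging in the worked example's Precision = 0.5 and Recall = 1.0:

```python
# Harmonic mean of the Precision and Recall computed on the table.
precision, recall = 0.5, 1.0
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # 0.6667
```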

This post covered how to compute a model's AUC (Area Under the Curve), HR (Hit Rate), Precision, Recall, and F1 score. AUC evaluates the model by the ordering of positive-negative sample pairs; HR measures the share of clicked items that appear in the recalled top-k; Precision measures how many of the K recalled items are true positives; Recall measures how many of the positives the model retrieves; and the F1 score, the harmonic mean of Precision and Recall, summarizes both. The code examples show how to compute these metrics in Python.