Statistical Metrics
Cohen’s Kappa coefficient
Purpose: Cohen's kappa coefficient measures the degree of agreement between two raters who each assign the same set of items to one of two categories.
In research, some binary classification tasks need human evaluation to serve as ground truth. In that setting, two authors sample the experimental results and independently judge whether each one is classified correctly; the higher the Cohen's kappa coefficient, the more strongly the two authors agree on the results.
Formula for Cohen's kappa coefficient
k = (p_0 - p_e) / (1 - p_e)
where:

- p_0 is the relative observed agreement among the raters: the fraction of all ratings on which the two raters give the same label.
- p_e is the hypothetical probability of chance agreement, computed from the observed data via each rater's marginal label frequencies.
A worked example:
A museum has 100 items awaiting exhibition, and two curators independently classify each one: yes means exhibit, no means do not exhibit.
Their ratings are as follows:
| rater 1 \ rater 2 | Yes | No |
|---|---|---|
| Yes | 30 | 20 |
| No | 15 | 35 |
The calculation proceeds as follows:
- p_0 = (30 + 35) / 100 = 0.65 (the fraction of items on which the two curators agree)
- Rater 1 says yes on 50/100 = 0.5 of the items and rater 2 on 45/100 = 0.45, so the probability that both say yes by chance is 0.5 × 0.45 and that both say no is 0.5 × 0.55, giving p_e = 0.5 × 0.45 + 0.5 × 0.55 = 0.5
The final Cohen's kappa coefficient is
k = (0.65 - 0.5) / (1 - 0.5) = 0.3
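The calculation above can be sketched in a few lines of Python (the function name is ours; the same code works for any number of classes, not just two):

```python
def cohens_kappa(table):
    """Cohen's kappa from a square agreement table, where table[i][j]
    counts items that rater 1 put in class i and rater 2 in class j."""
    n = sum(sum(row) for row in table)
    # p_0: observed agreement -- fraction of items on the diagonal
    p0 = sum(table[i][i] for i in range(len(table))) / n
    # p_e: chance agreement -- product of the two raters' marginal
    # label frequencies, summed over all classes
    pe = sum(
        (sum(table[i]) / n) * (sum(row[i] for row in table) / n)
        for i in range(len(table))
    )
    return (p0 - pe) / (1 - pe)

# Museum example: rows = rater 1 (Yes, No), columns = rater 2 (Yes, No)
print(cohens_kappa([[30, 20], [15, 35]]))  # ≈ 0.3
```

If scikit-learn is available, `sklearn.metrics.cohen_kappa_score` computes the same statistic directly from the two raters' label vectors.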
Reference scores for Cohen's kappa coefficient
| score | interpretation |
|---|---|
| ≤ 0 | no agreement |
| (0, 0.20] | none to slight |
| (0.20, 0.40] | fair |
| (0.40, 0.60] | moderate |
| (0.60, 0.80] | substantial |
| (0.80, 1.00] | almost perfect |
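These bands can be encoded as a small lookup. A minimal sketch (the function name is ours, the labels follow the common Landis-Koch naming, and the exact boundary handling is a judgment call):

```python
import bisect

# Upper bound of each band, paired with its interpretation label.
BOUNDS = [0.0, 0.20, 0.40, 0.60, 0.80, 1.00]
LABELS = ["no agreement", "none to slight", "fair",
          "moderate", "substantial", "almost perfect"]

def interpret_kappa(k):
    """Map a kappa score (k <= 1.0) to its interpretation band."""
    # bisect_left returns the first upper bound >= k, which matches
    # the half-open intervals (lower, upper] used in the table.
    return LABELS[bisect.bisect_left(BOUNDS, k)]

print(interpret_kappa(0.3))  # fair
```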