Metrics for Comparing Clusterings

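This article introduces the Jaccard index and its use in cluster analysis, including its relationship to the Fowlkes and Mallows index and how it differs from other evaluation measures. It focuses on using the Jaccard index to assess the validity and consistency of clustering results.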
  • Jaccard Index
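    The Jaccard index (also known as the Jaccard similarity coefficient) between partitions A and B is defined as the size of the intersection divided by the size of the union of the sets of sample pairs that each partition places in the same cluster:

    $$J(A, B) = \frac{a}{a + b + c}$$

    where a, b, c and d are the entries of the mismatch matrix: a is the number of sample pairs grouped together in both A and B, b and c are the numbers of pairs grouped together in only one of the two partitions, and d is the number of pairs separated in both.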

     

  • Fowlkes and Mallows

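    Another method for comparing clusterings was proposed by Fowlkes and Mallows (1983) as an alternative to the Rand index. The Fowlkes and Mallows index can be defined as

    $$FM(A, B) = \frac{a}{\sqrt{(a + b)(a + c)}}$$

    In fact, the Wallace coefficient was derived from this index (Wallace, 1983), so the index can be rewritten as the geometric mean of the two Wallace coefficients, $W_1 = a/(a + b)$ and $W_2 = a/(a + c)$:

    $$FM(A, B) = \sqrt{W_1 \, W_2}$$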


  • Mirkin Metric
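    This coefficient is zero for identical clusterings and positive otherwise. It corresponds to the Hamming distance between the binary vector representations of the two partitions:

    $$M(A, B) = 2(b + c) = N(N - 1)\,(1 - R(A, B))$$

    where b and c are the mismatch matrix entries counting the pairs grouped together in only one of the two partitions, N is the number of samples, and R is the Rand index.
    It is therefore an alternative, rescaled form of the Rand index. However, unlike Hubert and Arabie's adjusted Rand index (Hubert, 1985), it does not correct for chance agreement. Meila (2005) also proposed a bounded version of this index:

    $$\frac{M(A, B)}{N^2}$$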
  • NMI (normalized mutual information)
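    The normalized mutual information between partitions A and B is defined as

    $$I(A, B) = \frac{-2 \sum_{i=1}^{c_A} \sum_{j=1}^{c_B} C_{ij} \log\left(\frac{C_{ij} N}{C_{i\cdot} C_{\cdot j}}\right)}{\sum_{i=1}^{c_A} C_{i\cdot} \log\left(\frac{C_{i\cdot}}{N}\right) + \sum_{j=1}^{c_B} C_{\cdot j} \log\left(\frac{C_{\cdot j}}{N}\right)}$$

    where C is the confusion matrix whose entry $C_{ij}$ is the number of nodes in group i of A that are also in group j of B, $c_A$ ($c_B$) is the number of groups in partition A (B), $C_{i\cdot}$ ($C_{\cdot j}$) is the sum of the elements of C in row i (column j),
    and N is the number of nodes.

    If A = B, then I(A, B) = 1; if A and B are completely independent, then I(A, B) = 0.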
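The NMI definition above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `nmi` is hypothetical, the formula is assumed to be the usual confusion-matrix form (twice the mutual information divided by the sum of the two partition entropies), and returning 1.0 when both partitions consist of a single group is a convention chosen here, not something the text specifies.

```python
from collections import Counter
from math import log


def nmi(labels_a, labels_b):
    """Normalized mutual information between two labelings of the same nodes."""
    n = len(labels_a)
    if n != len(labels_b):
        raise ValueError("both labelings must cover the same nodes")

    joint = Counter(zip(labels_a, labels_b))  # C_ij: nodes in group i of A and group j of B
    row = Counter(labels_a)                   # C_i. : row sums (group sizes in A)
    col = Counter(labels_b)                   # C_.j : column sums (group sizes in B)

    # Empty cells (C_ij = 0) never appear in the Counter, so they contribute nothing.
    numerator = -2.0 * sum(
        c_ij * log(c_ij * n / (row[i] * col[j]))
        for (i, j), c_ij in joint.items()
    )
    denominator = (
        sum(c * log(c / n) for c in row.values())
        + sum(c * log(c / n) for c in col.values())
    )
    if denominator == 0:
        # Assumed convention: both partitions are one single group, hence identical.
        return 1.0
    return numerator / denominator


# Same grouping under different label names, then two independent groupings.
print(nmi([0, 0, 1, 1], [1, 1, 0, 0]))
print(nmi([0, 0, 1, 1], [0, 1, 0, 1]))
```

The first call compares a partition with a relabeled copy of itself, and the second compares two partitions whose confusion matrix is uniform, so they correspond to the two limiting cases mentioned above.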

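The pair-counting indices in the list (Jaccard, Fowlkes and Mallows, Mirkin) can all be computed from the four mismatch matrix entries. The sketch below is a minimal illustration under stated assumptions, not a reference implementation: the function names are hypothetical, a counts pairs grouped together in both partitions, b pairs grouped only in A, c pairs grouped only in B, and d pairs separated in both, and the formulas used are the standard pair-counting forms J = a / (a + b + c), FM = a / sqrt((a + b)(a + c)) and M = 2(b + c).

```python
from itertools import combinations
from math import sqrt


def pair_counts(labels_a, labels_b):
    """Return the mismatch matrix entries (a, b, c, d) for two labelings."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same samples")
    a = b = c = d = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a and same_b:
            a += 1      # grouped together in both partitions
        elif same_a:
            b += 1      # grouped together in A only
        elif same_b:
            c += 1      # grouped together in B only
        else:
            d += 1      # separated in both partitions
    return a, b, c, d


def jaccard(a, b, c, d):
    # Assumed convention: if no pair is ever grouped, the partitions are identical.
    return a / (a + b + c) if (a + b + c) else 1.0


def fowlkes_mallows(a, b, c, d):
    denom = sqrt((a + b) * (a + c))
    # Assumed convention: return 0.0 when either partition has no grouped pairs.
    return a / denom if denom else 0.0


def mirkin(a, b, c, d):
    return 2 * (b + c)


labels_a = [0, 0, 0, 1, 1, 1]
labels_b = [0, 0, 1, 1, 2, 2]
counts = pair_counts(labels_a, labels_b)   # (2, 4, 1, 8)
print(counts, jaccard(*counts), fowlkes_mallows(*counts), mirkin(*counts))
```

The pairwise loop is quadratic in the number of samples, which keeps the example easy to check by hand; for large data sets the same counts are usually derived from the contingency table instead.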

(to be continued...)

Reposted from: https://www.cnblogs.com/answeryi/archive/2011/12/02/2271967.html
