Non-parametric methods for the study of correlation: Spearman's rank correlation coefficient and Kendall tau rank correlation coefficient

Source: http://www.r-bloggers.com/non-parametric-methods-for-the-study-of-the-correlation-spearmans-rank-correlation-coefficient-and-kendall-tau-rank-correlation-coefficient/

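In a previous post we saw how to use the Pearson product-moment correlation coefficient to study the correlation between variables that follow a Gaussian distribution. When the variables cannot be assumed to be Gaussian, two non-parametric methods are available: Spearman's rho test and Kendall's tau test.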
---------------------------------------------------

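For example, suppose you want to study the relationship between the productivity of various types of machinery and the satisfaction rating that operators give them in use (both scored from 1 to 10). The values are as follows: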

Productivity: 5, 7, 9, 9, 8, 6, 4, 8, 7, 7
Satisfaction: 6, 7, 4, 4, 8, 7, 3, 9, 5, 8

We start with Spearman's rank correlation coefficient:

a <- c(5, 7, 9, 9, 8, 6, 4, 8, 7, 7)
b <- c(6, 7, 4, 4, 8, 7, 3, 9, 5, 8)

cor.test(a, b, method="spearman")

        Spearman's rank correlation rho

data:  a and b 
S = 145.9805, p-value = 0.7512
alternative hypothesis: true rho is not equal to 0 
sample estimates:
      rho 
0.1152698

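The test gives rho = 0.115, which indicates a weak correlation between the two sets of values. Because the p-value (0.7512) is greater than 0.05, we cannot reject the null hypothesis H0: rho = 0.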
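
To make the estimate less of a black box, here is a minimal sketch that recomputes rho by hand. It relies on the standard definition of Spearman's rho as the Pearson correlation of the ranks, with tied values given average ranks. The variable names (ra, rb, S, rho_shortcut) are illustrative, and the formula used for S is an assumption about how the reported statistic relates to rho rather than something stated in the original post.

```r
a <- c(5, 7, 9, 9, 8, 6, 4, 8, 7, 7)
b <- c(6, 7, 4, 4, 8, 7, 3, 9, 5, 8)

# Rank each variable; rank() assigns average ranks to tied values by default
ra <- rank(a)
rb <- rank(b)

# Spearman's rho is the Pearson correlation computed on the ranks
rho <- cor(ra, rb)

# Assumed relation between the S statistic and rho
n <- length(a)
S <- (n^3 - n) * (1 - rho) / 6

# The textbook shortcut based on squared rank differences is exact only
# when there are no ties, so here it differs from rho
d <- ra - rb
rho_shortcut <- 1 - 6 * sum(d^2) / (n * (n^2 - 1))

print(c(rho = rho, S = S, rho_shortcut = rho_shortcut))
```

Both variables contain repeated values, which is why the shortcut formula does not reproduce the estimate, while the correlation of the ranks does.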

Next we run the Kendall tau rank correlation test on the same data:

a <- c(5, 7, 9, 9, 8, 6, 4, 8, 7, 7)
b <- c(6, 7, 4, 4, 8, 7, 3, 9, 5, 8)
 
cor.test(a, b, method="kendall")

        Kendall's rank correlation tau

data:  a and b 
z = 0.5555, p-value = 0.5786
alternative hypothesis: true tau is not equal to 0 
sample estimates:
     tau 
0.146385

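Here too the correlation is very low (tau = 0.146), and it is not significantly different from 0 (p-value = 0.5786 > 0.05), so we cannot reject the null hypothesis H0: tau = 0.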
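
The tau estimate can also be rebuilt from pair counts. The sketch below assumes the tie-corrected tau-b form, (C - D) / sqrt((n0 - n1)(n0 - n2)), where C and D are the numbers of concordant and discordant pairs, n0 is the total number of pairs, and n1 and n2 are the numbers of pairs tied in each variable. All names in the code are illustrative.

```r
a <- c(5, 7, 9, 9, 8, 6, 4, 8, 7, 7)
b <- c(6, 7, 4, 4, 8, 7, 3, 9, 5, 8)

n <- length(a)
idx <- combn(n, 2)  # every pair (i, j) with i < j, one pair per column

# Direction of change within each pair, for each variable
da <- sign(a[idx[1, ]] - a[idx[2, ]])
db <- sign(b[idx[1, ]] - b[idx[2, ]])

concordant <- sum(da * db > 0)  # both variables move in the same direction
discordant <- sum(da * db < 0)  # the variables move in opposite directions
ties_a <- sum(da == 0)          # pairs tied in a
ties_b <- sum(db == 0)          # pairs tied in b

# Tie-corrected tau-b
n0 <- n * (n - 1) / 2
tau_b <- (concordant - discordant) / sqrt((n0 - ties_a) * (n0 - ties_b))

print(c(concordant = concordant, discordant = discordant, tau_b = tau_b))
```

On this data the counts are 22 concordant and 16 discordant pairs out of 45, so the small positive tau reflects only a slight excess of concordant pairs.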