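The t-test and its implementation in R
Common statistical methods, part 1: t-test, rank-sum test, and analysis of variance
1 Definition
- The t-test is a hypothesis test that uses the t distribution to compute probabilities.
- So what is the t distribution? Look it up (e.g. on Baidu) if you need the details.
- The t distribution only applies when the data are (approximately) normally distributed, and the pooled two-sample test additionally assumes equal variances. So before running a t-test, we should check the data for normality and for homogeneity of variance.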
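For readers who want the definition in one place, here is the standard textbook result behind the test. It is background, not something stated in the original notes. The symbols X_i, \mu, \sigma, \bar{X}, and S are notation introduced here: if X_1, ..., X_n are independent draws from a normal distribution N(\mu, \sigma^2), then the standardized sample mean follows a t distribution with n - 1 degrees of freedom.

```latex
T = \frac{\bar{X} - \mu}{S / \sqrt{n}} \sim t_{n-1},
\qquad
\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i,
\qquad
S^2 = \frac{1}{n-1} \sum_{i=1}^{n} \left( X_i - \bar{X} \right)^2 .
```

The normality assumption in this statement is the reason a normality check comes first.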
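2 One-sample t-test
- 1) State the hypotheses: H0: μ = μ0 versus HA: μ ≠ μ0, where μ0 is the hypothesized mean.
- 2) Compute t: t = (x̄ − μ0) / (s / √n) with df = n − 1, where x̄ is the sample mean, s the sample standard deviation, and n the sample size.
- 3) Statistical inference: reject H0 at level α if the two-sided p-value is below α (equivalently, if |t| exceeds the critical value of the t distribution with n − 1 degrees of freedom).
- 4) Implementation in R
- One-sample t-test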
> data <- c(4.33,4.62,3.89,4.14,4.78,4.64,4.52,4.55,4.48,4.26)
> shapiro.test(data)
Shapiro-Wilk normality test
data: data
W = 0.95054, p-value = 0.6749
- p = 0.6749 > 0.05, so normality is not rejected and the data are treated as normally distributed.
> mean(data)
[1] 4.421
> t.test(data,mu=4.5)
One Sample t-test
data: data
t = -0.93574, df = 9, p-value = 0.3738
alternative hypothesis: true mean is not equal to 4.5
95 percent confidence interval:
4.230016 4.611984
sample estimates:
mean of x
4.421
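- p = 0.3738 > 0.05, so we cannot reject H0: the sample mean does not differ significantly from 4.5.
- A worked example comes later; this is not easy to grasp in the abstract.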
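To make the steps above concrete, here is a minimal sketch that recomputes the one-sample statistic by hand from the same data and compares it with `t.test()`. The variable names (`mu0`, `se`, `t_manual`, `p_manual`) are illustrative choices, not part of the original notes.

```r
# Data from the one-sample example above
data <- c(4.33, 4.62, 3.89, 4.14, 4.78, 4.64, 4.52, 4.55, 4.48, 4.26)
mu0  <- 4.5                                # hypothesized mean under H0

n        <- length(data)
se       <- sd(data) / sqrt(n)             # standard error of the mean
t_manual <- (mean(data) - mu0) / se        # t = (xbar - mu0) / (s / sqrt(n))
df       <- n - 1                          # degrees of freedom
p_manual <- 2 * pt(-abs(t_manual), df)     # two-sided p-value from the t distribution

# Built-in test for comparison
res <- t.test(data, mu = mu0)

# Print manual and built-in values side by side
print(c(t_manual = t_manual, t_builtin = unname(res$statistic)))
print(c(p_manual = p_manual, p_builtin = res$p.value))
```

The two rows should agree, since `t.test()` performs the same calculation internally for a one-sample test.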
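3 Unpaired two-sample t-test with equal variances
- State the hypotheses: H0: μ1 = μ2 versus HA: μ1 ≠ μ2.
- Implementation in R
- In t.test(), paired = FALSE selects the unpaired test and var.equal = TRUE assumes equal variances.
# Unpaired two-sample t-test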
high<-c(134,146,106,119,124,161,107,83,113,129,97,123)
low<-c(70,118,101,85,107,132,94)
x <- c(high,low)
group <- c(rep("high",12),rep("low",7))
> shapiro.test(high) # normality test
Shapiro-Wilk normality test
data: high
W = 0.99112, p-value = 0.9999
> shapiro.test(low) # normality test
Shapiro-Wilk normality test
data: low
W = 0.99801, p-value = 0.9999
> bartlett.test(x ~ group) # test of homogeneity of variances
Bartlett test of homogeneity of variances
data: x by group
Bartlett's K-squared = 0.0066764, df = 1,
p-value = 0.9349
# p = 0.9349 > 0.05: no evidence against equal variances
> t.test(high, low, paired = FALSE, var.equal = TRUE)
# unpaired: paired = FALSE; equal variances: var.equal = TRUE
Two Sample t-test
data: high and low
t = 1.9157, df = 17, p-value = 0.07238
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-1.942543 40.275876
sample estimates:
mean of x mean of y
120.1667 101.0000
- p-value = 0.07238 > 0.05, so we cannot reject H0: the two group means do not differ significantly at the 5% level.
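The equal-variance test pools the two sample variances into a single estimate. A minimal sketch of that calculation on the same `high` and `low` data follows. The intermediate names (`sp2`, `se`, `t_manual`) are introduced here for illustration.

```r
# Data from the unpaired example above
high <- c(134, 146, 106, 119, 124, 161, 107, 83, 113, 129, 97, 123)
low  <- c(70, 118, 101, 85, 107, 132, 94)

n1 <- length(high)
n2 <- length(low)

# Pooled variance: weighted average of the two sample variances
sp2 <- ((n1 - 1) * var(high) + (n2 - 1) * var(low)) / (n1 + n2 - 2)

se       <- sqrt(sp2 * (1 / n1 + 1 / n2))     # standard error of the difference in means
t_manual <- (mean(high) - mean(low)) / se
df       <- n1 + n2 - 2                       # pooled degrees of freedom
p_manual <- 2 * pt(-abs(t_manual), df)        # two-sided p-value

# Built-in test for comparison
res <- t.test(high, low, paired = FALSE, var.equal = TRUE)

print(c(t_manual = t_manual, t_builtin = unname(res$statistic)))
print(c(p_manual = p_manual, p_builtin = res$p.value))
```

The degrees of freedom come out as n1 + n2 − 2 = 17, matching the `df = 17` line in the output above.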
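4 Unpaired t-test with unequal variances
- State the hypotheses: H0: μ1 = μ2 versus HA: μ1 ≠ μ2.
- Implementation in R
> # Generate two normally distributed samples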
> data3 <- rnorm(100,3,5)
> data4 <- rnorm(200,3.4,8)
> ## Test of homogeneity of variances
> var.test(data3,data4)
F test to compare two variances
data: data3 and data4
F = 0.24738, num df = 99, denom df = 199, p-value = 5.306e-13
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
0.1773978 0.3519133
sample estimates:
ratio of variances
0.2473819
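- The p-value is far below 0.05, so the variances are unequal (had the p-value been > 0.05, we would not reject the null hypothesis and would treat the two variances as equal).
> # t-test
> t.test(data3, data4, var.equal = FALSE)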
Welch Two Sample t-test
data: data3 and data4
t = -2.4225, df = 298, p-value = 0.05601 # (edited by the author)
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-3.1367237 -0.3247594
sample estimates:
mean of x mean of y
2.524436 4.255178
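- Taking the p-value shown above (0.05601 > 0.05) at face value, we would fail to reject H0 and conclude that the two means do not differ significantly. Note, however, that this value was edited by hand and is inconsistent with the rest of the output: the 95% confidence interval (-3.14, -0.32) excludes 0, which corresponds to p < 0.05, i.e. a significant difference in means at the 5% level.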
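With var.equal = FALSE, t.test() runs Welch's test, which keeps the two variances separate and approximates the degrees of freedom with the Welch-Satterthwaite formula. The sketch below recomputes both quantities by hand on freshly simulated data. The seed `set.seed(1)` and the names `x`, `y`, `v1`, `v2` are assumptions made here for reproducibility, so the numbers will not match the transcript above.

```r
set.seed(1)                          # arbitrary seed (assumption) so the run is reproducible
x <- rnorm(100, mean = 3,   sd = 5)  # same distribution parameters as data3 above
y <- rnorm(200, mean = 3.4, sd = 8)  # same distribution parameters as data4 above

n1 <- length(x)
n2 <- length(y)
v1 <- var(x) / n1                    # squared standard error of mean(x)
v2 <- var(y) / n2                    # squared standard error of mean(y)

# Welch t statistic: variances are not pooled
t_manual <- (mean(x) - mean(y)) / sqrt(v1 + v2)

# Welch-Satterthwaite approximation to the degrees of freedom
df_manual <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))

# Built-in Welch test for comparison
res <- t.test(x, y, var.equal = FALSE)

print(c(t_manual = t_manual, t_builtin = unname(res$statistic)))
print(c(df_manual = df_manual, df_builtin = unname(res$parameter)))
```

The Welch degrees of freedom are generally not an integer and never exceed the pooled value n1 + n2 − 2.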
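5 Paired two-sample t-test (non-independent samples, e.g. the same subjects before and after)
- State the hypotheses: H0: μd = 0 versus HA: μd ≠ 0, where μd is the mean of the paired differences.
- Implementation in R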
> ds <- c(82.5,85.2,87.6,89.9,89.4,90.1,87.8,87.0,88.5,92.4)
> cs <- c(91.7,94.2,93.3,97.0,96.4,91.5,97.2,96.2,98.5,95.8)
> t.test(ds, cs, paired = TRUE, alternative = "two.sided", conf.level = 0.95)
Paired t-test
data: ds and cs
t = -7.8601, df = 9, p-value = 2.548e-05
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-9.1949 -5.0851
sample estimates:
mean of the differences
-7.14
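- p-value = 2.548e-05 < 0.01, so we reject H0 and accept HA.
- For unpaired tests, setting var.equal = TRUE tells t.test() to assume equal variances between the samples. A one-sided test can be requested through the alternative = argument ("less" or "greater").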
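Two points from this section can be checked directly in code. First, a paired t-test is the same as a one-sample t-test on the differences. Second, `alternative =` switches to a one-sided test. The sketch below reuses the `ds` and `cs` data from above. The names `d`, `res_paired`, `res_diff`, and `res_less` are introduced here for illustration.

```r
# Data from the paired example above
ds <- c(82.5, 85.2, 87.6, 89.9, 89.4, 90.1, 87.8, 87.0, 88.5, 92.4)
cs <- c(91.7, 94.2, 93.3, 97.0, 96.4, 91.5, 97.2, 96.2, 98.5, 95.8)

d <- ds - cs                          # within-pair differences

# Paired test and one-sample test on the differences
res_paired <- t.test(ds, cs, paired = TRUE, conf.level = 0.95)
res_diff   <- t.test(d, mu = 0)

print(c(t_paired = unname(res_paired$statistic),
        t_diff   = unname(res_diff$statistic)))

# One-sided alternative: is the mean of ds - cs less than 0?
res_less <- t.test(ds, cs, paired = TRUE, alternative = "less")

print(c(p_two_sided = res_paired$p.value,
        p_less      = res_less$p.value))
```

Because the observed t statistic is negative, the one-sided p-value for `alternative = "less"` is half the two-sided one.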