Deriving the sample variance formula: why the denominator is n-1

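Summary

When the population mean is replaced by the sample mean, dividing by $n$ systematically underestimates the variance. Replacing the denominator with $n-1$ makes the sample variance an unbiased estimator of the population variance.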

The ideal case

First, suppose the expectation $\mu$ of the random variable $X$ is known, while its variance $\sigma^{2}$ is unknown, and that we have drawn a sample $\left\{ X_{i},\ i=1,2,3,\ldots,n \right\}$ of $X$.

Under these conditions, the definition of variance gives:

$$
E\left[ \left( X_{i}-\mu \right)^{2} \right]=\sigma^{2},\quad \forall i=1,\ldots ,n
$$

By linearity of expectation, it follows that:

$$
E\left[ \frac{1}{n}\sum_{i=1}^{n}\left( X_{i}-\mu \right)^{2} \right]=\sigma^{2}
$$

Therefore $\frac{1}{n}\sum_{i=1}^{n}\left( X_{i}-\mu \right)^{2}$ is an unbiased estimator of the variance $\sigma^{2}$. In this case the denominator is still $n$.
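The claim above can be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes normally distributed data, and the helper names `known_mean_variance` and `average_estimate`, as well as the values of `n`, `trials`, `mu`, `sigma` and `seed`, are illustrative choices.

```python
import random


def known_mean_variance(sample, mu):
    """Average squared deviation from the KNOWN expectation mu (denominator n)."""
    return sum((x - mu) ** 2 for x in sample) / len(sample)


def average_estimate(n=5, trials=20000, mu=0.0, sigma=2.0, seed=0):
    """Average the estimator over many independent samples of size n."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0.0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        total += known_mean_variance(sample, mu)
    return total / trials


estimate = average_estimate()
print(estimate)  # expected to be close to sigma**2 = 4.0
```

Averaged over many samples, the estimate should settle near $\sigma^{2}$ even for a small $n$, which is what unbiasedness means here.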

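Replacing the expectation with the sample mean

Now suppose the expectation $\mu$ of the random variable $X$ is unknown, so we estimate it from the sample data:

$$
\bar{X}=\frac{1}{n}\sum_{i=1}^{n}X_{i}
$$

If we substitute $\bar{X}$ directly for $\mu$, the variance is underestimated, as the following shows: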

$$
\begin{aligned}
E\left(\frac{1}{n} \sum_{i=1}^{n}\left(X_{i}-\bar{X}\right)^{2}\right)
&=E\left(\frac{1}{n} \sum_{i=1}^{n}\left[\left(X_{i}-\mu\right)+(\mu-\bar{X})\right]^{2}\right) \\
&=E\left(\frac{1}{n} \sum_{i=1}^{n}\left(X_{i}-\mu\right)^{2}+\frac{2}{n} \sum_{i=1}^{n}\left(X_{i}-\mu\right)(\mu-\bar{X})+\frac{1}{n} \sum_{i=1}^{n}(\mu-\bar{X})^{2}\right) \\
&=E\left(\frac{1}{n} \sum_{i=1}^{n}\left(X_{i}-\mu\right)^{2}+2(\bar{X}-\mu)(\mu-\bar{X})+(\mu-\bar{X})^{2}\right) \\
&=E\left(\frac{1}{n} \sum_{i=1}^{n}\left(X_{i}-\mu\right)^{2}-(\mu-\bar{X})^{2}\right) \\
&\leq E\left(\frac{1}{n} \sum_{i=1}^{n}\left(X_{i}-\mu\right)^{2}\right)=\sigma^{2}
\end{aligned}
$$

To quantify the shortfall, consider the term $(\mu-\bar{X})^{2}$:

$$
\begin{aligned}
E\left((\mu-\bar{X})^{2}\right)&=E\left((\bar{X}-\mu)^{2}\right) \\
&=E\left(\left(\frac{1}{n} \sum_{i=1}^{n} X_{i}-\mu\right)^{2}\right) \\
&=E\left(\left(\frac{1}{n} \sum_{i=1}^{n}\left(X_{i}-\mu\right)\right)^{2}\right)
\end{aligned}
$$
For independent random variables $X_{i}$ drawn from the same distribution, with expectation $\mu$ and variance $\sigma^{2}$, the following formulas hold.

Variance formula:

$$
D(X)=E\left( X^{2} \right)-[E(X)]^{2}
$$

Expectation of the sample mean:

$$
\begin{aligned}
E(\bar{X})&=E\left( \frac{1}{n}\sum_{i=1}^{n}X_{i} \right) \\
&=\frac{1}{n}E\left( \sum_{i=1}^{n}X_{i} \right) \\
&=E\left( X_{i} \right) \\
&=\mu
\end{aligned}
$$

Variance of the sample mean (the second step uses independence, so the variance of the sum is the sum of the variances):

$$
\begin{aligned}
D(\bar{X}) &=D\left(\frac{1}{n} \sum_{i=1}^{n} X_{i}\right) \\
&=\frac{1}{n^{2}} D\left(\sum_{i=1}^{n} X_{i}\right) \\
&=\frac{1}{n} D\left(X_{i}\right)
\end{aligned}
$$
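As a rough numerical check of the two sample-mean formulas, the sketch below draws many samples, computes the mean of each, and looks at how those means are distributed. It assumes normally distributed data; the function name `sample_means` and all parameter values are illustrative assumptions, not part of the original text.

```python
import random
import statistics


def sample_means(n, trials, mu, sigma, seed):
    """Return the sample mean of `trials` independent samples of size n."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means


n, mu, sigma = 4, 1.0, 2.0
means = sample_means(n=n, trials=20000, mu=mu, sigma=sigma, seed=1)

mean_of_means = statistics.fmean(means)    # expected near mu = 1.0
var_of_means = statistics.pvariance(means)  # expected near sigma**2 / n = 1.0
print(mean_of_means, var_of_means)
```

With $\sigma^{2}=4$ and $n=4$, the spread of the sample means should be close to $\sigma^{2}/n=1$, a quarter of the spread of the individual observations.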
Therefore, writing $A=\frac{1}{n}\sum_{i=1}^{n}\left( X_{i}-\mu \right)$ and rearranging the variance formula into $E\left(A^{2}\right)=D(A)+[E(A)]^{2}$:

$$
\begin{aligned}
E\left( (\mu -\bar{X})^{2} \right)&=E\left( \left( \frac{1}{n}\sum_{i=1}^{n}\left( X_{i}-\mu \right) \right)^{2} \right)=E\left( A^{2} \right) \\
&=D\left( A \right)+[E\left( A \right)]^{2} \\
&=D\left( A \right) \qquad \text{(since } E(A)=0\text{)} \\
&=\frac{1}{n}D\left( X_{i}-\mu \right) \\
&=\frac{1}{n}D\left( X_{i} \right) \\
&=\frac{1}{n}\sigma^{2}
\end{aligned}
$$
Combining the results above:

$$
\begin{aligned}
E\left( \frac{1}{n}\sum_{i=1}^{n}\left( X_{i}-\bar{X} \right)^{2} \right)&=E\left( \frac{1}{n}\sum_{i=1}^{n}\left( X_{i}-\mu \right)^{2}-(\mu -\bar{X})^{2} \right) \\
&=E\left( \frac{1}{n}\sum_{i=1}^{n}\left( X_{i}-\mu \right)^{2} \right)-E\left( (\mu -\bar{X})^{2} \right) \\
&=\sigma^{2}-\frac{1}{n}\sigma^{2} \\
&=\frac{n-1}{n}\sigma^{2}
\end{aligned}
$$
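The factor $\frac{n-1}{n}$ can also be observed in simulation. This is a minimal sketch assuming normally distributed data; the function name `average_biased_variance` and the parameter values are illustrative. It relies on `statistics.pvariance`, which centers on the mean of the data it is given and divides by $n$.

```python
import random
import statistics


def average_biased_variance(n=5, trials=20000, mu=0.0, sigma=2.0, seed=2):
    """Average the divide-by-n variance (centered on the sample mean) over many samples."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0.0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        total += statistics.pvariance(sample)  # denominator n, uses the sample mean
    return total / trials


n, sigma = 5, 2.0
avg_biased = average_biased_variance(n=n, sigma=sigma)
expected = (n - 1) / n * sigma ** 2  # (4/5) * 4 = 3.2
print(avg_biased, expected)
```

For $n=5$ and $\sigma^{2}=4$ the average should land near $3.2$ rather than $4$, so the shortfall is a systematic bias and not sampling noise.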
To make the expectation of the estimator equal to the population variance, a correction is needed: multiply by $\frac{n}{n-1}$.

This gives the sample variance:

$$
\frac{n}{n-1}\cdot \frac{1}{n}\sum_{i=1}^{n}\left( X_{i}-\bar{X} \right)^{2}=\frac{1}{n-1}\sum_{i=1}^{n}\left( X_{i}-\bar{X} \right)^{2}
$$
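In practice the two denominators correspond to two different library functions. The sketch below uses Python's standard `statistics` module, where `pvariance` divides by $n$ and `variance` divides by $n-1$; the data values are made up for illustration.

```python
import statistics

# Illustrative data: mean is 5.0, sum of squared deviations is 32.0
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(data)

biased = statistics.pvariance(data)   # divides by n:     32 / 8
unbiased = statistics.variance(data)  # divides by n - 1: 32 / 7

# The two estimates differ exactly by the correction factor n / (n - 1)
ratio = unbiased / biased
print(biased, unbiased, ratio)
```

The ratio of the two results is $\frac{n}{n-1}=\frac{8}{7}$, which approaches 1 as $n$ grows, so the correction matters most for small samples.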
