Incremental Computation of Mean, Variance, and Covariance

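This article covers the computation of the mean, variance, and covariance. For streaming workloads, where the data can be traversed only once, it gives incremental update formulas for all three. It also describes how to merge partial results in distributed parallel computation and how to update the statistics as a sliding window moves in window-function evaluation, and it derives each of the formulas.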

1. Basic Definitions

For a sequence of values $X_n = [x_1, x_2, \dots, x_n]$, the mean is
$$\overline{X_n} = \frac{1}{n} \sum_{i=1}^n x_i$$
The variance is
$$\sigma_n^2 = \frac{1}{n}\sum_{i=1}^n (x_i-\overline{X_n})^2$$
Given a second sequence $Y_n = [y_1, y_2, \dots, y_n]$, the covariance of the two is
$$\operatorname{cov}(X_n, Y_n) = \frac{1}{n} \sum_{i=1}^n (x_i-\overline{X_n})(y_i-\overline{Y_n})$$

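2. Streaming Requirements

A streaming computation cannot retain the full data set, so these statistics must be computed in a single pass over the data. The mean is simple: one pass yields both the sum and the count $n$. The variance, computed naively, needs one pass to obtain the mean and a second pass to sum the squared deviations, so the naive algorithm takes two passes and does not meet the requirement. The same holds for the covariance. The incremental method below needs only one pass.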
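For comparison, the naive two-pass computation described above might look like the following. This is only a baseline sketch: the function name `two_pass_stats` and its return layout are chosen here for illustration, and it uses the $1/n$ definitions from Section 1.

```python
from typing import Sequence, Tuple


def two_pass_stats(
    xs: Sequence[float], ys: Sequence[float]
) -> Tuple[float, float, float, float]:
    """Return (mean_x, mean_y, var_x, cov_xy) using the 1/n definitions."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("xs and ys must be non-empty and of equal length")
    # Pass 1: the means.
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Pass 2: centered sums, which cannot start until pass 1 has finished.
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    return mean_x, mean_y, var_x, cov_xy
```

Because the second pass needs the means from the first, the whole input has to be kept around, which is exactly what a stream does not allow.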

Let $\overline{X_n}$ be the mean of $[x_1, x_2, \dots, x_n]$, and $\overline{X_{n-1}}$ the mean of the first $n-1$ values.
Define $v_n = n\sigma_n^2$, that is, $n$ times the variance of the first $n$ values. Likewise $v_{n-1} = (n-1)\sigma_{n-1}^2$ is $n-1$ times the variance of the first $n-1$ values.
Given the mean $\overline{X_{n-1}}$ and $v_{n-1}$ of the first $n-1$ values, when the $n$-th value $x_n$ arrives the new mean and variance are obtained as follows:
$$\overline{X_n}=\overline{X_{n-1}}+\frac{x_n-\overline{X_{n-1}}}{n}$$
$$v_n=v_{n-1}+(x_n-\overline{X_{n-1}})(x_n-\overline{X_n})$$
For the covariance, define $V_n = n \operatorname{cov}(X_n, Y_n)$, that is, $n$ times the covariance of the first $n$ pairs. It can be updated incrementally as
$$\begin{aligned}
V_n&=V_{n-1}+(x_n-\overline{X_n})(y_n-\overline{Y_{n-1}})\\
\text{or}\quad V_n&=V_{n-1}+(x_n-\overline{X_{n-1}})(y_n-\overline{Y_n})
\end{aligned}$$
The right-hand sides of these two expressions are equal. Derivations of all these formulas are given at the end of this article.
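The update rules above can be sketched as a small accumulator. The class name `RunningStats` and its field names are assumptions made for this illustration; the arithmetic follows the formulas for $\overline{X_n}$, $v_n$ and $V_n$ directly, using the second form of the covariance update.

```python
from dataclasses import dataclass


@dataclass
class RunningStats:
    """One-pass accumulator (hypothetical helper, not from the article)."""
    n: int = 0
    mean_x: float = 0.0
    mean_y: float = 0.0
    v_x: float = 0.0   # v_n = n * variance of x
    v_xy: float = 0.0  # V_n = n * covariance of (x, y)

    def add(self, x: float, y: float) -> None:
        self.n += 1
        dx_old = x - self.mean_x  # x_n - mean of the first n-1 x values
        dy_old = y - self.mean_y  # y_n - mean of the first n-1 y values
        self.mean_x += dx_old / self.n
        self.mean_y += dy_old / self.n
        # v_n = v_{n-1} + (x_n - Xbar_{n-1}) * (x_n - Xbar_n)
        self.v_x += dx_old * (x - self.mean_x)
        # V_n = V_{n-1} + (x_n - Xbar_{n-1}) * (y_n - Ybar_n)
        self.v_xy += dx_old * (y - self.mean_y)

    def variance_x(self) -> float:
        return self.v_x / self.n  # 1/n definition; requires n >= 1

    def covariance(self) -> float:
        return self.v_xy / self.n  # 1/n definition; requires n >= 1
```

Feeding the pairs `(2, 1)`, `(4, 3)`, `(9, 8)` one at a time leaves `v_x` and `v_xy` both at 26, which matches the two-pass sums of centered products for the same data.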

3. Aggregation Requirements

When the data set is large and has to be processed in parallel across machines, the intermediate results of several shards must be combined into the final result. Suppose there are two shards holding $n$ and $m$ pairs respectively. The first shard is $[(x_{1,1}, y_{1,1}), (x_{1,2}, y_{1,2}), (x_{1,3}, y_{1,3}), \dots, (x_{1,n}, y_{1,n})]$ and the second is $[(x_{2,1}, y_{2,1}), (x_{2,2}, y_{2,2}), (x_{2,3}, y_{2,3}), \dots, (x_{2,m}, y_{2,m})]$.
Let $\overline{X_n}$ be the mean of $[x_{1,1}, x_{1,2}, x_{1,3}, \dots, x_{1,n}]$, $\overline{Y_n}$ the mean of $[y_{1,1}, y_{1,2}, y_{1,3}, \dots, y_{1,n}]$, and $V_n$ be $n$ times $\operatorname{cov}(X_n, Y_n)$, that is
$$V_n = \sum_{i=1}^n (x_{1,i}-\overline{X_n})(y_{1,i}-\overline{Y_n})$$
Similarly, let $\overline{X_m}$ be the mean of $[x_{2,1}, x_{2,2}, x_{2,3}, \dots, x_{2,m}]$, $\overline{Y_m}$ the mean of $[y_{2,1}, y_{2,2}, y_{2,3}, \dots, y_{2,m}]$, and $V_m$ be $m$ times $\operatorname{cov}(X_m, Y_m)$. The combined $\overline{X_{n+m}}$ and $V_{n+m}$ are then given by the following ($\overline{Y_{n+m}}$ is obtained in the same way as $\overline{X_{n+m}}$):
$$\overline{X_{n+m}} = \frac{n\overline{X_n}+m\overline{X_m}}{n+m}$$
$$V_{n+m}=V_n+V_m+\frac{nm}{n+m}(\overline{X_n}-\overline{X_m})(\overline{Y_n}-\overline{Y_m})$$
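A minimal sketch of this merge step, assuming each shard reports the tuple (count, mean of x, mean of y, $V$). The names `Partial`, `summarize` and `merge` are invented for this example, and `summarize` is only a plain two-pass helper used to build per-shard inputs from non-empty lists.

```python
from typing import NamedTuple, Sequence


class Partial(NamedTuple):
    """Per-shard intermediate result (hypothetical layout)."""
    n: int
    mean_x: float
    mean_y: float
    v_xy: float  # V = n * cov(X, Y)


def summarize(xs: Sequence[float], ys: Sequence[float]) -> Partial:
    """Build a Partial from a non-empty shard with a plain two-pass scan."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    v_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return Partial(n, mean_x, mean_y, v_xy)


def merge(a: Partial, b: Partial) -> Partial:
    """Combine two shard results without touching the raw data again."""
    total = a.n + b.n
    if total == 0:
        return Partial(0, 0.0, 0.0, 0.0)
    mean_x = (a.n * a.mean_x + b.n * b.mean_x) / total
    mean_y = (a.n * a.mean_y + b.n * b.mean_y) / total
    # V_{n+m} = V_n + V_m + nm/(n+m) * (Xbar_n - Xbar_m) * (Ybar_n - Ybar_m)
    v_xy = (a.v_xy + b.v_xy
            + a.n * b.n / total * (a.mean_x - b.mean_x) * (a.mean_y - b.mean_y))
    return Partial(total, mean_x, mean_y, v_xy)
```

Since `merge` takes and returns the same shape, results from more than two shards can be folded pairwise in any order.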

4. Sliding-Window Requirements

When evaluating window functions, each slide of the window adds one new value and may also drop the oldest one. An added value is handled with the streaming update from Section 2, and a dropped value uses the same formulas solved in reverse. Since the mean, variance, and covariance do not depend on the order of the data, we may assume the value being dropped is $x_n$, which gives
$$\overline{X_{n-1}}=\overline{X_n}-\frac{x_n-\overline{X_n}}{n-1}$$
$$v_{n-1}=v_n-(x_n-\overline{X_{n-1}})(x_n-\overline{X_n})$$
$$\begin{aligned}
V_{n-1}&=V_n-(x_n-\overline{X_n})(y_n-\overline{Y_{n-1}})\\
\text{or}\quad V_{n-1}&=V_n-(x_n-\overline{X_{n-1}})(y_n-\overline{Y_n})
\end{aligned}$$
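The add and remove steps can be combined into a fixed-size window, sketched below. The class `WindowStats` and its method names are assumptions for illustration, and the window keeps its raw pairs in a `deque` so that the oldest one is available when it has to be removed.

```python
from collections import deque


class WindowStats:
    """Fixed-size sliding window over (x, y) pairs (hypothetical helper)."""

    def __init__(self, size: int) -> None:
        if size < 1:
            raise ValueError("size must be at least 1")
        self.size = size
        self.window = deque()
        self.mean_x = self.mean_y = 0.0
        self.v_x = self.v_xy = 0.0  # n * variance, n * covariance

    def _add(self, x: float, y: float) -> None:
        n = len(self.window) + 1
        dx_old = x - self.mean_x          # x_n - Xbar_{n-1}
        dy_old = y - self.mean_y          # y_n - Ybar_{n-1}
        self.mean_x += dx_old / n
        self.mean_y += dy_old / n
        self.v_x += dx_old * (x - self.mean_x)
        self.v_xy += dx_old * (y - self.mean_y)
        self.window.append((x, y))

    def _remove_oldest(self) -> None:
        x, y = self.window.popleft()
        remaining = len(self.window)      # this is n - 1
        if remaining == 0:
            self.mean_x = self.mean_y = self.v_x = self.v_xy = 0.0
            return
        dx_new = x - self.mean_x          # x_n - Xbar_n
        dy_new = y - self.mean_y          # y_n - Ybar_n
        self.mean_x -= dx_new / remaining  # now Xbar_{n-1}
        self.mean_y -= dy_new / remaining  # now Ybar_{n-1}
        # v_{n-1} = v_n - (x_n - Xbar_{n-1}) * (x_n - Xbar_n)
        self.v_x -= (x - self.mean_x) * dx_new
        # V_{n-1} = V_n - (x_n - Xbar_{n-1}) * (y_n - Ybar_n)
        self.v_xy -= (x - self.mean_x) * dy_new

    def push(self, x: float, y: float) -> None:
        """Slide the window: evict the oldest pair if full, then add."""
        if len(self.window) == self.size:
            self._remove_oldest()
        self._add(x, y)
```

One caveat that goes beyond the article: repeated subtraction in floating point can let small rounding errors build up over a long stream, so a real implementation might recompute the sums from the stored window from time to time.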

5. Derivations

5.1 Incremental Update of the Mean

$$\begin{aligned}
\overline{X_n}&=\frac{1}{n} \sum_{i=1}^n x_i=\frac{1}{n} \Big(x_n + \sum_{i=1}^{n-1} x_i\Big) =\frac{1}{n} \big(x_n + (n-1)\overline{X_{n-1}}\big)\\
&=\frac{1}{n} \big(n\overline{X_{n-1}}+x_n-\overline{X_{n-1}}\big) =\overline{X_{n-1}}+\frac{x_n-\overline{X_{n-1}}}{n}
\end{aligned}$$

5.2 Incremental Update of the Variance

$$\begin{aligned}
v_n&=\sum_{i=1}^n(x_i-\overline{X_n})^2=\sum_{i=1}^{n-1}(x_i-\overline{X_n})^2+(x_n-\overline{X_n})^2\\
&=\sum_{i=1}^{n-1}(x_i-\overline{X_{n-1}}+\overline{X_{n-1}}-\overline{X_n})^2+(x_n-\overline{X_n})^2\\
&=\sum_{i=1}^{n-1}(x_i-\overline{X_{n-1}})^2+2(\overline{X_{n-1}}-\overline{X_n})\sum_{i=1}^{n-1}(x_i-\overline{X_{n-1}})+(n-1)(\overline{X_{n-1}}-\overline{X_n})^2+(x_n-\overline{X_n})^2
\end{aligned}$$
The first term is exactly $v_{n-1}$, and the second term is 0 because
$$\sum_{i=1}^{n-1}(x_i-\overline{X_{n-1}})=\sum_{i=1}^{n-1}x_i-(n-1)\overline{X_{n-1}}=(n-1)\overline{X_{n-1}}-(n-1)\overline{X_{n-1}}=0$$
Hence
$$v_n=v_{n-1}+(n-1)(\overline{X_{n-1}}-\overline{X_n})^2+(x_n-\overline{X_n})^2$$

To simplify the increment, note that, since $(n-1)\overline{X_{n-1}} = n\overline{X_n}-x_n$,
$$(n-1)(\overline{X_{n-1}}-\overline{X_n})=(n-1)\overline{X_{n-1}}-(n-1)\overline{X_n}=n\overline{X_n}-x_n-(n-1)\overline{X_n}=\overline{X_n}-x_n$$
Therefore
$$\begin{aligned}
(n-1)(\overline{X_{n-1}}-\overline{X_n})^2+(x_n-\overline{X_n})^2
&=(n-1)(\overline{X_{n-1}}-\overline{X_n})(\overline{X_{n-1}}-\overline{X_n})+(x_n-\overline{X_n})^2\\
&=(\overline{X_n}-x_n)(\overline{X_{n-1}}-\overline{X_n})+(x_n-\overline{X_n})^2\\
&=(x_n-\overline{X_n})(\overline{X_n}-\overline{X_{n-1}})+(x_n-\overline{X_n})^2\\
&=(x_n-\overline{X_n})(x_n-\overline{X_n}+\overline{X_n}-\overline{X_{n-1}})\\
&=(x_n-\overline{X_n})(x_n-\overline{X_{n-1}})
\end{aligned}$$
and so
$$v_n=v_{n-1}+(x_n-\overline{X_{n-1}})(x_n-\overline{X_n})$$
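As a quick sanity check on the result of this derivation, here is a small worked example with the values $[2, 4, 9]$, chosen arbitrarily for illustration. The incremental updates and the direct sum of squared deviations agree:

```latex
% Data chosen for illustration: x = [2, 4, 9]
\begin{aligned}
n=1:&\quad \overline{X_1} = 2,\quad v_1 = 0\\
n=2:&\quad \overline{X_2} = 2 + \frac{4-2}{2} = 3,\quad v_2 = 0 + (4-2)(4-3) = 2\\
n=3:&\quad \overline{X_3} = 3 + \frac{9-3}{3} = 5,\quad v_3 = 2 + (9-3)(9-5) = 26\\
\text{direct:}&\quad (2-5)^2 + (4-5)^2 + (9-5)^2 = 9 + 1 + 16 = 26
\end{aligned}
```

Dividing by $n = 3$ gives the variance $\sigma_3^2 = 26/3$ under the $1/n$ definition.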

5.3 Incremental Update of the Covariance

The approach is the same as in the variance proof (the variance is in fact a special case of the covariance).
$$V_n=\sum_{i=1}^{n-1}(x_i-\overline{X_n})(y_i-\overline{Y_n})+(x_n-\overline{X_n})(y_n-\overline{Y_n})$$
Replace the means in the first term with $\overline{X_{n-1}}$ and $\overline{Y_{n-1}}$, starting with $\overline{X_n}$:
$$\begin{aligned}
\sum_{i=1}^{n-1}(x_i-\overline{X_n})(y_i-\overline{Y_n})
&=\sum_{i=1}^{n-1}(x_i-\overline{X_{n-1}}+\overline{X_{n-1}}-\overline{X_n})(y_i-\overline{Y_n})\\
&=\sum_{i=1}^{n-1}(x_i-\overline{X_{n-1}})(y_i-\overline{Y_n})+(\overline{X_{n-1}}-\overline{X_n})\sum_{i=1}^{n-1}(y_i-\overline{Y_n})\\
&=\sum_{i=1}^{n-1}(x_i-\overline{X_{n-1}})(y_i-\overline{Y_{n-1}}+\overline{Y_{n-1}}-\overline{Y_n})+(\overline{X_{n-1}}-\overline{X_n})\sum_{i=1}^{n-1}(y_i-\overline{Y_n})\\
&=\sum_{i=1}^{n-1}(x_i-\overline{X_{n-1}})(y_i-\overline{Y_{n-1}})+(\overline{Y_{n-1}}-\overline{Y_n})\sum_{i=1}^{n-1}(x_i-\overline{X_{n-1}})+(\overline{X_{n-1}}-\overline{X_n})\sum_{i=1}^{n-1}(y_i-\overline{Y_n})
\end{aligned}$$
The second term is 0 because
$$\sum_{i=1}^{n-1}(x_i-\overline{X_{n-1}})=\sum_{i=1}^{n-1}x_i-(n-1)\overline{X_{n-1}}=\sum_{i=1}^{n-1}x_i-\sum_{i=1}^{n-1}x_i=0$$
The third term simplifies to $(\overline{X_{n-1}}-\overline{X_n})(\overline{Y_n}-y_n)$, because
$$\sum_{i=1}^{n-1}(y_i-\overline{Y_n})+y_n-\overline{Y_n}=\sum_{i=1}^n y_i-n\overline{Y_n}=0$$
and therefore
$$\sum_{i=1}^{n-1}(y_i-\overline{Y_n})=\overline{Y_n}-y_n$$
It follows that
$$\begin{aligned}
\sum_{i=1}^{n-1}(x_i-\overline{X_n})(y_i-\overline{Y_n})
&=\sum_{i=1}^{n-1}(x_i-\overline{X_{n-1}})(y_i-\overline{Y_{n-1}})-(\overline{X_{n-1}}-\overline{X_n})(y_n-\overline{Y_n})\\
&=V_{n-1}+(\overline{X_n}-\overline{X_{n-1}})(y_n-\overline{Y_n})
\end{aligned}$$
and hence
$$\begin{aligned}
V_n&=V_{n-1}+(\overline{X_n}-\overline{X_{n-1}})(y_n-\overline{Y_n})+(x_n-\overline{X_n})(y_n-\overline{Y_n})\\
&=V_{n-1}+(x_n-\overline{X_n}+\overline{X_n}-\overline{X_{n-1}})(y_n-\overline{Y_n})\\
&=V_{n-1}+(x_n-\overline{X_{n-1}})(y_n-\overline{Y_n})
\end{aligned}$$

5.4 Merging Covariances

$$V_n = \sum_{i=1}^n (x_{1,i}-\overline{X_n})(y_{1,i}-\overline{Y_n})$$
$$V_m = \sum_{i=1}^m(x_{2,i}-\overline{X_m})(y_{2,i}-\overline{Y_m})$$
$$V_{n+m} = \sum_{i=1}^n(x_{1,i}-\overline{X_{n+m}})(y_{1,i}-\overline{Y_{n+m}})+\sum_{i=1}^m(x_{2,i}-\overline{X_{n+m}})(y_{2,i}-\overline{Y_{n+m}})$$
Simplify the first term of $V_{n+m}$:
$$\begin{aligned}
&\sum_{i=1}^n(x_{1,i}-\overline{X_n}+\overline{X_n}-\overline{X_{n+m}})(y_{1,i}-\overline{Y_{n+m}})\\
&=\sum_{i=1}^n(x_{1,i}-\overline{X_n})(y_{1,i}-\overline{Y_{n+m}}) + (\overline{X_n}-\overline{X_{n+m}})\sum_{i=1}^n(y_{1,i}-\overline{Y_{n+m}})\\
&=\sum_{i=1}^n(x_{1,i}-\overline{X_n})(y_{1,i}-\overline{Y_{n+m}}) + (\overline{X_n}-\overline{X_{n+m}})(n\overline{Y_n}-n\overline{Y_{n+m}})\\
&=\sum_{i=1}^n(x_{1,i}-\overline{X_n})(y_{1,i}-\overline{Y_{n+m}}) + n(\overline{X_n}-\overline{X_{n+m}})(\overline{Y_n}-\overline{Y_{n+m}})\\
&=\sum_{i=1}^n(x_{1,i}-\overline{X_n})(y_{1,i}-\overline{Y_n}+\overline{Y_n}-\overline{Y_{n+m}}) + n(\overline{X_n}-\overline{X_{n+m}})(\overline{Y_n}-\overline{Y_{n+m}})\\
&=\sum_{i=1}^n(x_{1,i}-\overline{X_n})(y_{1,i}-\overline{Y_n})+ (\overline{Y_n}-\overline{Y_{n+m}})\sum_{i=1}^n(x_{1,i}-\overline{X_n})+ n(\overline{X_n}-\overline{X_{n+m}})(\overline{Y_n}-\overline{Y_{n+m}})\\
&=\sum_{i=1}^n(x_{1,i}-\overline{X_n})(y_{1,i}-\overline{Y_n})+ n(\overline{X_n}-\overline{X_{n+m}})(\overline{Y_n}-\overline{Y_{n+m}})\\
&=V_n+n(\overline{X_n}-\overline{X_{n+m}})(\overline{Y_n}-\overline{Y_{n+m}})
\end{aligned}$$
By the same argument, the second term of $V_{n+m}$ simplifies to
$$V_m+m(\overline{X_m}-\overline{X_{n+m}})(\overline{Y_m}-\overline{Y_{n+m}})$$
Note that
$$n(\overline{X_n}-\overline{X_{n+m}}) + m(\overline{X_m}-\overline{X_{n+m}}) = n\overline{X_n}+m\overline{X_m}-(n+m)\overline{X_{n+m}}=0$$
so that
$$-n(\overline{X_n}-\overline{X_{n+m}})=m(\overline{X_m}-\overline{X_{n+m}})$$
Substituting this into the sum of the two terms gives
$$\begin{aligned}
V_{n+m}&=V_n+V_m+n(\overline{X_n}-\overline{X_{n+m}})(\overline{Y_n}-\overline{Y_{n+m}}-\overline{Y_m}+\overline{Y_{n+m}})\\
&=V_n+V_m+n(\overline{X_n}-\overline{X_{n+m}})(\overline{Y_n}-\overline{Y_m})
\end{aligned}$$
Moreover,
$$\begin{aligned}
n(\overline{X_n}-\overline{X_{n+m}})&=\frac{n}{n+m}\big((n+m)\overline{X_n}-(n+m)\overline{X_{n+m}}\big)\\
&=\frac{n}{n+m}\Big(n\overline{X_n}+m\overline{X_n}-\sum_{i=1}^n x_{1,i}-\sum_{i=1}^m x_{2,i}\Big)\\
&=\frac{n}{n+m}\Big(\sum_{i=1}^n x_{1,i}+m\overline{X_n}-\sum_{i=1}^n x_{1,i}-m\overline{X_m}\Big)\\
&=\frac{nm}{n+m}(\overline{X_n}-\overline{X_m})
\end{aligned}$$
and therefore
$$V_{n+m}=V_n+V_m+\frac{nm}{n+m}(\overline{X_n}-\overline{X_m})(\overline{Y_n}-\overline{Y_m})$$
