Single-batch computation
$$
\mu = \frac{\sum_{i=1}^{n} x_i}{n}
$$
$$
\begin{aligned}
\sigma^2 &= \frac{\sum_{i=1}^{n}(x_i - \mu)^2}{n} \\
&= \frac{\sum_{i=1}^{n} x_i^2 - 2\sum_{i=1}^{n} x_i\mu + n\mu^2}{n} \\
&= \frac{\sum_{i=1}^{n} x_i^2 - n\mu^2}{n} \\
&= \frac{\sum_{i=1}^{n} x_i^2}{n} - \mu^2
\end{aligned}
$$
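The last line of the derivation says the population variance only needs two running sums, $\sum x_i$ and $\sum x_i^2$. A minimal Python sketch of that idea follows; the function name `mean_and_variance` is a hypothetical choice for illustration, not something from the original post.

```python
def mean_and_variance(xs):
    """Population mean and variance from the sums of x and x^2.

    Hypothetical helper: implements sigma^2 = (sum of x_i^2) / n - mu^2.
    """
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one value")

    total = 0.0     # running sum of x_i
    total_sq = 0.0  # running sum of x_i^2
    for x in xs:
        total += x
        total_sq += x * x

    mu = total / n
    var = total_sq / n - mu * mu
    return mu, var


mu, var = mean_and_variance([1.0, 2.0, 3.0, 4.0])
print(mu, var)
```

One caveat worth keeping in mind: subtracting two large, nearly equal numbers can lose floating-point precision, so this form is best suited to data whose mean is not huge compared with its spread.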
Incremental computation
| Metric | Batch 1 | Batch 2 | Merged |
|---|---|---|---|
| Count | $n_1$ | $n_2$ | $n_1 + n_2$ |
| Mean | $\mu_1$ | $\mu_2$ | $\frac{n_1 \mu_1 + n_2 \mu_2}{n_1 + n_2}$ |
| Variance | $\sigma_1^2$ | $\sigma_2^2$ | ? |
| $\sum x_i^2$ | $n_1 \sigma_1^2 + n_1 \mu_1^2$ | $n_2 \sigma_2^2 + n_2 \mu_2^2$ | $n_1 \sigma_1^2 + n_1 \mu_1^2 + n_2 \sigma_2^2 + n_2 \mu_2^2$ |
$$
\begin{aligned}
\sigma^2 &= \frac{\sum_{i=1}^{n} x_i^2}{n} - \mu^2 \\
&= \frac{n_1 \sigma_1^2 + n_1 \mu_1^2 + n_2 \sigma_2^2 + n_2 \mu_2^2}{n_1 + n_2} - \left(\frac{n_1 \mu_1 + n_2 \mu_2}{n_1 + n_2}\right)^2 \\
&= \frac{(n_1 + n_2)(n_1 \sigma_1^2 + n_1 \mu_1^2 + n_2 \sigma_2^2 + n_2 \mu_2^2) - (n_1 \mu_1 + n_2 \mu_2)^2}{(n_1 + n_2)^2} \\
&= \frac{n_1 \sigma_1^2 + n_2 \sigma_2^2}{n_1 + n_2} + \frac{n_1 n_2 \mu_1^2 + n_1 n_2 \mu_2^2 - 2 n_1 n_2 \mu_1 \mu_2}{(n_1 + n_2)^2} \\
&= \frac{n_1 \sigma_1^2 + n_2 \sigma_2^2}{n_1 + n_2} + \frac{n_1 n_2 (\mu_1 - \mu_2)^2}{(n_1 + n_2)^2}
\end{aligned}
$$
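The merged variance is the count-weighted average of the two batch variances plus an extra term, and that extra term comes entirely from the shift between the batch means: it vanishes when $\mu_1 = \mu_2$.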
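The merge formula can be checked numerically. The sketch below assumes population (not sample) variance, as in the derivation; `merge_stats` is a hypothetical name, and the two small batches are made-up illustrative data. It compares the merged result against a direct computation over the concatenated data using the standard library's `statistics.fmean` and `statistics.pvariance`.

```python
import math
import statistics


def merge_stats(n1, mu1, var1, n2, mu2, var2):
    """Merge (count, mean, population variance) of two batches.

    Hypothetical helper implementing:
      mu    = (n1*mu1 + n2*mu2) / (n1 + n2)
      sigma = (n1*var1 + n2*var2) / (n1 + n2)
              + n1*n2*(mu1 - mu2)^2 / (n1 + n2)^2
    """
    n = n1 + n2
    if n == 0:
        raise ValueError("both batches are empty")
    mu = (n1 * mu1 + n2 * mu2) / n
    var = (n1 * var1 + n2 * var2) / n + n1 * n2 * (mu1 - mu2) ** 2 / n ** 2
    return n, mu, var


# Illustrative data only.
a = [1.0, 2.0, 3.0]
b = [10.0, 12.0]

merged = merge_stats(
    len(a), statistics.fmean(a), statistics.pvariance(a),
    len(b), statistics.fmean(b), statistics.pvariance(b),
)
direct = (len(a + b), statistics.fmean(a + b), statistics.pvariance(a + b))

# The two results should agree up to floating-point rounding.
print(merged)
print(direct)
print(all(math.isclose(m, d) for m, d in zip(merged, direct)))
```

Because only the count, mean, and variance of each batch are needed, the raw data of earlier batches does not have to be kept around; each new batch can be folded into the running statistics with one call.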