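Section 5.4.4 of the "flower book" (花书, Deep Learning by Goodfellow et al.) states that the mean squared error (MSE) measures the overall expected deviation, in the squared-error sense, between the estimator $\hat{\theta}$ and the true parameter $\theta$, and that it incorporates both the bias and the variance. The book does not prove this, so the derivation is shown below.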
$$
\begin{aligned}
\mathrm{MSE} &= \mathbb{E}\big[(\hat{\theta} - \theta)^2\big] \\
&= \mathbb{E}\Big[\Big( \big(\hat{\theta} - \mathbb{E}(\hat{\theta})\big) + \big(\mathbb{E}(\hat{\theta}) - \theta\big) \Big)^2\Big] \\
&= \mathbb{E}\big[\big(\hat{\theta} - \mathbb{E}(\hat{\theta})\big)^2\big] + \mathbb{E}\big[\big(\mathbb{E}(\hat{\theta}) - \theta\big)^2\big] + 2\,\mathbb{E}\big[\big(\hat{\theta} - \mathbb{E}(\hat{\theta})\big)\big(\mathbb{E}(\hat{\theta}) - \theta\big)\big] \\
&= \mathbb{E}\big[\big(\hat{\theta} - \mathbb{E}(\hat{\theta})\big)^2\big] + \big(\mathbb{E}(\hat{\theta}) - \theta\big)^2 + 2\big(\mathbb{E}(\hat{\theta}) - \mathbb{E}(\hat{\theta})\big)\big(\mathbb{E}(\hat{\theta}) - \theta\big) \\
&= \operatorname{Var}(\hat{\theta}) + \operatorname{Bias}(\hat{\theta})^2
\end{aligned}
$$
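Note that $\theta$ and $\mathbb{E}(\hat{\theta})$ are deterministic quantities (even though in practice they have to be estimated from samples), so $\mathbb{E}(\theta) = \theta$ and $\mathbb{E}[\mathbb{E}(\hat{\theta})] = \mathbb{E}(\hat{\theta})$. This is why the constant factor $\mathbb{E}(\hat{\theta}) - \theta$ can be pulled out of the expectation in the cross term, which then vanishes because $\mathbb{E}[\hat{\theta} - \mathbb{E}(\hat{\theta})] = \mathbb{E}(\hat{\theta}) - \mathbb{E}(\hat{\theta}) = 0$.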
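The decomposition can also be checked numerically. The sketch below is only an illustration under assumptions that are not in the original post: the data are drawn from a Gaussian $N(\theta, \sigma^2)$, and the estimator is a deliberately biased "shrunken" sample mean. The names `simulate_estimates`, `decompose`, and the parameter `shrink` are hypothetical. Because the variance is computed with the population formula (dividing by the number of trials), the identity holds for the empirical moments up to floating-point rounding.

```python
import random


def simulate_estimates(theta=2.0, sigma=1.0, n=10, shrink=0.8,
                       trials=20000, seed=0):
    """Draw `trials` datasets of size n from N(theta, sigma^2) and return
    one estimate per dataset.

    The estimator (an assumption made for illustration) is the sample mean
    multiplied by `shrink`, which makes it biased whenever shrink != 1.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(trials):
        sample = [rng.gauss(theta, sigma) for _ in range(n)]
        estimates.append(shrink * sum(sample) / n)
    return estimates


def decompose(estimates, theta):
    """Return (mse, variance, bias) from empirical moments of the estimates."""
    m = len(estimates)
    mean_est = sum(estimates) / m                             # approximates E(theta_hat)
    mse = sum((e - theta) ** 2 for e in estimates) / m        # E[(theta_hat - theta)^2]
    variance = sum((e - mean_est) ** 2 for e in estimates) / m  # Var(theta_hat)
    bias = mean_est - theta                                   # Bias(theta_hat)
    return mse, variance, bias


theta = 2.0
estimates = simulate_estimates(theta=theta)
mse, variance, bias = decompose(estimates, theta)

print(f"MSE          = {mse:.6f}")
print(f"Var + Bias^2 = {variance + bias ** 2:.6f}")
```

Under these assumptions the theoretical values are $\operatorname{Bias} = 0.8 \cdot 2 - 2 = -0.4$ and $\operatorname{Var} = 0.8^2 \cdot 1/10 = 0.064$, so the simulated quantities should land close to them, while the two printed numbers should agree with each other up to rounding.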
Reference: Understanding the Bias-Variance Tradeoff