Suppose we have a random variable $X$ and an observed variable $Y$. The goal of linear minimum mean square error (LMMSE) estimation is to estimate $X$ from the observation $Y$: that is, to find a linear estimator $\hat{X} = aY + b$ that minimizes the mean squared error between the estimate $\hat{X}$ and the true value $X$.
Objective Function
First, define the mean squared error (MSE) as:
$$ MSE = E[(X - \hat{X})^2] = E[(X - (aY + b))^2] $$
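Before deriving the optimum analytically, this objective can be probed numerically. Below is a minimal sketch (the joint model of $X$ and $Y$ and all parameter values are illustrative assumptions, not from the original) that estimates the MSE of an arbitrary candidate $(a, b)$ by Monte Carlo:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical joint model for illustration only: Y = 0.5*X + noise.
n = 100_000
x = rng.normal(0.0, 2.0, n)
y = 0.5 * x + rng.normal(0.0, 1.0, n)

def empirical_mse(a: float, b: float) -> float:
    """Monte Carlo estimate of E[(X - (aY + b))^2] for a candidate (a, b)."""
    return float(np.mean((x - (a * y + b)) ** 2))

print(empirical_mse(1.0, 0.0))  # an arbitrary, non-optimal candidate
```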
Solving for the Optimal Parameters
To minimize the MSE, we take the partial derivatives with respect to $a$ and $b$ and set them to zero.
Partial derivative with respect to $a$
$$ \frac{\partial MSE}{\partial a} = 2E[(X - aY - b)(-Y)] = -2E[(X - aY - b)Y] $$
Setting $\frac{\partial MSE}{\partial a} = 0$ gives:
$$ E[(X - aY - b)Y] = 0 $$
This is the orthogonality principle: at the optimum, the estimation error $X - \hat{X}$ is uncorrelated with the observation $Y$.
Expanding and rearranging:
$$ E[XY] - aE[Y^2] - bE[Y] = 0 $$
Partial derivative with respect to $b$
$$ \frac{\partial MSE}{\partial b} = 2E[(X - aY - b)(-1)] = -2E[X - aY - b] $$
Setting $\frac{\partial MSE}{\partial b} = 0$ gives:
$$ E[X - aY - b] = 0 $$
In other words, the optimal linear estimator is unbiased: $E[\hat{X}] = E[X]$.
Expanding and rearranging:
$$ E[X] - aE[Y] - b = 0 $$
Solving the System of Equations
We now have two equations:
1. $E[XY] - aE[Y^2] - bE[Y] = 0$
2. $E[X] - aE[Y] - b = 0$
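As a restatement (a compact equivalent form, not part of the original derivation), these are the normal equations, which can be written as a $2 \times 2$ linear system in $(a, b)$:
$$ \begin{pmatrix} E[Y^2] & E[Y] \\ E[Y] & 1 \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} E[XY] \\ E[X] \end{pmatrix} $$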
From the second equation, solve for $b$:
$$ b = E[X] - aE[Y] $$
Substituting $b$ into the first equation:
$$ E[XY] - aE[Y^2] - (E[X] - aE[Y])E[Y] = 0 $$
Simplifying:
$$ E[XY] - aE[Y^2] - E[X]E[Y] + a(E[Y])^2 = 0 $$
Rearranging further:
$$ E[XY] - E[X]E[Y] = a\left(E[Y^2] - (E[Y])^2\right) $$
Note that:
$$ \text{Cov}(X, Y) = E[XY] - E[X]E[Y] $$
$$ \text{Var}(Y) = E[Y^2] - (E[Y])^2 $$
Therefore:
$$ \text{Cov}(X, Y) = a \cdot \text{Var}(Y) $$
Solving for $a$:
$$ a = \frac{\text{Cov}(X, Y)}{\text{Var}(Y)} $$
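Equivalently, using the standard identities $\text{Cov}(X, Y) = \rho_{XY}\,\sigma_X \sigma_Y$ and $\text{Var}(Y) = \sigma_Y^2$ (added here for context; the symbols $\rho_{XY}$, $\sigma_X$, $\sigma_Y$ do not appear in the original derivation), the slope can be written in terms of the correlation coefficient:
$$ a = \rho_{XY}\,\frac{\sigma_X}{\sigma_Y} $$
so the weaker the correlation between $X$ and $Y$, the less weight the estimator places on the observation.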
Substituting $a$ back into the expression for $b$:
$$ b = E[X] - aE[Y] = E[X] - \frac{\text{Cov}(X, Y)}{\text{Var}(Y)}\,E[Y] $$
Final Result
In summary, the optimal parameters of the linear minimum mean square error estimator are:
$$ a = \frac{\text{Cov}(X, Y)}{\text{Var}(Y)} $$
$$ b = E[X] - aE[Y] $$
Therefore, the LMMSE estimate is:
$$ \hat{X} = aY + b = \frac{\text{Cov}(X, Y)}{\text{Var}(Y)}\,Y + \left(E[X] - \frac{\text{Cov}(X, Y)}{\text{Var}(Y)}\,E[Y]\right) $$
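As a sanity check, here is a short numerical sketch. It draws samples from a hypothetical joint model (the model and its parameters are illustrative assumptions, not part of the derivation), plugs the sample moments into the formulas above, and cross-checks against an ordinary least-squares fit, which should agree on large samples:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical joint model for illustration only: Y is a noisy observation of X.
n = 200_000
x = rng.normal(2.0, 1.5, n)
y = x + rng.normal(0.0, 1.0, n)

# Sample-moment versions of the derived LMMSE coefficients.
a = np.cov(x, y, bias=True)[0, 1] / np.var(y)   # Cov(X, Y) / Var(Y)
b = x.mean() - a * y.mean()                      # E[X] - a E[Y]

# Cross-check: least-squares regression of x on y gives the same line.
a_ls, b_ls = np.polyfit(y, x, deg=1)

x_hat = a * y + b
print(f"moment formulas: a = {a:.4f}, b = {b:.4f}")
print(f"least squares  : a = {a_ls:.4f}, b = {b_ls:.4f}")
print(f"empirical MSE  : {np.mean((x - x_hat) ** 2):.4f}")
```

Under this particular model, $\text{Cov}(X, Y) = \text{Var}(X) = 1.5^2 = 2.25$ and $\text{Var}(Y) = 2.25 + 1 = 3.25$, so both fits should recover a slope near $2.25 / 3.25 \approx 0.69$.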