$X$: the parameters to be estimated;
$Y$: the known observations.
Maximum a posteriori (MAP) estimation:
$$
\hat{X} = \arg\max_{X} \ \ln p(X \mid Y).
$$
According to Bayes' theorem,
$$
\hat{X} = \arg\max_{X} \ \{ \ln p(Y \mid X) + \ln p(X) \}.
$$
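The intermediate step can be written out as follows. This is a short sketch of the standard argument: the evidence $p(Y)$ does not depend on $X$, so its log can be dropped from the argmax.

```latex
\begin{aligned}
p(X \mid Y) &= \frac{p(Y \mid X)\, p(X)}{p(Y)}, \\
\ln p(X \mid Y) &= \ln p(Y \mid X) + \ln p(X) - \ln p(Y), \\
\hat{X} &= \arg\max_{X} \ \{ \ln p(Y \mid X) + \ln p(X) \}.
\end{aligned}
```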
When we assume the likelihood $p(Y \mid X)$ is Gaussian with standard deviation $\sigma$, we have
$$
\ln p(Y \mid X) = -\ln \sigma - \frac{1}{2} \ln 2\pi - \frac{1}{2} \frac{\|Y - \hat{Y}\|^2}{\sigma^2}.
$$
where $\hat{Y}$ is the predicted value (if the task is denoising, $\hat{Y} = X$; if the task is to optimize the parameters $X$ of a neural network, $\hat{Y} = X^{T}Y$, etc.).
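As a rough illustration of the denoising case $\hat{Y} = X$, the sketch below evaluates the Gaussian log-likelihood above for a scalar observation and maximizes $\ln p(Y \mid X) + \ln p(X)$. The prior is not specified in the text; a zero-mean Gaussian prior with standard deviation `tau` is assumed here purely for illustration, and all function names are hypothetical. Under that assumption, setting the derivative to zero gives $\hat{X} = Y \tau^2 / (\sigma^2 + \tau^2)$, which the code cross-checks with a brute-force grid search.

```python
import math


def gaussian_log_likelihood(y, y_hat, sigma):
    """ln p(Y|X) for a scalar observation y, prediction y_hat, noise std sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (-math.log(sigma)
            - 0.5 * math.log(2 * math.pi)
            - 0.5 * (y - y_hat) ** 2 / sigma ** 2)


def log_posterior(x, y, sigma, tau):
    """ln p(Y|X) + ln p(X) for denoising (Y_hat = X).

    ASSUMPTION: zero-mean Gaussian prior on X with std tau (not from the text).
    The prior has the same functional form as the likelihood, so the same
    helper is reused with observation 0 and prediction x.
    """
    log_lik = gaussian_log_likelihood(y, x, sigma)
    log_prior = gaussian_log_likelihood(0.0, x, tau)
    return log_lik + log_prior


def map_estimate_closed_form(y, sigma, tau):
    """Closed-form maximizer of log_posterior under the assumed Gaussian prior."""
    return y * tau ** 2 / (sigma ** 2 + tau ** 2)


def map_estimate_grid(y, sigma, tau, lo=-5.0, hi=5.0, steps=10001):
    """Brute-force argmax of log_posterior over an evenly spaced grid."""
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return max(grid, key=lambda x: log_posterior(x, y, sigma, tau))


# With y = 2 and sigma = tau = 1 the closed form gives 2 * 1 / (1 + 1) = 1.0.
y, sigma, tau = 2.0, 1.0, 1.0
x_closed = map_estimate_closed_form(y, sigma, tau)
x_grid = map_estimate_grid(y, sigma, tau)
assert abs(x_closed - x_grid) < 2e-3
```

With a flat prior ($\tau \to \infty$) the prior term vanishes and the estimate reduces to $\hat{X} = Y$, i.e. plain least squares; a finite `tau` shrinks the estimate toward zero.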