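If

$$
\begin{aligned}
y &\sim \mathrm{Bernoulli}(\phi) \\
x \mid y = 1 &\sim \mathcal{N}(\mu_0, \Sigma) \\
x \mid y = 0 &\sim \mathcal{N}(\mu_1, \Sigma)
\end{aligned}
$$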
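then

$$
p(y) = \phi^{y}(1-\phi)^{1-y}
$$

$$
p(x \mid y=1) = \frac{1}{(2\pi)^{n/2}|\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x-\mu_0)^T \Sigma^{-1} (x-\mu_0)\right) = \left(\frac{1}{(2\pi)^{n/2}|\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x-\mu_0)^T \Sigma^{-1} (x-\mu_0)\right)\right)^{y}
$$

$$
p(x \mid y=0) = \frac{1}{(2\pi)^{n/2}|\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x-\mu_1)^T \Sigma^{-1} (x-\mu_1)\right) = \left(\frac{1}{(2\pi)^{n/2}|\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x-\mu_1)^T \Sigma^{-1} (x-\mu_1)\right)\right)^{1-y}
$$

The log-likelihood of $m$ training examples is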
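$$
\begin{aligned}
l(\phi,\mu_0,\mu_1,\Sigma)
&= \log \prod_{i=1}^{m} p(x^{(i)}, y^{(i)}; \phi, \mu_0, \mu_1, \Sigma) \\
&= \log \prod_{i=1}^{m} p(x^{(i)} \mid y^{(i)}; \mu_0, \mu_1, \Sigma)\, p(y^{(i)}; \phi) \\
&= \log \prod_{i=1}^{m} \left(\frac{1}{(2\pi)^{n/2}|\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x^{(i)}-\mu_0)^T \Sigma^{-1} (x^{(i)}-\mu_0)\right)\phi\right)^{y^{(i)}} \left(\frac{1}{(2\pi)^{n/2}|\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x^{(i)}-\mu_1)^T \Sigma^{-1} (x^{(i)}-\mu_1)\right)(1-\phi)\right)^{1-y^{(i)}} \\
&= \sum_{i=1}^{m} \left[ y^{(i)} \log\left(\frac{1}{(2\pi)^{n/2}|\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x^{(i)}-\mu_0)^T \Sigma^{-1} (x^{(i)}-\mu_0)\right)\phi\right) + (1-y^{(i)}) \log\left(\frac{1}{(2\pi)^{n/2}|\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x^{(i)}-\mu_1)^T \Sigma^{-1} (x^{(i)}-\mu_1)\right)(1-\phi)\right) \right] \\
&= \sum_{i=1}^{m} \left[ y^{(i)} \left(-\frac{1}{2}(x^{(i)}-\mu_0)^T \Sigma^{-1} (x^{(i)}-\mu_0) + \log\frac{1}{(2\pi)^{n/2}|\Sigma|^{1/2}} + \log\phi\right) + (1-y^{(i)}) \left(-\frac{1}{2}(x^{(i)}-\mu_1)^T \Sigma^{-1} (x^{(i)}-\mu_1) + \log\frac{1}{(2\pi)^{n/2}|\Sigma|^{1/2}} + \log(1-\phi)\right) \right] \\
&= \sum_{i=1}^{m} \left[ \log\frac{1}{(2\pi)^{n/2}|\Sigma|^{1/2}} + \log(1-\phi) + y^{(i)}\bigl(\log\phi - \log(1-\phi)\bigr) + y^{(i)}\left(-\frac{1}{2}(x^{(i)}-\mu_0)^T \Sigma^{-1} (x^{(i)}-\mu_0)\right) + (1-y^{(i)})\left(-\frac{1}{2}(x^{(i)}-\mu_1)^T \Sigma^{-1} (x^{(i)}-\mu_1)\right) \right]
\end{aligned}
$$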
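The last line of the log-likelihood maps almost term by term onto array code. The following is a minimal NumPy sketch, not a reference implementation; the function name `gda_log_likelihood` and the toy data are assumptions made for illustration. It keeps the convention used here, in which $\mu_0$ belongs to the class $y=1$ and $\mu_1$ to the class $y=0$.

```python
import numpy as np

def gda_log_likelihood(X, y, phi, mu0, mu1, Sigma):
    """Joint log-likelihood l(phi, mu0, mu1, Sigma).

    Convention taken from the text: x | y=1 ~ N(mu0, Sigma), x | y=0 ~ N(mu1, Sigma).
    X has shape (m, n); y has shape (m,) with entries in {0, 1}.
    """
    m, n = X.shape
    Sigma_inv = np.linalg.inv(Sigma)
    _, logdet = np.linalg.slogdet(Sigma)
    # log( 1 / ((2*pi)^(n/2) * |Sigma|^(1/2)) )
    log_norm = -0.5 * n * np.log(2 * np.pi) - 0.5 * logdet
    d0 = X - mu0
    d1 = X - mu1
    # Per-example quadratic forms -1/2 (x - mu)^T Sigma^{-1} (x - mu)
    q0 = -0.5 * np.einsum("ij,jk,ik->i", d0, Sigma_inv, d0)
    q1 = -0.5 * np.einsum("ij,jk,ik->i", d1, Sigma_inv, d1)
    terms = (log_norm + np.log(1 - phi)
             + y * (np.log(phi) - np.log(1 - phi))
             + y * q0 + (1 - y) * q1)
    return np.sum(terms)

# Toy 1-D data (hypothetical): each point sits exactly on its class mean.
X = np.array([[0.0], [1.0]])
y = np.array([1, 0])
value = gda_log_likelihood(X, y, phi=0.5,
                           mu0=np.array([0.0]), mu1=np.array([1.0]),
                           Sigma=np.array([[1.0]]))
print(value)
```

In this toy case both quadratic forms vanish, so the result reduces to $2\left(-\tfrac{1}{2}\log 2\pi + \log\tfrac{1}{2}\right) = -\log 8\pi$, which gives a quick sanity check on the formula.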
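Differentiating with respect to $\phi$ and setting the derivative to zero:

$$
\begin{aligned}
\frac{\partial l}{\partial \phi}
&= \sum_{i=1}^{m} \left[ -\frac{1}{1-\phi} + y^{(i)} \frac{1}{\phi(1-\phi)} \right] \\
&= \sum_{i=1}^{m} \frac{1}{1-\phi} \left( -1 + \frac{y^{(i)}}{\phi} \right) \\
&= \frac{1}{1-\phi} \left( -m + \frac{\sum_{i=1}^{m} y^{(i)}}{\phi} \right) = 0
\end{aligned}
$$

$$
\Rightarrow \quad \phi = \frac{\sum_{i=1}^{m} y^{(i)}}{m}
$$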
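The closed form for $\phi$ is simply the fraction of examples with $y^{(i)}=1$. As a rough numerical check, the sketch below evaluates only the $\phi$-dependent terms of the log-likelihood on a grid and compares the maximizer with the sample mean. The label vector and the helper names `phi_terms` and `score` are made up for illustration.

```python
import numpy as np

def phi_terms(phi, y):
    # phi-dependent part of l: sum_i [ log(1 - phi) + y_i (log phi - log(1 - phi)) ]
    return np.sum(np.log(1 - phi) + y * (np.log(phi) - np.log(1 - phi)))

def score(phi, y):
    # Derivative from the text: (1 / (1 - phi)) * (-m + sum_i y_i / phi)
    return (1.0 / (1.0 - phi)) * (-y.size + y.sum() / phi)

# Hypothetical labels: 6 ones out of 10 examples.
y = np.array([1, 0, 1, 1, 0, 1, 0, 1, 1, 0])
phi_hat = y.mean()                      # closed-form estimate

grid = np.linspace(0.01, 0.99, 99)      # candidate values 0.01, 0.02, ..., 0.99
values = np.array([phi_terms(p, y) for p in grid])
best = grid[np.argmax(values)]          # grid point with the largest log-likelihood

print(phi_hat, best, score(phi_hat, y))
```

The grid maximizer coincides with `phi_hat`, and the derivative evaluated there is zero up to floating-point rounding.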
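First, we prove a lemma. Let

$$
d = (A-B)^T C (A-B)
$$

Since $d$ is a scalar, $d = \mathrm{tr}(d)$, and the term $A^T C A$ does not depend on $B$, so

$$
\nabla_B d = \nabla_B \mathrm{tr}(d) = \nabla_B \mathrm{tr}\left(-A^T C B - B^T C A + B^T C B\right) = -C^T A - C A + C B + C^T B
$$

If $C = C^T$, this reduces to

$$
\nabla_B d = 2CB - 2CA
$$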
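The gradient identity can be checked numerically with central finite differences. The sketch below assumes $A$ and $B$ are vectors (so that $d$ is a scalar) and uses random test values; the helper names `d` and `numerical_grad` are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.normal(size=n)
B = rng.normal(size=n)
C = rng.normal(size=(n, n))   # general, not necessarily symmetric

def d(B_vec, C_mat):
    # d = (A - B)^T C (A - B)
    diff = A - B_vec
    return diff @ C_mat @ diff

def numerical_grad(f, B_vec, eps=1e-6):
    # Central finite differences, one coordinate of B at a time
    grad = np.zeros_like(B_vec)
    for k in range(B_vec.size):
        step = np.zeros_like(B_vec)
        step[k] = eps
        grad[k] = (f(B_vec + step) - f(B_vec - step)) / (2 * eps)
    return grad

# General case: -C^T A - C A + C B + C^T B
general = -C.T @ A - C @ A + C @ B + C.T @ B
num_general = numerical_grad(lambda b: d(b, C), B)

# Symmetric case: 2 C B - 2 C A
C_sym = C + C.T
symmetric = 2 * C_sym @ B - 2 * C_sym @ A
num_sym = numerical_grad(lambda b: d(b, C_sym), B)

print(np.max(np.abs(general - num_general)))
print(np.max(np.abs(symmetric - num_sym)))
```

Both printed differences should be tiny, since $d$ is quadratic in $B$ and central differences introduce only rounding error for quadratics.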