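Let $x$ be the observable data, $z$ the unobserved (latent) data, and $\theta$ the model parameters. By the definition of conditional probability,
$$\frac{p(x,z|\theta)}{p(x|\theta)} = p(z|x,\theta)$$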
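Take the log of both sides:
$$
\begin{aligned}
\Rightarrow\quad & \log p(x,z|\theta) - \log p(x|\theta) = \log p(z|x,\theta) \\
& \log p(x|\theta) = \log p(x,z|\theta) - \log p(z|x,\theta)
\end{aligned}
$$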
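Take the conditional expectation with respect to $z|\theta',x$ on both sides. The left-hand side does not depend on $z$, so its expectation is itself:
$$
\begin{aligned}
\Rightarrow\quad & \varepsilon[\log p(x|\theta)|\theta',x] = \varepsilon[\log p(x,z|\theta)|\theta',x] - \varepsilon[\log p(z|x,\theta)|\theta',x] \\
& \log p(x|\theta) = \varepsilon[\log p(x,z|\theta)|\theta',x] - \varepsilon[\log p(z|x,\theta)|\theta',x]
\end{aligned}
$$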
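Choose
$$
\begin{aligned}
\Rightarrow\quad & \theta^{(i+1)} = \arg\max_{\theta}\ \varepsilon[\log p(x,z|\theta)|\theta^{(i)},x] \\
& \theta^{(i+1)} = \arg\max_{\theta} \sum_{z} p(z|\theta^{(i)},x)\log p(x,z|\theta)
\end{aligned}
$$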
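To make the update concrete, here is a minimal sketch in Python. It assumes a two-component Gaussian mixture with unit variances, where $z$ is the component label and $\theta$ consists of the mixing weights and the two means. The model, the synthetic data, and the names `log_likelihood`, `em_step` and `run_em` are illustrative assumptions, not part of the derivation. For this model the argmax has a closed form, and the sketch records $\log p(x|\theta^{(i)})$ after every update so its behaviour over iterations can be inspected.

```python
import math
import random

def log_likelihood(xs, pi, mu):
    """log p(x | theta) for a two-component, unit-variance Gaussian mixture."""
    total = 0.0
    for x in xs:
        total += math.log(sum(
            pi[k] * math.exp(-0.5 * (x - mu[k]) ** 2) / math.sqrt(2 * math.pi)
            for k in range(2)))
    return total

def em_step(xs, pi, mu):
    """One update theta^(i) -> theta^(i+1)."""
    # E-step: responsibilities resp[n][k] = p(z = k | theta^(i), x_n).
    resp = []
    for x in xs:
        w = [pi[k] * math.exp(-0.5 * (x - mu[k]) ** 2) for k in range(2)]
        s = sum(w)
        resp.append([wk / s for wk in w])
    # M-step: closed-form maximiser of
    # sum_z p(z | theta^(i), x) log p(x, z | theta) for this assumed model.
    nk = [sum(r[k] for r in resp) for k in range(2)]
    new_pi = [nk[k] / len(xs) for k in range(2)]
    new_mu = [sum(r[k] * x for r, x in zip(resp, xs)) / nk[k] for k in range(2)]
    return new_pi, new_mu

def run_em(xs, pi, mu, n_iter=30):
    """Iterate em_step and record log p(x | theta^(i)) after each update."""
    history = [log_likelihood(xs, pi, mu)]
    for _ in range(n_iter):
        pi, mu = em_step(xs, pi, mu)
        history.append(log_likelihood(xs, pi, mu))
    return pi, mu, history

# Synthetic data (an assumption for illustration): two well-separated clusters.
random.seed(0)
xs = ([random.gauss(-2.0, 1.0) for _ in range(200)]
      + [random.gauss(3.0, 1.0) for _ in range(300)])
pi, mu, history = run_em(xs, [0.5, 0.5], [-1.0, 1.0])
print(history[0], history[-1])
```

The recorded `history` should never decrease from one iteration to the next, up to floating-point rounding.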
We now prove that $p(x|\theta^{(i)})$ is non-decreasing in $i$, i.e.,
$$p(x|\theta^{(i+1)}) \ge p(x|\theta^{(i)})$$
- Because of the choice of $\theta^{(i+1)}$, we have
$$\varepsilon[\log p(x,z|\theta^{(i+1)})|\theta^{(i)},x] \ge \varepsilon[\log p(x,z|\theta^{(i)})|\theta^{(i)},x]$$
- We only need to show that
$$\varepsilon[\log p(z|x,\theta^{(i+1)})|\theta^{(i)},x] \le \varepsilon[\log p(z|x,\theta^{(i)})|\theta^{(i)},x]$$
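As a sketch of why these two facts suffice: evaluate the identity $\log p(x|\theta) = \varepsilon[\log p(x,z|\theta)|\theta',x] - \varepsilon[\log p(z|x,\theta)|\theta',x]$ at $\theta' = \theta^{(i)}$, once with $\theta = \theta^{(i+1)}$ and once with $\theta = \theta^{(i)}$, and subtract:

$$
\begin{aligned}
\log p(x|\theta^{(i+1)}) - \log p(x|\theta^{(i)})
={} & \underbrace{\varepsilon[\log p(x,z|\theta^{(i+1)})|\theta^{(i)},x] - \varepsilon[\log p(x,z|\theta^{(i)})|\theta^{(i)},x]}_{\ge 0 \text{ by the choice of } \theta^{(i+1)}} \\
& - \underbrace{\left(\varepsilon[\log p(z|x,\theta^{(i+1)})|\theta^{(i)},x] - \varepsilon[\log p(z|x,\theta^{(i)})|\theta^{(i)},x]\right)}_{\le 0,\ \text{still to be shown}}
\end{aligned}
$$

So the left-hand side is non-negative as soon as the second bracket is shown to be at most zero, and since $\log$ is increasing this gives $p(x|\theta^{(i+1)}) \ge p(x|\theta^{(i)})$.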
The remaining inequality holds because of the following fact.
If $\varepsilon$ is taken with respect to $p(x)$, we have $\varepsilon[\log p(x)] \ge \varepsilon[\log p'(x)]$, where $p'(x)$ is any other pdf.
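Proof:
$$
\begin{aligned}
& \varepsilon\left[\log\frac{p'(x)}{p(x)}\right] \le \log \varepsilon\left[\frac{p'(x)}{p(x)}\right] \quad \text{(by Jensen's inequality)} \\
\Rightarrow\quad & \varepsilon[\log p'(x)] - \varepsilon[\log p(x)] \le \log \underbrace{\int \frac{p'(x)}{p(x)}\cdot p(x)\,dx}_{=1} \\
\Rightarrow\quad & \varepsilon[\log p'(x)] - \varepsilon[\log p(x)] \le 0 \\
\Rightarrow\quad & \varepsilon[\log p'(x)] \le \varepsilon[\log p(x)]
\end{aligned}
$$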
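The inequality can also be checked numerically. The sketch below assumes a discrete analogue of the statement (probability vectors instead of pdfs, sums instead of integrals). The helper names `expected_log` and `random_pmf` are hypothetical. It draws random pairs $p$, $p'$ and computes the gap $\varepsilon[\log p(x)] - \varepsilon[\log p'(x)]$ with the expectation taken under $p$.

```python
import math
import random

def expected_log(p, q):
    """E_p[log q(x)] for discrete distributions given as lists of probabilities."""
    return sum(pi * math.log(qi) for pi, qi in zip(p, q))

def random_pmf(n, rng):
    """A strictly positive random probability vector of length n."""
    w = [rng.random() + 1e-3 for _ in range(n)]
    total = sum(w)
    return [wi / total for wi in w]

rng = random.Random(0)
gaps = []
for _ in range(1000):
    p, p_prime = random_pmf(5, rng), random_pmf(5, rng)
    # Gap E_p[log p] - E_p[log p'], which the lemma says is non-negative.
    gaps.append(expected_log(p, p) - expected_log(p, p_prime))
print(min(gaps))
```

In the EM argument the lemma is applied with $p = p(z|x,\theta^{(i)})$ and $p' = p(z|x,\theta^{(i+1)})$, both viewed as distributions over $z$.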
p.s.
Jensen’s inequality:
For a convex function $\phi$,
$$\varepsilon[\phi(x)] \ge \phi(\varepsilon[x])$$
and, letting $\phi = -\log$,
$$\varepsilon[\log(x)] \le \log(\varepsilon[x])$$
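A quick numerical illustration of Jensen's inequality, as a sketch only: the expectation is replaced by a sample average over positive random numbers, and the helper name `jensen_gap` is hypothetical. The gap $\varepsilon[\phi(x)] - \phi(\varepsilon[x])$ is computed for two convex functions, $\phi = -\log$ and $\phi(x) = x^2$.

```python
import math
import random

def jensen_gap(phi, xs):
    """Sample version of E[phi(x)] - phi(E[x]); non-negative when phi is convex."""
    mean_x = sum(xs) / len(xs)
    mean_phi = sum(phi(x) for x in xs) / len(xs)
    return mean_phi - phi(mean_x)

rng = random.Random(1)
# Positive samples so that log is defined everywhere.
xs = [rng.uniform(0.1, 10.0) for _ in range(10_000)]

gap_neg_log = jensen_gap(lambda x: -math.log(x), xs)  # phi = -log
gap_square = jensen_gap(lambda x: x * x, xs)          # phi(x) = x^2
print(gap_neg_log, gap_square)
```

A non-negative `gap_neg_log` is the same statement as $\varepsilon[\log(x)] \le \log(\varepsilon[x])$ for the sample.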