Topic Models (2)

1. LDA Derivation
   (2) Likelihood
    The probability that the word $w_{m,n}$ is instantiated as term $t$ is
$$p\left(w_{m,n}=t \mid \vec{\vartheta}_{m}, \underline{\Phi}\right)=\sum_{k=1}^{K} p\left(w_{m,n}=t \mid \vec{\varphi}_{k}\right) p\left(z_{m,n}=k \mid \vec{\vartheta}_{m}\right)$$
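    That is, multiply the probability of topic $k$ in document $m$ by the probability of term $t$ under topic $k$, then sum over all topics. The likelihood of the whole document collection is: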
$$p(\mathcal{W} \mid \underline{\Theta}, \underline{\Phi})=\prod_{m=1}^{M} p\left(\vec{w}_{m} \mid \vec{\vartheta}_{m}, \underline{\Phi}\right)=\prod_{m=1}^{M} \prod_{n=1}^{N_{m}} p\left(w_{m,n} \mid \vec{\vartheta}_{m}, \underline{\Phi}\right)$$
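The two formulas above can be checked numerically on a toy corpus. The sketch below is only illustrative: the names `word_prob` and `corpus_log_likelihood` and all numbers are made up for this example, with `theta[m][k]` standing for $\vartheta_{m,k}$ and `phi[k][t]` for $\varphi_{k,t}$. It works in log space because the product over all words underflows quickly.

```python
import math

def word_prob(t, theta_m, phi):
    """p(w = t | theta_m, Phi): sum over topics of phi[k][t] * theta_m[k]."""
    return sum(phi_k[t] * theta_mk for phi_k, theta_mk in zip(phi, theta_m))

def corpus_log_likelihood(docs, theta, phi):
    """log p(W | Theta, Phi): sum of log word probabilities over all documents."""
    return sum(
        math.log(word_prob(t, theta[m], phi))
        for m, doc in enumerate(docs)
        for t in doc
    )

# Toy values (assumed for illustration): K = 2 topics, V = 3 terms, M = 2 documents.
phi = [[0.7, 0.2, 0.1],    # term distribution of topic 0
       [0.1, 0.3, 0.6]]    # term distribution of topic 1
theta = [[0.9, 0.1],       # topic distribution of document 0
         [0.2, 0.8]]       # topic distribution of document 1
docs = [[0, 0, 1], [2, 2, 1]]  # each document is a list of term ids

print(word_prob(0, theta[0], phi))
print(corpus_log_likelihood(docs, theta, phi))
```

For a fixed document, `word_prob` summed over all terms gives 1, which is a quick sanity check that the mixture is a proper distribution.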
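   (3) Gibbs Sampling
     Gibbs sampling picks one dimension of the random vector at a time and samples its value conditioned on the current values of all other dimensions. Iterating this process yields the parameters to be estimated.
     Initially, every word in the corpus is assigned a random topic $z^{(0)}$. We then count how often each term $t$ occurs under each topic $z$ and how often each topic $z$ occurs in each document $m$. In each round we compute $p\left(z_{i} \mid \vec{z}_{\neg i}, d, w\right)$: the current word's topic assignment is excluded, and the probability of each topic for the current word is estimated from the topic assignments of all other words.
     Once the distribution of the current word over all topics is available, a new topic for that word is sampled from it.
     The topic of the next word is updated in the same way, until the topic distribution of each document $\vec{\vartheta}_{m}$ and the word distribution of each topic $\vec{\varphi}_{k}$ have converged. The algorithm then stops and outputs the estimated parameters $\underline{\Theta}$ and $\underline{\Phi}$; the topic assignment $z_{m,n}$ of every word is obtained as well.
     In practice, a maximum number of iterations is set. The formula used to compute $p\left(z_{i} \mid \vec{z}_{\neg i}, d, w\right)$ at each step is called the Gibbs Updating Rule.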
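The procedure above can be sketched as a small collapsed Gibbs sampler. This is a minimal illustration, not a reference implementation, and it makes several assumptions that are not in the text: the function name `gibbs_lda` is hypothetical, the hyperparameters are symmetric scalars `alpha` and `beta`, the loop runs for a fixed number of iterations instead of testing convergence, and the per-word weights use the standard collapsed form $(n_{k}^{(t)}+\beta)/(n_{k}+V\beta)\cdot(n_{m}^{(k)}+\alpha)$ with the current word removed from all counts.

```python
import random

def gibbs_lda(docs, K, V, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA on docs given as lists of term ids."""
    rng = random.Random(seed)
    n_mk = [[0] * K for _ in docs]      # n_mk[m][k]: words in doc m with topic k
    n_kt = [[0] * V for _ in range(K)]  # n_kt[k][t]: term t assigned to topic k
    n_k = [0] * K                       # n_k[k]: total words assigned to topic k

    def bump(m, k, t, delta):
        # Add or remove one word (doc m, term t) from topic k in all count tables.
        n_mk[m][k] += delta
        n_kt[k][t] += delta
        n_k[k] += delta

    # Random initial assignment z^(0) and the corresponding counts.
    z = []
    for m, doc in enumerate(docs):
        z.append([rng.randrange(K) for _ in doc])
        for k, t in zip(z[m], doc):
            bump(m, k, t, +1)

    for _ in range(iters):  # fixed iteration budget (assumption: no convergence test)
        for m, doc in enumerate(docs):
            for n, t in enumerate(doc):
                bump(m, z[m][n], t, -1)  # exclude the current word
                # Unnormalised p(z_i = k | z_{not i}, w) for each topic k.
                weights = [
                    (n_kt[k][t] + beta) / (n_k[k] + V * beta) * (n_mk[m][k] + alpha)
                    for k in range(K)
                ]
                z[m][n] = rng.choices(range(K), weights=weights)[0]
                bump(m, z[m][n], t, +1)  # add it back under the new topic
    return z, n_mk, n_kt

# Toy corpus (made up): terms 0-1 and terms 2-3 tend to co-occur.
docs = [[0, 0, 1, 1], [2, 3, 3, 2], [0, 1, 0, 1], [3, 2, 2, 3]]
z, n_mk, n_kt = gibbs_lda(docs, K=2, V=4, iters=100, seed=1)
print(n_mk)
```

The returned count tables `n_mk` and `n_kt` are all that is needed afterwards to read off the document-topic and topic-word distributions.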
   (4) Joint Distribution
$$p(\vec{w}, \vec{z} \mid \vec{\alpha}, \vec{\beta})=p(\vec{w} \mid \vec{z}, \vec{\beta})\, p(\vec{z} \mid \vec{\alpha})$$
$$\begin{aligned} p(\vec{w} \mid \vec{z}, \vec{\beta}) &=\int p(\vec{w} \mid \vec{z}, \underline{\Phi})\, p(\underline{\Phi} \mid \vec{\beta})\, \mathrm{d} \underline{\Phi} \\ &=\int \prod_{z=1}^{K} \frac{1}{\Delta(\vec{\beta})} \prod_{t=1}^{V} \varphi_{z,t}^{n_{z}^{(t)}+\beta_{t}-1}\, \mathrm{d} \vec{\varphi}_{z} \\ &=\prod_{z=1}^{K} \frac{\Delta\left(\vec{n}_{z}+\vec{\beta}\right)}{\Delta(\vec{\beta})}, \quad \vec{n}_{z}=\left\{n_{z}^{(t)}\right\}_{t=1}^{V} \end{aligned}$$
$$\begin{aligned} p(\vec{z} \mid \vec{\alpha}) &=\int p(\vec{z} \mid \underline{\Theta})\, p(\underline{\Theta} \mid \vec{\alpha})\, \mathrm{d} \underline{\Theta} \\ &=\int \prod_{m=1}^{M} \frac{1}{\Delta(\vec{\alpha})} \prod_{k=1}^{K} \vartheta_{m,k}^{n_{m}^{(k)}+\alpha_{k}-1}\, \mathrm{d} \vec{\vartheta}_{m} \\ &=\prod_{m=1}^{M} \frac{\Delta\left(\vec{n}_{m}+\vec{\alpha}\right)}{\Delta(\vec{\alpha})}, \quad \vec{n}_{m}=\left\{n_{m}^{(k)}\right\}_{k=1}^{K} \end{aligned}$$
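The closed form above is easy to evaluate once the count vectors are known. The sketch below assumes the usual Dirichlet normaliser $\Delta(\vec{x})=\prod_{i} \Gamma(x_{i}) / \Gamma\left(\sum_{i} x_{i}\right)$, which this section uses but does not restate; the function names `log_delta` and `log_joint` are hypothetical. Everything is computed in log space with `math.lgamma`, since the Gamma function overflows for realistic counts.

```python
import math

def log_delta(x):
    """log Delta(x) = sum_i log Gamma(x_i) - log Gamma(sum_i x_i)."""
    return sum(math.lgamma(v) for v in x) - math.lgamma(sum(x))

def log_joint(n_kt, n_mk, alpha, beta):
    """log p(w, z | alpha, beta) from topic-term counts n_kt and doc-topic counts n_mk."""
    lp = 0.0
    for n_z in n_kt:   # word part: product over topics of Delta(n_z + beta) / Delta(beta)
        lp += log_delta([c + b for c, b in zip(n_z, beta)]) - log_delta(beta)
    for n_m in n_mk:   # topic part: product over docs of Delta(n_m + alpha) / Delta(alpha)
        lp += log_delta([c + a for c, a in zip(n_m, alpha)]) - log_delta(alpha)
    return lp

# Smallest possible case (made up): one topic, two terms, one document holding one word.
# With a uniform prior over the two terms, the joint probability is 1/2.
print(math.exp(log_joint([[1, 0]], [[1]], alpha=[1.0], beta=[1.0, 1.0])))
```

Tracking `log_joint` across Gibbs iterations is one common way to monitor whether the sampler has settled.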
   (5) Gibbs Updating Rule
    Here $t$ is the term of word $i$, $m$ is the document that contains it, and the subscript $\neg i$ marks counts computed with word $i$ excluded.
$$\begin{aligned} p\left(z_{i}=k \mid \vec{z}_{\neg i}, \vec{w}\right) &=\frac{p(\vec{w}, \vec{z})}{p\left(\vec{w}, \vec{z}_{\neg i}\right)}=\frac{p(\vec{w} \mid \vec{z})}{p\left(\vec{w}_{\neg i} \mid \vec{z}_{\neg i}\right) p\left(w_{i}\right)} \cdot \frac{p(\vec{z})}{p\left(\vec{z}_{\neg i}\right)} \\ &\propto \frac{\Delta\left(\vec{n}_{z}+\vec{\beta}\right)}{\Delta\left(\vec{n}_{z, \neg i}+\vec{\beta}\right)} \cdot \frac{\Delta\left(\vec{n}_{m}+\vec{\alpha}\right)}{\Delta\left(\vec{n}_{m, \neg i}+\vec{\alpha}\right)} \\ &=\frac{\Gamma\left(n_{k}^{(t)}+\beta_{t}\right) \Gamma\left(\sum_{t=1}^{V}\left(n_{k, \neg i}^{(t)}+\beta_{t}\right)\right)}{\Gamma\left(n_{k, \neg i}^{(t)}+\beta_{t}\right) \Gamma\left(\sum_{t=1}^{V}\left(n_{k}^{(t)}+\beta_{t}\right)\right)} \cdot \frac{\Gamma\left(n_{m}^{(k)}+\alpha_{k}\right) \Gamma\left(\sum_{k=1}^{K}\left(n_{m, \neg i}^{(k)}+\alpha_{k}\right)\right)}{\Gamma\left(n_{m, \neg i}^{(k)}+\alpha_{k}\right) \Gamma\left(\sum_{k=1}^{K}\left(n_{m}^{(k)}+\alpha_{k}\right)\right)} \\ &=\frac{n_{k, \neg i}^{(t)}+\beta_{t}}{\sum_{t=1}^{V}\left(n_{k, \neg i}^{(t)}+\beta_{t}\right)} \cdot \frac{n_{m, \neg i}^{(k)}+\alpha_{k}}{\left[\sum_{k=1}^{K}\left(n_{m}^{(k)}+\alpha_{k}\right)\right]-1} \\ &\propto \frac{n_{k, \neg i}^{(t)}+\beta_{t}}{\sum_{t=1}^{V}\left(n_{k, \neg i}^{(t)}+\beta_{t}\right)}\left(n_{m, \neg i}^{(k)}+\alpha_{k}\right) \end{aligned}$$
    The last step drops the second denominator because it does not depend on $k$.
   (6) Word Distributions and Topic Distributions
$$\begin{aligned} \varphi_{k,t} &=\frac{n_{k}^{(t)}+\beta_{t}}{\sum_{t=1}^{V}\left(n_{k}^{(t)}+\beta_{t}\right)} \\ \vartheta_{m,k} &=\frac{n_{m}^{(k)}+\alpha_{k}}{\sum_{k=1}^{K}\left(n_{m}^{(k)}+\alpha_{k}\right)} \end{aligned}$$
    These are the expectations of the Dirichlet posteriors of $\vec{\vartheta}_{m}$ and $\vec{\varphi}_{k}$ given the sampled topic assignments:
$$\begin{aligned} p\left(\vec{\vartheta}_{m} \mid \vec{z}_{m}, \vec{\alpha}\right) &=\frac{1}{Z_{\vartheta_{m}}} \prod_{n=1}^{N_{m}} p\left(z_{m,n} \mid \vec{\vartheta}_{m}\right) \cdot p\left(\vec{\vartheta}_{m} \mid \vec{\alpha}\right)=\operatorname{Dir}\left(\vec{\vartheta}_{m} \mid \vec{n}_{m}+\vec{\alpha}\right) \\ p\left(\vec{\varphi}_{k} \mid \vec{z}, \vec{w}, \vec{\beta}\right) &=\frac{1}{Z_{\varphi_{k}}} \prod_{\left\{i: z_{i}=k\right\}} p\left(w_{i} \mid \vec{\varphi}_{k}\right) \cdot p\left(\vec{\varphi}_{k} \mid \vec{\beta}\right)=\operatorname{Dir}\left(\vec{\varphi}_{k} \mid \vec{n}_{k}+\vec{\beta}\right) \end{aligned}$$
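As a small worked example of the two estimates, the sketch below turns count tables into $\varphi_{k,t}$ and $\vartheta_{m,k}$. The function name `estimate_phi_theta` and the counts are made up for illustration; `n_kt[k][t]` stands for $n_{k}^{(t)}$ and `n_mk[m][k]` for $n_{m}^{(k)}$.

```python
def estimate_phi_theta(n_kt, n_mk, alpha, beta):
    """Posterior-mean estimates of phi (topic-term) and theta (doc-topic) from counts."""
    def smoothed_row(counts, prior):
        # (count + prior) normalised over the whole row.
        total = sum(c + p for c, p in zip(counts, prior))
        return [(c + p) / total for c, p in zip(counts, prior)]

    phi = [smoothed_row(n_k, beta) for n_k in n_kt]
    theta = [smoothed_row(n_m, alpha) for n_m in n_mk]
    return phi, theta

# Made-up counts: K = 2 topics, V = 3 terms, M = 2 documents.
n_kt = [[3, 1, 0], [0, 1, 3]]
n_mk = [[3, 1], [1, 3]]
phi, theta = estimate_phi_theta(n_kt, n_mk, alpha=[1.0, 1.0], beta=[0.5, 0.5, 0.5])
print(phi[0])    # each entry is (count + 0.5) / 5.5
print(theta[0])  # each entry is (count + 1.0) / 6.0
```

Because of the added hyperparameters, a term that was never assigned to a topic still gets a small nonzero probability under it.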
