The Probability Computation Problem of HMMs

The probability computation (evaluation) problem of an HMM is: given the model parameters $\lambda = (A, B, \pi)$ and an observation sequence $O = (o_1, o_2, \ldots, o_T)$, compute the probability of observing $O$ under the model $\lambda$, namely $P(O \mid \lambda)$.


Direct Computation

Computing directly from the probability formula, decomposing over the hidden state sequence $I$ gives:

$$P(O \mid \lambda) = \sum_{I} P(O, I \mid \lambda) = \sum_{I} P(O \mid I, \lambda)\, P(I \mid \lambda)$$

  • Here $P(O \mid I, \lambda)$ covers the emissions $i_t \to o_t$ and comes from the emission probability matrix $[b_j(k)]_{N \times M}$; it is a product of $T$ factors:

$$P(O \mid I, \lambda) = P(o_1 \mid i_1) \cdots P(o_t \mid i_t) \cdots P(o_T \mid i_T) = b_{i_1}(o_1) \cdots b_{i_t}(o_t) \cdots b_{i_T}(o_T)$$

  • $P(I \mid \lambda)$ covers the transitions $i_{t-1} \to i_t$ and comes from the transition probability matrix $[a_{ij}]_{N \times N}$ and the initial state distribution $\pi$; it is also a product of $T$ factors:

$$P(I \mid \lambda) = \pi_{i_1} P(i_2 \mid i_1) \cdots P(i_t \mid i_{t-1}) \cdots P(i_T \mid i_{T-1}) = \pi_{i_1}\, a_{i_1 i_2} \cdots a_{i_{t-1} i_t} \cdots a_{i_{T-1} i_T}$$

Substituting these two expressions:

  • $P(O \mid \lambda) = \sum_{I} P(O, I \mid \lambda)$

$= \sum_{I} P(O \mid I, \lambda)\, P(I \mid \lambda)$

$= \sum_{I} \left[ b_{i_1}(o_1) \cdots b_{i_t}(o_t) \cdots b_{i_T}(o_T) \right] \times \left[ \pi_{i_1}\, a_{i_1 i_2} \cdots a_{i_{t-1} i_t} \cdots a_{i_{T-1} i_T} \right]$

$= \sum_{I} \pi_{i_1} \prod_{t=1}^{T} b_{i_t}(o_t) \prod_{t=1}^{T-1} a_{i_t i_{t+1}}$

Since $\sum_{I} = \sum_{i_1} \cdots \sum_{i_t} \cdots \sum_{i_T}$ and each $i_t$ can take $N$ values, the sum $\sum_{I}$ contains $N^T$ terms. Computing $P(O \mid \lambda)$ directly from this formula is therefore prohibitively expensive.
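To make the cost of enumeration concrete, here is a minimal brute-force sketch in Python. The model matrices `A`, `B`, `pi`, the observation sequence `obs`, and the function name `direct_prob` are all hypothetical, chosen only for illustration:

```python
import itertools

import numpy as np


def direct_prob(A, B, pi, obs):
    """Brute-force P(O | lambda): sum over all N^T hidden state sequences."""
    N = A.shape[0]   # number of hidden states
    T = len(obs)     # length of the observation sequence
    total = 0.0
    for path in itertools.product(range(N), repeat=T):        # every sequence I
        p = pi[path[0]] * B[path[0], obs[0]]                   # pi_{i1} * b_{i1}(o1)
        for t in range(1, T):
            p *= A[path[t - 1], path[t]] * B[path[t], obs[t]]  # a_{i_{t-1} i_t} * b_{i_t}(o_t)
        total += p
    return total


# A toy model, assumed purely for illustration: N = 3 hidden states, M = 2 symbols.
A = np.array([[0.5, 0.2, 0.3],
              [0.3, 0.5, 0.2],
              [0.2, 0.3, 0.5]])
B = np.array([[0.5, 0.5],
              [0.4, 0.6],
              [0.7, 0.3]])
pi = np.array([0.2, 0.4, 0.4])
obs = [0, 1, 0]                        # indices of the observed symbols

print(direct_prob(A, B, pi, obs))      # enumerates 3^3 = 27 state sequences
```

Even for this tiny example the inner loop runs over $N^T$ paths, which is what the forward algorithm below avoids.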


Forward Algorithm

We look for a recursion for the forward probabilities along the time steps $1 \to \cdots \to t \to \cdots \to T$.

Forward Probability

At the time points $1, \ldots, t, \ldots, T$, the observations are $o_1, \ldots, o_t, \ldots, o_T$ and the corresponding hidden states are $i_1, \ldots, i_t, \ldots, i_T$:

$$i_1 \to \cdots \to i_t \to \cdots \to i_T$$
$$o_1 \to \cdots \to o_t \to \cdots \to o_T$$

Define the forward probability $\alpha_t(i) = P(o_1, \ldots, o_t, i_t = q_i \mid \lambda)$.

It is the probability that, up to time $t$, the observations are $o_1, o_2, \ldots, o_t$ and the state at time $t$ is $q_i$.

Derivation of the Recursion

From the definition, write out the forward probabilities at $t = 1$ and $t = 2$:

  • $\alpha_1(i) = P(o_1, i_1 = q_i \mid \lambda) = P(o_1 \mid i_1 = q_i, \lambda)\, P(i_1 = q_i \mid \lambda) = b_i(o_1)\, \pi_i$

  • $\alpha_2(j) = P(o_1, o_2, i_2 = q_j \mid \lambda)$
    $= \sum_{i=1}^{N} P(o_1, o_2, i_1 = q_i, i_2 = q_j \mid \lambda)$
    $= \sum_{i=1}^{N} P(o_2 \mid i_2 = q_j, \lambda)\, P(i_2 = q_j \mid i_1 = q_i, \lambda)\, P(o_1 \mid i_1 = q_i, \lambda)\, P(i_1 = q_i \mid \lambda)$
    $= \sum_{i=1}^{N} b_j(o_2)\, a_{ij}\, \alpha_1(i)$
    $= b_j(o_2) \sum_{i=1}^{N} a_{ij}\, \alpha_1(i)$

$\cdots$

By induction, the relation between $\alpha_{t+1}(j)$ and $\alpha_t(i)$ is:

$$\alpha_{t+1}(j) = b_j(o_{t+1}) \sum_{i=1}^{N} a_{ij}\, \alpha_t(i)$$

where $j \in \{1, 2, \ldots, N\}$.

Intuition for the Recursion

Take the two time steps $t = 1$ and $t = 2$ as an example. The observations and hidden states involved are $o_1$, $o_2$, $i_1$, $i_2$:

$$i_1 \to i_2$$
$$o_1 \to o_2$$

Once we have computed $\alpha_1(i) = P(o_1, i_1 = q_i \mid \lambda)$ for $i \in \{1, 2, \ldots, N\}$, the information in hand is: at time $t = 1$, the probability that the hidden state is $q_1$ and the observation is $o_1$, namely $\alpha_1(1)$; ...; the probability that the hidden state is $q_N$ and the observation is $o_1$, namely $\alpha_1(N)$.

Computing $\alpha_2(j) = P(o_1, o_2, i_2 = q_j \mid \lambda)$ for $j \in \{1, 2, \ldots, N\}$ means finding: at time $t = 2$, the probability that the hidden state is $q_1$ and the first two observations are $o_1, o_2$, namely $\alpha_2(1)$; ...; the probability that the hidden state is $q_N$ and the first two observations are $o_1, o_2$, namely $\alpha_2(N)$.

How do we use $\alpha_1(i)$ to compute $\alpha_2(j)$?

Comparing what we already have with what we need, the missing piece is the new observation $o_2$: $o_2$ is determined by $i_2$ (through $b_{i_2}(o_2)$), and $i_2$ in turn follows from $i_1$ (through $a_{i_1 i_2}$). So, starting from each $\alpha_1(i)$ and multiplying in the two probabilities $b_{i_2}(o_2)$ and $a_{i_1 i_2}$, we obtain $\alpha_2(j)$:

$$\alpha_2(j) = \sum_{i_1 = 1}^{N} \alpha_1(i)\, b_{i_2}(o_2)\, a_{i_1 i_2}$$

Rewriting with $i_1 = q_i$ and $i_2 = q_j$ gives:

$$\alpha_2(j) = \sum_{i = 1}^{N} \alpha_1(i)\, b_j(o_2)\, a_{ij} = b_j(o_2) \sum_{i=1}^{N} \alpha_1(i)\, a_{ij}$$

Significance

Why compute the forward probability?

  • First, the forward probabilities give us the target probability $P(O \mid \lambda)$. By definition, the forward probability at $t = T$ is:

$$\alpha_T(i) = P(o_1, \ldots, o_T, i_T = q_i \mid \lambda)$$

Therefore $P(O \mid \lambda) = \sum_{i=1}^{N} \alpha_T(i)$.

  • Second, thanks to the recursion, computing the forward probabilities costs far less than direct computation. Note that $i \in \{1, 2, \ldots, N\}$: computing all of the $\alpha_1(i)$ takes $N$ operations, and at each later step, each $\alpha_{t+1}(j)$ is a sum of $N$ terms, so one time step costs on the order of $N^2$ operations. The whole recursion therefore costs on the order of $N^2 T$ operations, far fewer than the $N^T$ terms of direct computation.
    The saving comes from reusing the results of the previous time step at every stage, which avoids recomputing the same partial sums over and over.
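Below is a minimal sketch of the forward recursion in Python, reusing the hypothetical `A`, `B`, `pi`, `obs` from the direct-computation sketch above; `forward` is an illustrative name, not notation from the original text:

```python
import numpy as np


def forward(A, B, pi, obs):
    """Forward algorithm: returns the table alpha (shape T x N) and P(O | lambda)."""
    N = A.shape[0]
    T = len(obs)
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]               # alpha_1(i) = pi_i * b_i(o_1)
    for t in range(1, T):
        # alpha_{t+1}(j) = b_j(o_{t+1}) * sum_i alpha_t(i) * a_{ij}
        alpha[t] = B[:, obs[t]] * (alpha[t - 1] @ A)
    return alpha, alpha[-1].sum()              # P(O | lambda) = sum_i alpha_T(i)
```

Each step only touches the previous row of the table, which is exactly where the $N^2 T$ cost comes from.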

Backward Algorithm

We look for a recursion for the backward probabilities along the time steps $T \to \cdots \to t \to \cdots \to 1$.

Backward Probability

At the time points $1, \ldots, t, \ldots, T$, the observations are $o_1, \ldots, o_t, \ldots, o_T$ and the corresponding hidden states are $i_1, \ldots, i_t, \ldots, i_T$:

$$i_1 \to \cdots \to i_t \to \cdots \to i_T$$
$$o_1 \to \cdots \to o_t \to \cdots \to o_T$$

Define the backward probability $\beta_t(i) = P(o_{t+1}, \ldots, o_T \mid i_t = q_i, \lambda)$.

It is the probability that, given that the state at time $t$ is $q_i$, the observations at all times after $t$ are $o_{t+1}, o_{t+2}, \ldots, o_T$.

Derivation of the Recursion

From the definition, write out the backward probabilities at $t = T$, $t = T-1$, and $t = T-2$:

  • $\beta_T(i) = 1$

Note: the initial value is $1$ because the backward probability concerns the observations after time $t$ (not including $t$), and our observation sequence only runs up to time $T$. There is no observation after $T$ to constrain, so the value is defined to be $1$.

  • $\beta_{T-1}(i) = P(o_T \mid i_{T-1} = q_i, \lambda)$
    $= \sum_{k=1}^{N} P(o_T, i_T = q_k \mid i_{T-1} = q_i, \lambda)$
    $= \sum_{k=1}^{N} P(o_T \mid i_T = q_k, \lambda)\, P(i_T = q_k \mid i_{T-1} = q_i, \lambda)$
    $= \sum_{k=1}^{N} b_k(o_T)\, a_{ik}$

  • $\beta_{T-2}(j) = P(o_T, o_{T-1} \mid i_{T-2} = q_j, \lambda)$
    $= \sum_{i=1}^{N} \sum_{k=1}^{N} P(o_T, o_{T-1}, i_T = q_k, i_{T-1} = q_i \mid i_{T-2} = q_j, \lambda)$
    $= \sum_{i=1}^{N} \sum_{k=1}^{N} P(o_T \mid i_T = q_k, \lambda)\, P(i_T = q_k \mid i_{T-1} = q_i, \lambda)\, P(o_{T-1} \mid i_{T-1} = q_i, \lambda)\, P(i_{T-1} = q_i \mid i_{T-2} = q_j, \lambda)$
    $= \sum_{i=1}^{N} \beta_{T-1}(i)\, b_i(o_{T-1})\, a_{ji}$

$\cdots$

By induction, the relation between $\beta_t(j)$ and $\beta_{t+1}(i)$ is:

$$\beta_t(j) = \sum_{i=1}^{N} \beta_{t+1}(i)\, b_i(o_{t+1})\, a_{ji}$$

where $j \in \{1, 2, \ldots, N\}$.

Intuition for the Recursion

Take the two time steps $t = T-1$ and $t = T-2$ as an example. The observations and hidden states involved are $o_{T-2}$, $o_{T-1}$, $o_T$, $i_{T-2}$, $i_{T-1}$, $i_T$:

$$i_{T-2} \to i_{T-1} \to i_T$$
$$o_{T-2} \to o_{T-1} \to o_T$$

Once we have computed $\beta_{T-1}(i) = P(o_T \mid i_{T-1} = q_i, \lambda)$ for $i \in \{1, 2, \ldots, N\}$, the information in hand is: at time $t = T-1$, given that the hidden state is $q_1$, the probability that the remaining observation is $o_T$, namely $\beta_{T-1}(1)$; ...; given that the hidden state is $q_N$, the probability that the remaining observation is $o_T$, namely $\beta_{T-1}(N)$.

Computing $\beta_{T-2}(j) = P(o_T, o_{T-1} \mid i_{T-2} = q_j, \lambda)$ for $j \in \{1, 2, \ldots, N\}$ means finding: at time $t = T-2$, given that the hidden state is $q_1$, the probability that the remaining observations are $o_{T-1}$ and $o_T$, namely $\beta_{T-2}(1)$; ...; given that the hidden state is $q_N$, the probability that the remaining observations are $o_{T-1}$ and $o_T$, namely $\beta_{T-2}(N)$.

How do we use $\beta_{T-1}(i)$ to compute $\beta_{T-2}(j)$?

Comparing what we already have with what we need, the missing piece is the observation $o_{T-1}$: $o_{T-1}$ is determined by $i_{T-1}$ (through $b_{i_{T-1}}(o_{T-1})$), and $i_{T-1}$ in turn follows from $i_{T-2}$ (through $a_{i_{T-2} i_{T-1}}$). So, starting from each $\beta_{T-1}(i)$ and multiplying in the two probabilities $b_{i_{T-1}}(o_{T-1})$ and $a_{i_{T-2} i_{T-1}}$, we obtain $\beta_{T-2}(j)$:

$$\beta_{T-2}(j) = \sum_{i_{T-1} = 1}^{N} \beta_{T-1}(i)\, b_{i_{T-1}}(o_{T-1})\, a_{i_{T-2} i_{T-1}}$$

Rewriting with $t = T-2$, $t+1 = T-1$, $i_{T-1} = q_i$, and $i_{T-2} = q_j$ gives:

$$\beta_t(j) = \sum_{i = 1}^{N} \beta_{t+1}(i)\, b_i(o_{t+1})\, a_{ji}$$

Significance

Why compute the backward probability?

  • First, the backward probabilities also give us the target probability $P(O \mid \lambda)$. By definition, the backward probability at $t = 1$ is:

$$\beta_1(i) = P(o_2, \ldots, o_T \mid i_1 = q_i, \lambda)$$

Compared with the target probability $P(O \mid \lambda)$, $\beta_1(i)$ is still missing the observation $o_1$. Given the state at $t = 1$, the observation $o_1$ is conditionally independent of the later observations, and its conditional probability when the state is $q_i$ is $P(o_1 \mid i_1 = q_i, \lambda) = b_i(o_1)$.

Multiplying the two gives the probability of the full observation sequence $O = (o_1, \ldots, o_T)$ conditioned on the state at $t = 1$ being $q_i$: $P(o_1, \ldots, o_T \mid i_1 = q_i, \lambda) = \beta_1(i)\, b_i(o_1)$.

Therefore the target probability is $P(O \mid \lambda) = \sum_{i=1}^{N} P(o_1, \ldots, o_T \mid i_1 = q_i, \lambda)\, P(i_1 = q_i \mid \lambda) = \sum_{i=1}^{N} \beta_1(i)\, b_i(o_1)\, \pi_i$.

  • Second, the backward probabilities cost about the same to compute as the forward probabilities: on the order of $N^2 T$ operations in total, far fewer than the $N^T$ terms of direct computation.
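A matching sketch of the backward recursion, again assuming the hypothetical model variables defined earlier; `backward` is an illustrative name:

```python
import numpy as np


def backward(A, B, pi, obs):
    """Backward algorithm: returns the table beta (shape T x N) and P(O | lambda)."""
    N = A.shape[0]
    T = len(obs)
    beta = np.zeros((T, N))
    beta[-1] = 1.0                             # beta_T(i) = 1
    for t in range(T - 2, -1, -1):
        # beta_t(i) = sum_j a_{ij} * b_j(o_{t+1}) * beta_{t+1}(j)
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    return beta, (pi * B[:, obs[0]] * beta[0]).sum()   # sum_i pi_i b_i(o_1) beta_1(i)
```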

Forward-Backward Algorithm

The forward algorithm uses the forward probabilities to compute $P(O \mid \lambda) = \sum_{i=1}^{N} \alpha_T(i)$, working in the direction $1 \to T$.

The backward algorithm uses the backward probabilities to compute $P(O \mid \lambda) = \sum_{i=1}^{N} \beta_1(i)\, b_i(o_1)\, \pi_i$, working in the direction $T \to 1$.

We can also compute $P(O \mid \lambda)$ using the forward and backward probabilities together. For any time $t$:

$P(O \mid \lambda) = \sum_{i=1}^{N} P(O, i_t = q_i \mid \lambda)$

$= \sum_{i=1}^{N} P(O \mid i_t = q_i, \lambda)\, P(i_t = q_i \mid \lambda)$

$= \sum_{i=1}^{N} P(o_1, \ldots, o_t \mid i_t = q_i, \lambda)\, P(o_{t+1}, \ldots, o_T \mid i_t = q_i, \lambda)\, P(i_t = q_i \mid \lambda)$

(here the past observations $(o_1, \ldots, o_t)$ and the future observations $(o_{t+1}, \ldots, o_T)$ are conditionally independent given $i_t$)

$= \sum_{i=1}^{N} P(o_1, \ldots, o_t, i_t = q_i \mid \lambda)\, P(o_{t+1}, \ldots, o_T \mid i_t = q_i, \lambda)$

$= \sum_{i=1}^{N} \alpha_t(i)\, \beta_t(i)$

Substituting the backward recursion $\beta_t(i) = \sum_{j=1}^{N} \beta_{t+1}(j)\, b_j(o_{t+1})\, a_{ij}$, we also have:

$$P(O \mid \lambda) = \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_t(i)\, \beta_{t+1}(j)\, b_j(o_{t+1})\, a_{ij}$$
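As a quick sanity check on these identities, the snippet below continues the hypothetical `forward`, `backward`, and `direct_prob` sketches and the toy model defined earlier (it is a usage example, not self-contained code): all routes should agree, and the combined formula should give the same value for every choice of $t$.

```python
import numpy as np

alpha, p_forward = forward(A, B, pi, obs)
beta, p_backward = backward(A, B, pi, obs)

# P(O | lambda) = sum_i alpha_t(i) * beta_t(i) for every t.
for t in range(len(obs)):
    assert np.isclose((alpha[t] * beta[t]).sum(), p_forward)

assert np.isclose(p_forward, p_backward)
assert np.isclose(p_forward, direct_prob(A, B, pi, obs))
```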


Other Probability Computations

Using the forward and backward probabilities, other useful quantities can also be computed (see the sketch after this list):

  • Given the model $\lambda$, the probability that the observation sequence is $O = (o_1, \ldots, o_T)$ and the hidden state at time $t$ is $q_i$:

$$P(O, i_t = q_i \mid \lambda) = \alpha_t(i)\, \beta_t(i)$$

  • Given the model $\lambda$ and the observation sequence $O = (o_1, \ldots, o_T)$, the probability that the hidden state at time $t$ is $q_i$ (a single state):

$$P(i_t = q_i \mid O, \lambda) = \frac{P(O, i_t = q_i \mid \lambda)}{P(O \mid \lambda)} = \frac{\alpha_t(i)\, \beta_t(i)}{\sum_{j=1}^{N} \alpha_t(j)\, \beta_t(j)}$$

  • Given the model $\lambda$ and the observation sequence $O = (o_1, \ldots, o_T)$, the probability that the hidden state at time $t$ is $q_i$ and the hidden state at time $t+1$ is $q_j$ (a pair of states):

$$P(i_t = q_i, i_{t+1} = q_j \mid O, \lambda) = \frac{P(O, i_t = q_i, i_{t+1} = q_j \mid \lambda)}{P(O \mid \lambda)} = \frac{\alpha_t(i)\, \beta_{t+1}(j)\, b_j(o_{t+1})\, a_{ij}}{\sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_t(i)\, \beta_{t+1}(j)\, b_j(o_{t+1})\, a_{ij}}$$
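These single-state and pairwise-state posteriors (often written $\gamma_t(i)$ and $\xi_t(i,j)$ in the literature) can be read off the $\alpha$ and $\beta$ tables. A minimal sketch, again assuming the hypothetical `forward` and `backward` functions above:

```python
import numpy as np


def gamma_xi(A, B, pi, obs):
    """Posterior state probabilities gamma_t(i) and pair probabilities xi_t(i, j)."""
    alpha, p = forward(A, B, pi, obs)      # p = P(O | lambda)
    beta, _ = backward(A, B, pi, obs)
    T, N = alpha.shape

    # gamma_t(i) = alpha_t(i) * beta_t(i) / P(O | lambda)
    gamma = alpha * beta / p

    # xi_t(i, j) = alpha_t(i) * a_{ij} * b_j(o_{t+1}) * beta_{t+1}(j) / P(O | lambda)
    xi = np.zeros((T - 1, N, N))
    for t in range(T - 1):
        xi[t] = alpha[t][:, None] * A * B[:, obs[t + 1]][None, :] * beta[t + 1][None, :] / p
    return gamma, xi
```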
