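#1 Principles of Speech Recognition
(Not yet organized; see the linked article: "What is the technical principle behind speech recognition?")
#2 Feature Extraction for Speech Recognition: MFCC
(Not yet organized; reference link: )
#3 Programming Language Packages for Speech Recognition
Not yet organized; see the linked article: "HMM implementations in several programming languages"
#4 Recommended Papers
Not yet organized; see the linked article: "Introductory deep learning papers (speech recognition)"
#5 Books on Speech Recognition
Not yet organized; see the linked resource: "Speech Recognition Practice" (English edition)
Password: zmja
#6 Key Model: GMM-HMM
#6.1 Notation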
$a_{ij}=a_{ij}(t)=P(q_{t+1}=j|q_t=i)$: the transition probability from state $i$ at time $t$ to state $j$ at time $t+1$, with $a_{ij}(t)=a_{ij}(t+1),\ t=1,2,...,T$, i.e. the transition probabilities do not change over time.
$b_j(o_t)=P(o_t|q_t=j)$: the probability that state $j$ emits the observation $o_t$ at time $t$.
Hidden state set $I$ with $|I|=n$, where $|\cdot|$ denotes the number of elements in a set.
State sequence over time: $Q=\{q_1 \mapsto q_2 \mapsto ... \mapsto q_T \mid q_t\in I,\ t=1,2,...,T\}$
Output (observation) sequence $O$: $O=\{o_1 \mapsto o_2 \mapsto o_3 \mapsto ... \mapsto o_T\}$
Parameter set: $\lambda=\{a,b \mid a_{ij}\in a,\ b_j\in b,\ i=1,2,...,n,\ j=1,2,...,n\}$
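The notation above can be made concrete with a small sketch. It assumes 1-D observations and that each state's emission density $b_j$ is a Gaussian mixture, which is the usual meaning of "GMM-HMM" but is not spelled out in this section. The names `gmm` and `gaussian_pdf` and every numeric value are hypothetical, chosen only for illustration.

```python
import math

# Hypothetical toy model with n = 2 hidden states (all numbers are made up).
# a[i][j] = P(q_{t+1} = j | q_t = i), the time-independent transition matrix.
a = [[0.7, 0.3],
     [0.4, 0.6]]

# Assumed emission model: each state j owns a 1-D Gaussian mixture,
# stored as a list of (weight, mean, standard deviation) components.
gmm = [
    [(0.5, 0.0, 1.0), (0.5, 2.0, 1.0)],  # state 0: two components
    [(1.0, 5.0, 2.0)],                   # state 1: one component
]

def gaussian_pdf(x, mean, std):
    """Density of a 1-D normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def b(j, o_t):
    """Emission density b_j(o_t): weighted sum of the state's Gaussians."""
    return sum(c * gaussian_pdf(o_t, mu, sd) for c, mu, sd in gmm[j])

# Every row of a must be a probability distribution over next states.
for row in a:
    assert abs(sum(row) - 1.0) < 1e-12
```

With these values, an observation near 1.0 is far more likely under state 0 than under state 1, which is what `b(0, 1.0) > b(1, 1.0)` expresses.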
#6.2 GMM-HMM Model Diagram
#6.3 Derivation
The goal of this derivation is to estimate the parameters $\lambda$ of the GMM-HMM model. They are obtained by maximum likelihood estimation:
$$
\max_\lambda P(O|\lambda) \Leftrightarrow \max_\lambda \log P(O|\lambda) = \max_\lambda \log \sum_Q P(O,Q|\lambda)
$$
Solving for $\lambda$ directly is difficult because of the hidden states, so the EM algorithm is used to solve for it iteratively (see reference article 1 for details). With $\bar{\lambda}$ denoting the current parameter estimate, Jensen's inequality gives:
$$
\begin{aligned}
\max_\lambda \log \sum_Q P(O,Q|\lambda)
&= \max_\lambda \log \sum_Q P(Q|O,\bar{\lambda})\frac{P(O,Q|\lambda)}{P(Q|O,\bar{\lambda})} \\
&\geq \max_\lambda \sum_Q P(Q|O,\bar{\lambda})\log\frac{P(O,Q|\lambda)}{P(Q|O,\bar{\lambda})}
\end{aligned}
$$
where $\sum_Q P(Q|O,\bar{\lambda})=1$.
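The lower bound can be checked numerically on a model small enough to enumerate every state sequence. The sketch below is only an illustration under stated assumptions: observations are discrete symbols (so `b[j][x]` is a table rather than a GMM density), the initial state $q_0$ emits nothing, and all names (`joint`, `all_paths`, `lower_bound`, `lam`, `lam_bar`) and numbers are hypothetical.

```python
import itertools
import math

def joint(pi, a, b, path, obs):
    """P(O,Q|lambda) for path = (q_0, ..., q_T) and obs = (x_1, ..., x_T)."""
    p = pi[path[0]]
    for t in range(1, len(path)):
        p *= a[path[t - 1]][path[t]] * b[path[t]][obs[t - 1]]
    return p

def all_paths(n, T):
    """Every state sequence q_0..q_T over n states."""
    return list(itertools.product(range(n), repeat=T + 1))

def log_likelihood(lam, obs):
    """log P(O|lambda), computed by summing the joint over all paths."""
    pi, a, b = lam
    paths = all_paths(len(pi), len(obs))
    return math.log(sum(joint(pi, a, b, Q, obs) for Q in paths))

def lower_bound(lam, lam_bar, obs):
    """sum_Q P(Q|O,lam_bar) * log( P(O,Q|lam) / P(Q|O,lam_bar) )."""
    pi, a, b = lam
    pi_bar, a_bar, b_bar = lam_bar
    paths = all_paths(len(pi), len(obs))
    joint_bar = [joint(pi_bar, a_bar, b_bar, Q, obs) for Q in paths]
    evidence_bar = sum(joint_bar)
    total = 0.0
    for Q, p_bar in zip(paths, joint_bar):
        posterior = p_bar / evidence_bar  # P(Q|O, lam_bar)
        total += posterior * math.log(joint(pi, a, b, Q, obs) / posterior)
    return total

# Made-up parameters (pi, a, b); every entry is positive so no log(0) occurs.
lam_bar = ([0.6, 0.4], [[0.7, 0.3], [0.4, 0.6]], [[0.9, 0.1], [0.2, 0.8]])
lam = ([0.5, 0.5], [[0.5, 0.5], [0.5, 0.5]], [[0.6, 0.4], [0.3, 0.7]])
obs = [0, 1, 1]

bound = lower_bound(lam, lam_bar, obs)      # right-hand side of the inequality
exact = log_likelihood(lam, obs)            # left-hand side of the inequality
tight = lower_bound(lam_bar, lam_bar, obs)  # bound evaluated at lam = lam_bar
```

Here `bound` never exceeds `exact`, and the bound becomes an equality when $\lambda=\bar{\lambda}$, which is why improving the bound from the current estimate cannot decrease the likelihood.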
Since $P(Q|O,\bar{\lambda})$ does not depend on $\lambda$, it is a constant in the maximization and the term $-\sum_Q P(Q|O,\bar{\lambda})\log P(Q|O,\bar{\lambda})$ can be dropped, which leaves
$$
\max_\lambda \log P(O|\lambda)\geq \max_\lambda \sum_Q P(Q|O,\bar{\lambda})\log P(O,Q|\lambda).
$$
This shows that $\sum_Q P(Q|O,\bar{\lambda})\log P(O,Q|\lambda)$ is a lower bound on the log-likelihood $\log P(O|\lambda)$ being maximized. It is therefore enough to maximize $\sum_Q P(Q|O,\bar{\lambda})\log P(O,Q|\lambda)$ iteratively, re-estimating $\bar{\lambda}$ after each step. No iteration can decrease the likelihood, and the $\lambda$ at which $\sum_Q P(Q|O,\bar{\lambda})\log P(O,Q|\lambda)$ stops improving is taken as the estimate of the parameters (in general a local optimum of $P(O|\lambda)$).
Since the joint probability factorizes as follows (where $\pi_{q_0}$ is the probability of the initial state $q_0$ and $x_t$ denotes the observation $o_t$):
$$
P(O,Q|\lambda)=\pi_{q_0}\cdot a_{q_0q_1}b_{q_1}(x_1)\cdot a_{q_1q_2}b_{q_2}(x_2)\cdot a_{q_2q_3}b_{q_3}(x_3)\cdots a_{q_{T-1}q_T}b_{q_T}(x_T)
$$
it follows that:
$$
\begin{aligned}
\log P(O,Q|\lambda)
&= \log\big(\pi_{q_0}\cdot a_{q_0q_1}b_{q_1}(x_1)\cdot a_{q_1q_2}b_{q_2}(x_2)\cdot a_{q_2q_3}b_{q_3}(x_3)\cdots a_{q_{T-1}q_T}b_{q_T}(x_T)\big) \\
&= \log\pi_{q_0}+\log\big(b_{q_1}(x_1)b_{q_2}(x_2)\cdots b_{q_T}(x_T)\big)+\log\big(a_{q_0q_1}a_{q_1q_2}\cdots a_{q_{T-1}q_T}\big) \\
&= \log\pi_{q_0}+\sum_{t=1}^T\log b_{q_t}(x_t)+\sum_{t=1}^T\log a_{q_{t-1}q_t}
\end{aligned}
$$
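The split of $\log P(O,Q|\lambda)$ into an initial-state term, an emission sum, and a transition sum can be verified on one concrete path. This is a minimal sketch assuming discrete observation symbols and made-up parameter values; `path` holds $q_0,\dots,q_T$ and `obs` holds $x_1,\dots,x_T$.

```python
import math

# Hypothetical 2-state model with 2 discrete observation symbols.
pi = [0.6, 0.4]                # pi[i] = P(q_0 = i)
a = [[0.7, 0.3], [0.4, 0.6]]   # a[i][j] = P(q_{t+1} = j | q_t = i)
b = [[0.9, 0.1], [0.2, 0.8]]   # b[j][x] = P(x_t = x | q_t = j)

path = [0, 0, 1, 1]            # q_0, q_1, q_2, q_3
obs = [0, 1, 1]                # x_1, x_2, x_3 (stored 0-based, so x_t = obs[t-1])
T = len(obs)

# Product form: pi_{q_0} * prod_t a_{q_{t-1} q_t} * b_{q_t}(x_t).
product = pi[path[0]]
for t in range(1, T + 1):
    product *= a[path[t - 1]][path[t]] * b[path[t]][obs[t - 1]]

# Sum form: the three terms of the log decomposition.
log_pi = math.log(pi[path[0]])
log_b = sum(math.log(b[path[t]][obs[t - 1]]) for t in range(1, T + 1))
log_a = sum(math.log(a[path[t - 1]][path[t]]) for t in range(1, T + 1))

difference = abs(math.log(product) - (log_pi + log_b + log_a))
```

Working in the sum form is also what implementations tend to do in practice, because the product of many probabilities underflows quickly for long sequences.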
Substituting this into the bound gives:
$$
\begin{aligned}
\sum_Q P(Q|O,\bar{\lambda})\log P(O,Q|\lambda)
&= \sum_Q P(Q|O,\bar{\lambda})\Big(\log\pi_{q_0}+\sum_{t=1}^T\log b_{q_t}(x_t)+\sum_{t=1}^T\log a_{q_{t-1}q_t}\Big) \\
&= \sum_Q P(Q|O,\bar{\lambda})\log\pi_{q_0}+\sum_Q P(Q|O,\bar{\lambda})\sum_{t=1}^T\log b_{q_t}(x_t)+\sum_Q P(Q|O,\bar{\lambda})\sum_{t=1}^T\log a_{q_{t-1}q_t} \\
&= \sum_Q P(Q|O,\bar{\lambda})\log\pi_{q_0}+\sum_Q\sum_{t=1}^T P(Q|O,\bar{\lambda})\log b_{q_t}(x_t)+\sum_Q\sum_{t=1}^T P(Q|O,\bar{\lambda})\log a_{q_{t-1}q_t}
\end{aligned}
$$
The expression above can be rewritten as sums over states instead of sums over whole paths, by marginalizing $P(Q|O,\bar{\lambda})$ over every state that does not appear inside the corresponding logarithm:
$$
\begin{aligned}
&\sum_Q P(Q|O,\bar{\lambda})\log\pi_{q_0}+\sum_Q\sum_{t=1}^T P(Q|O,\bar{\lambda})\log b_{q_t}(x_t)+\sum_Q\sum_{t=1}^T P(Q|O,\bar{\lambda})\log a_{q_{t-1}q_t} \\
&= \sum_{i=1}^n P(q_0=i|O,\bar{\lambda})\log\pi_i+\sum_{i=1}^n\sum_{t=1}^T P(q_t=i|O,\bar{\lambda})\log b_i(x_t)+\sum_{i=1}^n\sum_{j=1}^n\sum_{t=0}^{T-1} P(q_t=i,q_{t+1}=j|O,\bar{\lambda})\log a_{ij}
\end{aligned}
$$
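The step from sums over paths to sums over states can be confirmed by brute force on a tiny model. In this sketch the logarithms in the state form are indexed by the state values themselves ($\log\pi_i$, $\log b_i(x_t)$, $\log a_{ij}$), and the transition sum pairs $q_t$ with $q_{t+1}$ for $t=0,\dots,T-1$. Discrete observation symbols, the helper names `gamma` and `xi`, and all numbers are assumptions made for illustration.

```python
import itertools
import math

# Hypothetical 2-state model with 2 discrete observation symbols.
pi = [0.6, 0.4]
a = [[0.7, 0.3], [0.4, 0.6]]
b = [[0.9, 0.1], [0.2, 0.8]]
obs = [0, 1, 1]                # x_1..x_T, stored 0-based (x_t = obs[t-1])
n, T = len(pi), len(obs)

def joint(path):
    """P(O,Q|lambda) for path = (q_0, ..., q_T)."""
    p = pi[path[0]]
    for t in range(1, T + 1):
        p *= a[path[t - 1]][path[t]] * b[path[t]][obs[t - 1]]
    return p

paths = list(itertools.product(range(n), repeat=T + 1))
evidence = sum(joint(Q) for Q in paths)                # P(O|lambda)
posterior = {Q: joint(Q) / evidence for Q in paths}    # P(Q|O,lambda)

# Path form: sum over every state sequence Q.
path_form = sum(
    p * (math.log(pi[Q[0]])
         + sum(math.log(b[Q[t]][obs[t - 1]]) for t in range(1, T + 1))
         + sum(math.log(a[Q[t - 1]][Q[t]]) for t in range(1, T + 1)))
    for Q, p in posterior.items()
)

def gamma(t, i):
    """Marginal P(q_t = i | O), obtained by summing the path posterior."""
    return sum(p for Q, p in posterior.items() if Q[t] == i)

def xi(t, i, j):
    """Marginal P(q_t = i, q_{t+1} = j | O)."""
    return sum(p for Q, p in posterior.items() if Q[t] == i and Q[t + 1] == j)

# State form: sums over states and time only.
state_form = (
    sum(gamma(0, i) * math.log(pi[i]) for i in range(n))
    + sum(gamma(t, i) * math.log(b[i][obs[t - 1]])
          for i in range(n) for t in range(1, T + 1))
    + sum(xi(t, i, j) * math.log(a[i][j])
          for i in range(n) for j in range(n) for t in range(T))
)
```

The two values agree up to floating-point error. The enumeration costs $n^{T+1}$ terms, so it only serves as a check; the marginals themselves are what the forward and backward recursions compute efficiently.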
Let
$$
L(a,b)=\sum_{i=1}^n P(q_0=i|O,\bar{\lambda})\log\pi_i+\sum_{i=1}^n\sum_{t=1}^T P(q_t=i|O,\bar{\lambda})\log b_i(x_t)+\sum_{i=1}^n\sum_{j=1}^n\sum_{t=0}^{T-1} P(q_t=i,q_{t+1}=j|O,\bar{\lambda})\log a_{ij}
$$
and define:
$$
L(a,b)=L_1(a,b)+L_2(a,b)+L_3(a,b)
$$
$$
L_1(a,b)=\sum_{i=1}^n P(q_0=i|O,\bar{\lambda})\log\pi_i
$$
$$
L_2(a,b)=\sum_{i=1}^n\sum_{t=1}^T P(q_t=i|O,\bar{\lambda})\log b_i(x_t)
$$
$$
L_3(a,b)=\sum_{i=1}^n\sum_{j=1}^n\sum_{t=0}^{T-1} P(q_t=i,q_{t+1}=j|O,\bar{\lambda})\log a_{ij}
$$
It can be seen that $L_2(a,b)$ and $L_3(a,b)$ can be optimized independently of each other: $L_2(a,b)$ is the term that depends on the parameter $b$, and $L_3(a,b)$ is the term that depends on the parameter $a$. $L_1(a,b)$ involves only the initial state $q_0$; although it appears in the same objective as $L_2(a,b)$ and $L_3(a,b)$, in practice the value of $q_0$ can be fixed first, after which $L_2(a,b)$ and $L_3(a,b)$ are optimized. See reference article 2 for further reading.
Before solving for the parameters $\lambda=\{a,b\}$, we first need to derive how to compute $P(q_t=i|O,\bar{\lambda})$ and $P(q_t=i,q_{t+1}=j|O,\bar{\lambda})$, which appear in $L_2(a,b)$ and $L_3(a,b)$. This is done with the well-known forward algorithm and backward algorithm. Throughout the derivation, the product rule $P(AB)=P(A|B)P(B)$ and its rearrangement $P(A|B)=\frac{P(AB)}{P(B)}$, which underlie Bayes' formula, are used repeatedly.
Before the derivation, define the notation $O_t^{t+m}=\{o_t\rightarrow o_{t+1}\rightarrow ...\rightarrow o_{t+m}\}$ for the observations from time $t$ to time $t+m$.
$$
\begin{aligned}
P(q_t=i|O,\bar{\lambda})
&= \frac{P(q_t=i,O|\bar{\lambda})}{P(O|\bar{\lambda})}
= \frac{P(q_t=i,O_1^T|\bar{\lambda})}{P(O|\bar{\lambda})}
= \frac{P(q_t=i,O_1^t,O_{t+1}^T|\bar{\lambda})}{P(O|\bar{\lambda})} \\
&= \frac{P(O_{t+1}^T|q_t=i,O_1^t,\bar{\lambda})P(q_t=i,O_1^t|\bar{\lambda})}{P(O|\bar{\lambda})} \\
&= \frac{P(O_{t+1}^T|q_t=i,\bar{\lambda})P(q_t=i,O_1^t|\bar{\lambda})}{P(O|\bar{\lambda})}
\end{aligned}
$$
Next, define
$$
\alpha_t(i)=P(q_t=i,O_1^t|\bar{\lambda}),\qquad \beta_t(i)=P(O_{t+1}^T|q_t=i,\bar{\lambda})
$$
Then $P(q_t=i|O,\bar{\lambda})=\frac{\alpha_t(i)\beta_t(i)}{P(O|\bar{\lambda})}$, where
$$
\begin{aligned}
\alpha_t(i) &= P(q_t=i,O_1^t|\bar{\lambda}) \\
&= \sum_{h=1}^n P(q_{t-1}=h,q_t=i,O_1^{t-1},o_t|\bar{\lambda}) \\
&= \sum_{h=1}^n P(q_t=i,o_t|q_{t-1}=h,O_1^{t-1},\bar{\lambda})P(q_{t-1}=h,O_1^{t-1}|\bar{\lambda}) \\
&= \sum_{h=1}^n P(q_t=i,o_t|q_{t-1}=h,\bar{\lambda})P(q_{t-1}=h,O_1^{t-1}|\bar{\lambda}) \\
&= \sum_{h=1}^n P(o_t|q_t=i,q_{t-1}=h,\bar{\lambda})P(q_t=i|q_{t-1}=h,\bar{\lambda})P(q_{t-1}=h,O_1^{t-1}|\bar{\lambda}) \\
&= \sum_{h=1}^n b_i(o_t)a_{hi}\alpha_{t-1}(h) \\
&= \sum_{h=1}^n \alpha_{t-1}(h)a_{hi}b_i(o_t)
\end{aligned}
$$
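The recursion $\alpha_t(i)=\sum_h \alpha_{t-1}(h)a_{hi}b_i(o_t)$ translates almost directly into code. The sketch below is a minimal illustration, not a reference implementation: it assumes discrete observation symbols, and it assumes the base case $\alpha_0(i)=\pi_i$ (the initial state $q_0$ emits nothing), which this section does not state explicitly. The brute-force function exists only to check the result.

```python
import itertools

def forward(pi, a, b, obs):
    """Return alpha[t][i] for t = 0..T via the forward recursion."""
    n = len(pi)
    alpha = [list(pi)]  # assumed base case: alpha_0(i) = pi_i
    for o_t in obs:
        prev = alpha[-1]
        # alpha_t(i) = (sum_h alpha_{t-1}(h) * a_hi) * b_i(o_t)
        alpha.append([
            sum(prev[h] * a[h][i] for h in range(n)) * b[i][o_t]
            for i in range(n)
        ])
    return alpha

def brute_force_likelihood(pi, a, b, obs):
    """P(O|lambda) by enumerating every state sequence q_0..q_T."""
    n, T = len(pi), len(obs)
    total = 0.0
    for path in itertools.product(range(n), repeat=T + 1):
        p = pi[path[0]]
        for t in range(1, T + 1):
            p *= a[path[t - 1]][path[t]] * b[path[t]][obs[t - 1]]
        total += p
    return total

# Hypothetical 2-state model with 2 discrete observation symbols.
pi = [0.6, 0.4]
a = [[0.7, 0.3], [0.4, 0.6]]
b = [[0.9, 0.1], [0.2, 0.8]]
obs = [0, 1, 1]

alpha = forward(pi, a, b, obs)
likelihood = sum(alpha[-1])  # P(O|lambda) = sum_i alpha_T(i)
```

The recursion needs on the order of $n^2T$ multiplications, while the enumeration grows as $n^{T+1}$; both give the same likelihood on this toy input.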
Similarly, for the backward variable:
$$
\begin{aligned}
\beta_t(i) &= P(O_{t+1}^T|q_t=i,\bar{\lambda}) \\
&= \sum_{j=1}^n P(q_{t+1}=j,O_{t+1}^T|q_t=i,\bar{\lambda}) \\
&= \sum_{j=1}^n P(q_{t+1}=j,o_{t+1},O_{t+2}^T|q_t=i,\bar{\lambda}) \\
&= \sum_{j=1}^n P(O_{t+2}^T|q_{t+1}=j,o_{t+1},q_t=i,\bar{\lambda})P(q_{t+1}=j,o_{t+1}|q_t=i,\bar{\lambda}) \\
&= \sum_{j=1}^n P(O_{t+2}^T|q_{t+1}=j,\bar{\lambda})P(q_{t+1}=j,o_{t+1}|q_t=i,\bar{\lambda}) \\
&= \sum_{j=1}^n \beta_{t+1}(j)P(q_{t+1}=j,o_{t+1}|q_t=i,\bar{\lambda}) \\
&= \sum_{j=1}^n \beta_{t+1}(j)P(o_{t+1}|q_{t+1}=j,\bar{\lambda})P(q_{t+1}=j|q_t=i,\bar{\lambda}) \\
&= \sum_{j=1}^n \beta_{t+1}(j)a_{ij}b_j(o_{t+1}) \\
&= \sum_{j=1}^n a_{ij}b_j(o_{t+1})\beta_{t+1}(j)
\end{aligned}
$$
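The backward recursion $\beta_t(i)=\sum_j a_{ij}b_j(o_{t+1})\beta_{t+1}(j)$ can be sketched in the same way and combined with the forward variable to obtain the state posterior $P(q_t=i|O,\bar{\lambda})$. As before, this is an illustration under assumptions that are not stated in the text: discrete observation symbols, the base cases $\alpha_0(i)=\pi_i$ and $\beta_T(i)=1$, and hypothetical names and numbers throughout.

```python
def forward(pi, a, b, obs):
    """alpha[t][i] for t = 0..T, with assumed base case alpha_0(i) = pi_i."""
    n = len(pi)
    alpha = [list(pi)]
    for o_t in obs:
        prev = alpha[-1]
        alpha.append([
            sum(prev[h] * a[h][i] for h in range(n)) * b[i][o_t]
            for i in range(n)
        ])
    return alpha

def backward(a, b, obs):
    """beta[t][i] for t = 0..T, with assumed base case beta_T(i) = 1."""
    n, T = len(a), len(obs)
    beta = [[1.0] * n for _ in range(T + 1)]
    for t in range(T - 1, -1, -1):
        o_next = obs[t]  # obs is 0-based, so obs[t] holds o_{t+1}
        # beta_t(i) = sum_j a_ij * b_j(o_{t+1}) * beta_{t+1}(j)
        beta[t] = [
            sum(a[i][j] * b[j][o_next] * beta[t + 1][j] for j in range(n))
            for i in range(n)
        ]
    return beta

# Hypothetical 2-state model with 2 discrete observation symbols.
pi = [0.6, 0.4]
a = [[0.7, 0.3], [0.4, 0.6]]
b = [[0.9, 0.1], [0.2, 0.8]]
obs = [0, 1, 1]
n, T = len(pi), len(obs)

alpha = forward(pi, a, b, obs)
beta = backward(a, b, obs)
likelihood = sum(alpha[T])  # P(O|lambda)

# State posterior: P(q_t = i | O) = alpha_t(i) * beta_t(i) / P(O|lambda).
gamma = [
    [alpha[t][i] * beta[t][i] / likelihood for i in range(n)]
    for t in range(T + 1)
]
```

A useful sanity check is that $\sum_i \alpha_t(i)\beta_t(i)$ equals $P(O|\bar{\lambda})$ at every $t$, so each row of `gamma` sums to 1. For long sequences the raw products underflow, and practical implementations usually rescale $\alpha$ and $\beta$ or work in the log domain.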