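Suppose you have already trained a chain model and want to use it for alignment (normally alignment is done with a GMM). Note that this alignment requires the labels y (the transcripts).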
1 Prepare the data
This follows the script steps/nnet3/align.sh. The data must come as <audio, word-segmented transcript> pairs; preparing it is the same as preparing data for ASR.
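As a rough sketch of what such a data directory looks like, the snippet below builds a toy one by hand. The utterance IDs, speaker ID, wav paths and transcripts are all made up for illustration. In a real setup spk2utt is usually generated with utils/utt2spk_to_spk2utt.pl and the directory checked with utils/validate_data_dir.sh; plain awk is used here only so the sketch runs without Kaldi.

```shell
#!/usr/bin/env bash
# Toy Kaldi-style data directory; all IDs, paths and transcripts are hypothetical.
datadir=$(mktemp -d)/test
mkdir -p "$datadir"

# wav.scp: <utterance-id> <path to audio>
cat > "$datadir/wav.scp" <<'EOF'
spk1_utt1 /path/to/audio/utt1.wav
spk1_utt2 /path/to/audio/utt2.wav
EOF

# text: <utterance-id> <word-segmented transcript>
cat > "$datadir/text" <<'EOF'
spk1_utt1 今天 天气 很 好
spk1_utt2 我们 去 公园
EOF

# utt2spk: <utterance-id> <speaker-id>
cat > "$datadir/utt2spk" <<'EOF'
spk1_utt1 spk1
spk1_utt2 spk1
EOF

# spk2utt: <speaker-id> <utterance-id> ...
# (plain-awk stand-in for utils/utt2spk_to_spk2utt.pl)
awk '{ utts[$2] = utts[$2] " " $1 } END { for (s in utts) print s utts[s] }' \
  "$datadir/utt2spk" | LC_ALL=C sort > "$datadir/spk2utt"

cat "$datadir/spk2utt"
```

The first column of wav.scp, text and utt2spk must carry the same utterance IDs, since the feature extraction and alignment steps join the files on that key.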
2 Compute features to get feats.scp
. ./path.sh
. ./cmd.sh
# Basic 13-dimensional MFCC features are extracted here
mfccdir=mfcc
steps/make_mfcc.sh --cmd "$train_cmd" --nj 30 data/test exp/make_mfcc/test $mfccdir || exit 1;
steps/compute_cmvn_stats.sh data/test exp/make_mfcc/test $mfccdir || exit 1;
utils/fix_data_dir.sh data/test
. ./path.sh
. ./cmd.sh
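# Extract hires MFCC features; these are 40-dimensional and are used by the chain model
# Assumes data/test_hires already exists (e.g. a copy of data/test made with utils/copy_data_dir.sh)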
mfccdir=mfcc_perturbed_hires
steps/make_mfcc.sh --nj 30 --mfcc-config conf/mfcc_hires.conf \
--cmd "$train_cmd" data/test_hires exp/make_hires/test $mfccdir
steps/compute_cmvn_stats.sh data/test_hires exp/make_hires/test $mfccdir
utils/fix_data_dir.sh data/test_hires
# create MFCC data dir without pitch to extract iVector
#utils/data/limit_feature_dim.sh 0:39 data/test_hires data/test_hires_nopitch
#steps/compute_cmvn_stats.sh data/test_hires_nopitch exp/make_hires/test $mfccdir
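[Note]
- [1] The basic MFCC features are extracted first, then the mfcc_hires features. The hires dimension (40) comes from --num-ceps in conf/mfcc_hires.conf.
- [2] exp/make_hires/test holds the logs written during feature extraction.
What is ultimately used is the mfcc_hires features (the ones the chain model needs).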
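Before moving on it can help to confirm that every transcribed utterance actually received features. The helper below is a minimal sketch of such a check; the function name missing_feats and the toy file contents are made up for illustration, and it relies only on text and feats.scp both being keyed by utterance ID in the first column.

```shell
#!/usr/bin/env bash
# Print utterance IDs that appear in a `text` file but have no entry in `feats.scp`.
# Usage: missing_feats <text> <feats.scp>
missing_feats() {
  # First pass (feats.scp) records the IDs that have features;
  # second pass (text) prints the IDs that were not recorded.
  awk 'NR == FNR { have[$1] = 1; next } !($1 in have) { print $1 }' "$2" "$1"
}

# Toy demo with hypothetical IDs and archive offsets.
demo=$(mktemp -d)
printf 'utt1 hello world\nutt2 good morning\n' > "$demo/text"
printf 'utt1 mfcc/raw_mfcc_test.1.ark:5\n' > "$demo/feats.scp"
missing_feats "$demo/text" "$demo/feats.scp"   # lists utt2

# With Kaldi on PATH, the feature dimension can be checked as well, e.g.:
#   feat-to-dim scp:data/test_hires/feats.scp -
```

On the real data this would be called as missing_feats data/test_hires/text data/test_hires/feats.scp; an empty result means nothing was dropped during extraction.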
Contents of conf/mfcc_hires.conf:
# config for high-resolution MFCC features, intended for neural network training.
# Note: we keep all cepstra, so it has the same info as filterbank features,
# but MFCC is more easily compressible (because less correlated) which is why
# we prefer this method.
--use-energy=false # use average of log energy, not energy.
--sample-frequency=16000 # audio here is sampled at 16kHz (Switchboard would be 8kHz)
--num-mel-bins=40 # similar to Google's setup.
--num-ceps=40 # there is no dimensionality reduction.
--low-freq=40 # low cutoff frequency for mel bins
--high-freq=-200 # high cutoff frequency, relative to Nyquist of 8000 (=7800)
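A --high-freq value of zero or below is read as an offset from the Nyquist frequency rather than as an absolute cutoff. The arithmetic for the values in this config can be sketched as follows; the variable names are only for illustration.

```shell
#!/usr/bin/env bash
# Effective upper cutoff of the mel bins when --high-freq is <= 0
# (i.e. given as an offset from the Nyquist frequency).
sample_frequency=16000
high_freq=-200

nyquist=$(( sample_frequency / 2 ))
if [ "$high_freq" -le 0 ]; then
  effective_high=$(( nyquist + high_freq ))
else
  effective_high=$high_freq
fi

echo "Nyquist: ${nyquist} Hz, effective high cutoff: ${effective_high} Hz"
```

With 16 kHz audio the mel bins therefore span 40 Hz to 7800 Hz.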
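3 Build the decoding graph and align
This part mainly follows steps/nnet3/align.sh; you can also just run that script directly. No i-vector features are used here, so that option is not passed; if your model uses them, pass it yourself.
If you need i-vector features, see run_ivector_common.sh for how to extract them.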
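# If the model uses i-vectors, add: --online_ivector_dir exp/nnet3/ivectors_test
# (a commented-out line must not sit inside the backslash continuation, or the command is cut short)
steps/nnet3/align.sh --use_gpu true --nj 30 --beam 30 \
data/test_hires data/lang exp/nnet3/tdnn_sp exp/nnet3_sp_ali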
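Since the i-vector option is only needed for some models, one way to keep a single invocation is to assemble the option string conditionally. The helper below is a minimal sketch that only prints the command (a dry run); the function name build_align_cmd is made up for illustration, and the paths are the ones used in this post.

```shell
#!/usr/bin/env bash
# Print the steps/nnet3/align.sh call; the i-vector option is added only
# when a non-empty directory is passed as the first argument.
build_align_cmd() {
  ivector_dir=$1
  opts="--use_gpu true --nj 30 --beam 30"
  if [ -n "$ivector_dir" ]; then
    opts="$opts --online_ivector_dir $ivector_dir"
  fi
  echo "steps/nnet3/align.sh $opts data/test_hires data/lang exp/nnet3/tdnn_sp exp/nnet3_sp_ali"
}

build_align_cmd ""                         # model trained without i-vectors
build_align_cmd exp/nnet3/ivectors_test    # model trained with i-vectors

# To actually run it from the recipe directory:
#   eval "$(build_align_cmd exp/nnet3/ivectors_test)"
# The alignments are written to exp/nnet3_sp_ali/ali.*.gz and can be inspected with, e.g.:
#   ali-to-phones --ctm-output exp/nnet3/tdnn_sp/final.mdl \
#     "ark:gunzip -c exp/nnet3_sp_ali/ali.1.gz|" -
```

Whether to pass the i-vector directory has to match how the model was trained: a model trained with i-vectors expects them at alignment time too.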