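This walkthrough covers run.sh for a Kaldi-based DNN-HMM speech recognition system. The script wraps every step of the recipe, from the initial data preparation through to final decoding.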
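The script starts by sourcing cmd.sh, which decides whether jobs run on the local machine (run.pl) or are submitted to a grid engine (queue.pl). The following is only a minimal sketch of what such a file might contain. The variable names `train_cmd`, `decode_cmd` and `cuda_cmd` follow the usual Kaldi recipe convention and are an assumption about this recipe's cmd.sh, and the `use_queue` switch is hypothetical, added here purely for illustration.

```shell
# Hypothetical cmd.sh sketch: choose the job launcher used by each stage.
# run.pl runs jobs on the local machine; queue.pl submits them to a grid engine.
# Variable names are assumed from the common Kaldi convention.
use_queue=${use_queue:-false}   # hypothetical switch, not part of the original recipe

if [ "$use_queue" = true ]; then
  # Cluster with a grid engine available
  export train_cmd="queue.pl"
  export decode_cmd="queue.pl"
  export cuda_cmd="queue.pl"
else
  # Single machine without a queueing system
  export train_cmd="run.pl"
  export decode_cmd="run.pl"
  export cuda_cmd="run.pl"
fi
```

On a single workstation the run.pl branch is the usual choice, since queue.pl needs a working grid engine to submit to.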
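#!/bin/bash
# Use bash rather than dash: "set -o pipefail" (used below) is a bash option and is not available in many dash versions.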
# Copyright 2016 Tsinghua University (Author: Dong Wang, Xuewei Zhang)
# 2018 Tsinghua University (Author: Zhiyuan Tang)
# Apache 2.0.
. ./cmd.sh ## You'll want to change cmd.sh to something that will work on your system.
## This relates to the queue.
## Depending on your system, change the last three lines of cmd.sh to use queue.pl (grid engine) or run.pl (local machine).
. ./path.sh
n=8 # parallel jobs
set -euo pipefail # stop on any command failure (-e), on use of an unset variable (-u), or when any command in a pipeline fails (pipefail)
###### Bookmark: basic preparation ######
# corpus and trans directory
# Specify the directory holding the dataset and training files.
# The dataset used here is THCHS-30.
thchs=/home/yy/kaldi-trunk/egs/cslt_cases/asr_baseline/data/data_thchs30
# Download the dataset
# You can obtain the database by uncommenting the following lines
# [ -d $thchs ] || mkdir -p $thchs
# echo "downloading THCHS30 at $thchs ..."
# local/download_and_untar.sh $thchs http://www.openslr.org/resources/18 data_thchs30
# local/download_and_untar.sh $thchs http://www.openslr.org/r