Howl Installation and Configuration Guide

骆朵绮

Published 2024-10-18 10:33:24

License: CC 4.0 BY-SA

Article link: https://blog.youkuaiyun.com/gitblog_01201/article/details/143036659

howl: Wake word detection modeling toolkit for Firefox Voice, supporting open datasets like Speech Commands and Common Voice. Project URL: https://gitcode.com/gh_mirrors/how/howl

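1. Project Overview and Main Programming Language

Howl is a wake word detection modeling toolkit for Firefox Voice that supports open datasets such as Speech Commands and Common Voice. It is written primarily in Python and is suited to building speech recognition and wake word detection applications.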

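2. Key Technologies and Frameworks

Howl relies on the following technologies and frameworks:

Python: the main programming language, used to implement the wake word detection models and tooling.
PyTorch: the deep learning framework used to train and run the wake word detection models.
Montreal Forced Aligner (MFA): generates orthographic transcription alignments for audio files, that is, timestamps that map each transcribed word to its position in the audio.
PyAudio: a library for audio input and output.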

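3. Prerequisites and Detailed Installation Steps

Prerequisites

Before you start, make sure your system has the following:

Python 3.6 or later.
Git, for cloning the project repository.
PyTorch, installed with the method appropriate for your platform (covered in Step 2).
PyAudio and its dependencies (covered in Step 3).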
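
A quick way to check these requirements before going further is a small script like the one below. It is only a sketch: the function name `check_prerequisites` is made up for this guide and is not part of Howl, and it only tests whether the modules can be found, not whether they work.

```python
import importlib.util
import shutil
import sys


def check_prerequisites(min_python=(3, 6), modules=("torch", "pyaudio")):
    """Return a dict mapping each prerequisite name to True or False.

    Hypothetical helper for this guide; not part of Howl itself.
    """
    results = {
        # sys.version_info compares like a tuple, e.g. (3, 8, 10, ...) >= (3, 6)
        "python": sys.version_info >= min_python,
        # shutil.which returns None when the executable is not on PATH
        "git": shutil.which("git") is not None,
    }
    for name in modules:
        # find_spec returns None when a top-level module cannot be located
        results[name] = importlib.util.find_spec(name) is not None
    return results


if __name__ == "__main__":
    for item, ok in check_prerequisites().items():
        print("{}: {}".format(item, "ok" if ok else "MISSING"))
```

Anything reported as MISSING can be installed in the steps that follow.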

Detailed Installation Steps

Step 1: Clone the repository

First, use Git to clone the Howl repository to your local machine:

git clone https://github.com/castorini/howl.git
cd howl

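Step 2: Install PyTorch

Install PyTorch for your operating system. For example, on Linux you can use the following command (if you need a specific CUDA build, use the command from the official PyTorch installation page instead):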

pip install torch

Step 3: Install PyAudio

Install PyAudio and its dependencies with your package manager. For example, on Ubuntu:

sudo apt-get install python3-pyaudio
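
To confirm that PyAudio can see a microphone, you could list the available input devices. The sketch below uses PyAudio's device query calls; the helper name `list_input_devices` is an assumption for this guide, and it returns an empty list when PyAudio is not installed.

```python
def list_input_devices():
    """Return (index, name) pairs for devices with at least one input channel.

    Hypothetical helper for this guide; returns [] if PyAudio is unavailable.
    """
    try:
        import pyaudio  # imported lazily so the sketch degrades gracefully
    except ImportError:
        return []

    audio = pyaudio.PyAudio()
    try:
        devices = []
        for index in range(audio.get_device_count()):
            info = audio.get_device_info_by_index(index)
            # Devices that can record report a positive maxInputChannels value
            if info.get("maxInputChannels", 0) > 0:
                devices.append((index, info.get("name")))
        return devices
    finally:
        # Always release PortAudio resources
        audio.terminate()


if __name__ == "__main__":
    for index, name in list_input_devices():
        print(index, name)
```

An empty result usually means either that PyAudio is not installed or that no recording device is visible to the system.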

Step 4: Install the project dependencies

Use pip to install the remaining dependencies Howl needs:

pip install -r requirements.txt
pip install -r requirements_training.txt
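
If you want to see which requirements are still missing after running pip, a rough check along the following lines may help. This is a sketch under assumptions: the helper names are invented for this guide, the parsing handles only simple `name==version` style lines (not URLs or editable installs), and `importlib.metadata` requires Python 3.8 or later.

```python
import re
from importlib import metadata  # available from Python 3.8


def requirement_names(lines):
    """Extract distribution names from simple requirements-file lines."""
    names = []
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def missing_requirements(lines):
    """Return the names that are not installed in the current environment."""
    missing = []
    for name in requirement_names(lines):
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    try:
        with open("requirements.txt") as handle:
            print(missing_requirements(handle.readlines()))
    except FileNotFoundError:
        print("requirements.txt not found; run this from the howl directory")
```

It does not compare version constraints, so an installed but outdated package will not be reported.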

Step 5: Download Montreal Forced Aligner (MFA)

Run the following script to download MFA and set up the required English pronunciation dictionary:

./download_mfa.sh

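Step 6: Generate the dataset

Generating a dataset for a custom wake word involves three stages:

Generate a raw audio dataset that Howl can load.
Generate orthographic transcription alignments for each audio file.
Attach the generated alignments to the raw audio dataset produced in the first stage.

Run the following script to generate the dataset: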

./generate_dataset.sh <common voice dataset path> <underscore separated wakeword (e.g., hey_fire_fox)> <inference sequence (e.g., [0,1,2])> <(Optional) "true" to skip negative dataset generation>
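
The script's arguments can be easy to get wrong, so here is a small sketch that assembles them from a spoken phrase. The helper `build_generate_dataset_command` is hypothetical, and it assumes the inference sequence is simply one index per word in order, which matches the `hey_fire_fox` and `[0,1,2]` example above but is not confirmed for other cases.

```python
def build_generate_dataset_command(common_voice_path, wake_phrase, skip_negative=False):
    """Build the argument list for ./generate_dataset.sh.

    Hypothetical helper for this guide; not part of Howl.
    """
    words = wake_phrase.lower().split()
    if not words:
        raise ValueError("wake phrase must contain at least one word")

    # The script expects the wake word with underscores between words
    wakeword = "_".join(words)

    # Assumption: one index per word, in order, as in the [0,1,2] example
    inference_sequence = "[" + ",".join(str(i) for i in range(len(words))) + "]"

    command = ["./generate_dataset.sh", common_voice_path, wakeword, inference_sequence]
    if skip_negative:
        # Optional fourth argument: "true" skips negative dataset generation
        command.append("true")
    return command


if __name__ == "__main__":
    print(" ".join(build_generate_dataset_command("/path/to/common-voice", "hey fire fox")))
```

The returned list can be passed to `subprocess.run` from the repository root, or printed and pasted into a shell.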

Step 7: Train the model

Load the relevant environment variables, then train the model:

source envs/res8.env
python -m training.run.train -i datasets/fire/positive datasets/fire/negative --model res8 --workspace workspaces/fire-res8

If the training datasets are small, adding the --use-stitched-datasets option is recommended.
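
If you prefer to launch training from Python rather than from a shell, the effect of `source envs/res8.env` can be approximated by parsing the file and passing the variables to the child process. The sketch below assumes the env file contains plain `KEY=VALUE` or `export KEY=VALUE` lines; its actual contents are not shown in this guide, and the parser does not expand shell variables such as `$HOME`. Both function names are hypothetical.

```python
import os
import shlex
import subprocess


def parse_env_file(text):
    """Parse KEY=VALUE and `export KEY=VALUE` lines into a dict.

    Assumption: the env file uses only these simple forms.
    """
    env = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # shlex handles quotes and strips trailing comments
        tokens = shlex.split(line, comments=True)
        if tokens and tokens[0] == "export":
            tokens = tokens[1:]
        for token in tokens:
            key, sep, value = token.partition("=")
            if sep and key:
                env[key] = value
    return env


def run_with_env_file(env_path, command):
    """Run `command` (a list of arguments) with variables from the env file added."""
    with open(env_path) as handle:
        extra = parse_env_file(handle.read())
    merged = dict(os.environ, **extra)
    return subprocess.run(command, env=merged, check=True)
```

For example, `run_with_env_file("envs/res8.env", [...])` would take the training command from this step split into a list of arguments.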

Step 8: Run the model

Once training is complete, you can run the model in demo mode with the following command:

python -m training.run.demo --model res8 --workspace workspaces/fire-res8

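Alternatively, the training in Step 7 can be run with the wrapper script, which takes the env file, model type, workspace path, and dataset paths as arguments: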

./train_model.sh <env file path (e.g., envs/res8.env)> <model type (e.g., res8)> <workspace path (e.g., workspaces/fire-res8)> <dataset1 (e.g., datasets/fire-positive)> <dataset2 (e.g., datasets/fire-negative)>

With these steps completed, Howl is installed and configured, and you can start using it to develop and test wake word detection.

Authoring statement: parts of this article were generated with AI assistance (AIGC) and are for reference only.