Speech debugging under ROS - 2

kris_zhao

1. Install pocketsphinx

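  Sphinx is a large-vocabulary, speaker-independent, continuous English speech recognition system developed at Carnegie Mellon University in the United States. A continuous speech recognition system can be roughly divided into four parts: feature extraction, acoustic model training, language model training, and the decoder.

  PocketSphinx is an embedded speech recognition engine that is small in both computational cost and footprint. It was modified and optimized from Sphinx-2 to meet the needs of embedded systems, and is the first open-source, embedded-oriented, medium-vocabulary continuous speech recognition project. Its recognition accuracy is about the same as that of Sphinx-2.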

  The CMU Pocket Sphinx speech recognizer uses gstreamer to automatically split the incoming audio into utterances to be recognized, and offers services to start and stop recognition. Currently, the recognizer requires a language model and a dictionary file. These can be built automatically from a corpus of sentences using the Online Sphinx Knowledge Base Tool.

  sound_play provides a ROS node that translates commands on a ROS topic (robotsound) into sounds. The node supports built-in sounds, playing OGG/WAV files, and doing speech synthesis via festival. C++ and Python bindings allow this node to be used without understanding the details of the message format, allowing faster development and resilience to message format changes.

  The sound_play package uses the CMU Festival TTS library to generate synthetic speech.

  We use a ready-made language model and dictionary file.

  Speech recognition basics:   http://blog.youkuaiyun.com/zouxy09/article/details/7941585
  CMU Sphinx website:   http://cmusphinx.sourceforge.net/wiki/tutorialpocketsphinx
  Installing and building pocketsphinx: http://blog.youkuaiyun.com/zouxy09/article/details/7942784/

  1) Download sphinxbase-5prealpha and pocketsphinx-5prealpha and extract them into the same directory:  http://cmusphinx.sourceforge.net/wiki/tutorialoverview

  2) The following dependencies are required:
     gcc, automake, autoconf, libtool, bison, swig (version 2.0 or later), the Python development package, and the PulseAudio development package

  3) Install any missing dependencies until no more errors are reported, then enter the extracted sphinxbase-5prealpha folder
  4) $ ./autogen.sh
     $ ./configure
     $ make
     $ sudo make install

  5) Set the environment variables
     $ export LD_LIBRARY_PATH=/usr/local/lib
     $ export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig

  6) Enter the pocketsphinx-5prealpha folder
     $ ./configure
     $ make
     $ sudo make install

  7) Test whether recognition works
     $ pocketsphinx_continuous -inmic yes
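
  Before running autogen.sh and configure, it can save time to check which of the command-line build tools from step 2 are already on the PATH. The following is a minimal sketch, assuming Python 3 and assuming the executables carry the same names as the tools listed above (this may differ between distributions). It does not check the Python or PulseAudio development packages, which are libraries rather than commands.

```python
import shutil

# Assumption: these executable names match the tools listed in step 2.
# On some distributions the command names may differ.
DEFAULT_TOOLS = ["gcc", "automake", "autoconf", "libtool", "bison", "swig"]


def missing_tools(tools=DEFAULT_TOOLS):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All listed build tools were found on PATH.")
```

  Anything reported as missing still has to be installed with the package manager before the build in step 4.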

2. Debs installation: install Turtlebot and control it remotely (ros-hydro-desktop-full already installed)

 http://www.cnblogs.com/cv-pr/p/5015657.html
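 The Turtlebot, which has its own power supply and 3D sensor, carries host A. Host A acts as the master, and roscore is started on this machine. The PC connects to A remotely and communicates with it; the PC does not need to start roscore, and the Turtlebot can be controlled from the remote PC.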
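
 For the remote PC to use the roscore running on host A, ROS needs to know where the master is. The sketch below builds the two environment variables normally used for this, ROS_MASTER_URI and ROS_HOSTNAME, and prints them as export lines. The host names turtlebot-a and my-pc are hypothetical placeholders, not values from this setup; 11311 is the default roscore port.

```python
def remote_ros_env(master_host, local_host, port=11311):
    """Build the environment a remote PC needs to reach another machine's roscore."""
    return {
        # Where the ROS master (roscore on host A) is listening.
        "ROS_MASTER_URI": "http://{}:{}".format(master_host, port),
        # The name under which other machines can reach this PC.
        "ROS_HOSTNAME": local_host,
    }


def as_export_lines(env):
    """Format the variables as shell export lines, sorted by name."""
    return ["export {}={}".format(key, value) for key, value in sorted(env.items())]


if __name__ == "__main__":
    # Hypothetical host names, for illustration only.
    for line in as_export_lines(remote_ros_env("turtlebot-a", "my-pc")):
        print(line)
```

 The printed lines can be run in the PC's terminal or appended to ~/.bashrc, provided both machines can resolve each other's host names.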

 $ sudo apt-get install ros-hydro-turtlebot ros-hydro-turtlebot-apps ros-hydro-turtlebot-viz ros-hydro-turtlebot-simulator ros-hydro-kobuki-ftdi

 Add the udev rules for the Kobuki base
 $ . /opt/ros/hydro/setup.bash

 $ rosrun kobuki_ftdi create_udev_rules

 Configure the environment
 $ echo "source /opt/ros/hydro/setup.bash" >> ~/.bashrc

 Time synchronization

3. Create the speech library

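 Create a *.txt text file in any folder and write the sentences to be recognized into it, one per line, for example:
 turn around
 go forward
 stop
 Note: the file must not contain any punctuation, and numbers must be spelled out. For example, write don't as do not or dont, and write 54 as fifty four. Save and exit.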
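
 The rule above is easy to break by accident, so the corpus can be checked before it is uploaded. The following is a small sketch, assuming that only ASCII letters and spaces are acceptable in a corpus line. The function name find_bad_lines is made up for this illustration. It only reports offending lines and does not try to rewrite them.

```python
import re

# Assumption: a valid corpus line consists of ASCII letters and spaces only.
_ALLOWED = re.compile(r"^[A-Za-z ]+$")


def find_bad_lines(text):
    """Return (line_number, line) pairs that contain punctuation or digits."""
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        # Empty lines are ignored; everything else must match the pattern.
        if stripped and not _ALLOWED.match(stripped):
            bad.append((number, stripped))
    return bad


if __name__ == "__main__":
    corpus = "turn around\ngo forward\ndon't stop\ngo 54 steps\nstop\n"
    for number, line in find_bad_lines(corpus):
        print("line {}: {!r} needs rewriting".format(number, line))
```

 In this sample the third and fourth lines are reported, because they contain an apostrophe and digits respectively.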
 
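 Use the online tool LMTool to build the language model and speech library
 Go to  http://www.speech.cs.cmu.edu/tools/lmtool-new.html
 Upload the .txt file and click 'Compile Knowledge Base'
 Download the archive labeled 'COMPRESSED TARBALL' and extract it
 Enter the extracted folder and rename the files, e.g.  $ rename -f 's/3026/nav_commands/' *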
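
 Where the Perl-style rename command is not available, the same renaming can be done with a few lines of Python. This is a sketch that mirrors the command above: it replaces the first occurrence of the old number in each file name and overwrites an existing target, as -f does. The function name rename_files is made up for this illustration, and 3026 and nav_commands are simply the values from the example command.

```python
import os


def rename_files(directory, old, new):
    """Replace the first occurrence of `old` with `new` in each file name.

    Returns a list of (old_name, new_name) pairs for the files renamed.
    """
    renamed = []
    for name in sorted(os.listdir(directory)):
        if old in name:
            target = name.replace(old, new, 1)
            # os.replace overwrites an existing target, like `rename -f`.
            os.replace(os.path.join(directory, name),
                       os.path.join(directory, target))
            renamed.append((name, target))
    return renamed


if __name__ == "__main__":
    # Renames files in the current directory, e.g. 3026.dic -> nav_commands.dic
    for old_name, new_name in rename_files(".", "3026", "nav_commands"):
        print(old_name, "->", new_name)
```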
 
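 Test: (the pocketsphinx_continuous decoder uses the -lm option to specify the language model to load and the -dict option to specify the dictionary to load)

 Open a terminal and run pocketsphinx_continuous -inmic yes -dict /..(path where the archive was extracted in the previous step)/****.dic -lm /..(the same path)/****.lm, where **** is the four-digit number obtained in the previous step.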
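
 Because the .dic and .lm files share a directory and a base name, the command line can be assembled from those two values. The sketch below only builds and prints the command; it uses the same -inmic, -dict and -lm options as the test command above. The directory /tmp/kb and the name 3026 are placeholders for illustration, not paths from this setup.

```python
import os
import shlex


def pocketsphinx_command(model_dir, name):
    """Build the argument list for pocketsphinx_continuous.

    `model_dir` is the folder holding the extracted LMTool files and
    `name` is their common base name (the four-digit number, or the new
    name if the files were renamed).
    """
    return [
        "pocketsphinx_continuous",
        "-inmic", "yes",
        "-dict", os.path.join(model_dir, name + ".dic"),
        "-lm", os.path.join(model_dir, name + ".lm"),
    ]


if __name__ == "__main__":
    # Placeholder directory and name, for illustration only.
    command = pocketsphinx_command("/tmp/kb", "3026")
    print(" ".join(shlex.quote(part) for part in command))
```

 The returned list can be passed directly to subprocess.call once pocketsphinx is installed.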

4. Run GPSR

 $ roscore
 In a new terminal:
 $ cd catkin_ws/src/pi_speech_tutorial_master/launch
 $ roslaunch talkback_gpsr.launch
 In a new terminal:
 $ pocketsphinx_continuous -inmic yes -dict /home/../Speech_test.dic -lm /home/../Speech_test.lm
 In a new terminal:
 $ rostopic list                 (list the active topics)
 $ rostopic echo /talkback       (view the messages on the topic)

 The socket is used to create the topic, because the speech recognizer has no topic of its own on which to publish.
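
 The post does not show how the socket is wired up, so the following is only a sketch of the general idea, under stated assumptions: the recognized phrase is sent as a UDP datagram on the local machine, and a receiver picks it up. The use of UDP, the loopback address and the function names are all assumptions for illustration. In a real ROS node the received string would then be published on a topic such as /talkback.

```python
import socket


def open_receiver(host="127.0.0.1", port=0):
    """Open a UDP socket that waits for recognized phrases.

    Port 0 lets the operating system pick a free port.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    sock.settimeout(2.0)  # do not block forever if nothing arrives
    return sock


def send_phrase(phrase, address):
    """Send one recognized phrase to the receiver at `address`."""
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sender.sendto(phrase.encode("utf-8"), address)
    finally:
        sender.close()


def receive_phrase(sock, bufsize=1024):
    """Wait for one phrase and return it as text."""
    data, _ = sock.recvfrom(bufsize)
    # Assumption: a ROS node would publish this string on its topic here.
    return data.decode("utf-8").strip()


if __name__ == "__main__":
    receiver = open_receiver()
    send_phrase("turn around", receiver.getsockname())
    print(receive_phrase(receiver))
    receiver.close()
```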