Notes on Using Sphinx4

znr1995

Article link: https://blog.youkuaiyun.com/znr1995/article/details/70805880

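This post records my experience with Sphinx4: how to run the demo, how the grammar file works, and what the configuration file does. By editing the grammar and configuration files you can customize what is recognized and improve recognition efficiency. Sphinx4's flexible configuration, together with the way the jar points to its library files, makes speech recognition applications more convenient to build.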

Download the sphinx4-bin package, unpack it, and open the bin folder. You will see a set of .jar files; these are the demos that ship with the distribution.

Next, run a demo. Open a command line (cmd on Windows), change to the bin folder that contains HelloWorld.jar, and enter the following command (a Java runtime must be installed):

java -jar HelloWorld.jar

If nothing goes wrong, the program starts and prints the following on the command line:

Start speaking. Press Ctrl-C to quit.

Speak, and the corresponding recognition result is printed.

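Because the grammar restricts recognition, only the phrases the program lists at startup can be recognized. I explain this below (the underlying vocabulary is actually quite large).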

Going Further

With the basics of sphinx4 covered, let's look inside HelloWorld.jar.

Unpack the jar with an archive tool (I used HaoZip). It contains two folders:

edu\…\HelloWorld: contains the grammar file, the code that runs the demo, and the configuration file

hello.gram

Its contents are as follows:

#JSGF V1.0;

/**
 * JSGF Grammar for Hello World example
 */

grammar hello;

public <greet> = (Good morning| Hello) ( Bhiksha | Evandro | Paul | Philip | Rita | Will );

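Look familiar? This grammar determines what the demo can recognize. If you change Good morning to Good bye, repackage the jar, and run it again, the new phrase is recognized as well. (Try it. Apart from names, your vocabulary is unlikely to exceed the dictionary's. After editing, compress the files again with HaoZip and change the extension to .jar.)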

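In essence, the demo recognizes only speech that matches the grammar defined in this file. As long as the words used here exist in sphinx4's dictionary, they are recognized automatically. The grammar is only a constraint, and narrowing the search this way in effect raises the recognition rate.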
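
To make the constraint concrete, the sketch below enumerates every sentence the <greet> rule allows. It is plain Java written for illustration, not the sphinx4 grammar parser; the class name GreetGrammarSketch and the method expand are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: mirrors the shape of the <greet> rule, (A | B) (C | D | ...).
// This does not use or imitate the sphinx4 API.
final class GreetGrammarSketch {

    /** Pairs every greeting with every name, as the two alternative groups in the rule do. */
    static List<String> expand(List<String> greetings, List<String> names) {
        List<String> sentences = new ArrayList<>();
        for (String greeting : greetings) {
            for (String name : names) {
                sentences.add(greeting + " " + name);
            }
        }
        return sentences;
    }

    public static void main(String[] args) {
        List<String> greetings = Arrays.asList("Good morning", "Hello");
        List<String> names =
            Arrays.asList("Bhiksha", "Evandro", "Paul", "Philip", "Rita", "Will");

        // Two greetings times six names: the whole search space of the demo.
        for (String sentence : expand(greetings, names)) {
            System.out.println(sentence);
        }
    }
}
```

With two greetings and six names the rule admits only twelve sentences, which is why the recognizer has so little to choose between. Replacing "Good morning" with "Good bye" in the list changes the accepted set in the same way that editing hello.gram does.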

HelloWorld.class

This file has to be opened with a decompiler (JD-GUI). It contains the code that implements the demo.

package edu.cmu.sphinx.demo.helloworld;

import edu.cmu.sphinx.frontend.util.Microphone;
import edu.cmu.sphinx.recognizer.Recognizer;
import edu.cmu.sphinx.result.Result;
import edu.cmu.sphinx.util.props.ConfigurationManager;
import java.io.PrintStream;

public class HelloWorld
{
  public static void main(String[] args)
  {
    // Load the configuration from the path given on the command line,
    // or fall back to the helloworld.config.xml packaged next to this class.
    ConfigurationManager cm;
    if (args.length > 0) {
      cm = new ConfigurationManager(args[0]);
    } else {
      cm = new ConfigurationManager(HelloWorld.class.getResource("helloworld.config.xml"));
    }

    // Look up the recognizer defined in the configuration and allocate its resources.
    Recognizer recognizer = (Recognizer)cm.lookup("recognizer");
    recognizer.allocate();

    // Start capturing audio; exit if the microphone cannot be started.
    Microphone microphone = (Microphone)cm.lookup("microphone");
    if (!microphone.startRecording())
    {
      System.out.println("Cannot start microphone.");
      recognizer.deallocate();
      System.exit(1);
    }
    System.out.println("Say: (Good morning | Hello) ( Bhiksha | Evandro | Paul | Philip | Rita | Will )");

    // Recognize utterances until the user presses Ctrl-C.
    for (;;)
    {
      System.out.println("Start speaking. Press Ctrl-C to quit.\n");

      Result result = recognizer.recognize();
      if (result != null)
      {
        String resultText = result.getBestFinalResultNoFiller();
        System.out.println("You said: " + resultText + '\n');
      }
      else
      {
        System.out.println("I can't hear what you said.\n");
      }
    }
  }
}

helloworld.config.xml

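This XML file is the sphinx4 configuration file. Recognition parameters, such as the sample rate or which algorithms process the audio, can be changed here without touching the code. That is good for extensibility and readability, and worth learning from. I have not studied the file in detail yet.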
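
As a rough illustration of settings living in XML rather than in code, the sketch below reads one value from a small configuration string using only the JDK's DOM parser. The fragment is hypothetical and is not copied from helloworld.config.xml; the component/property element layout and the sampleRate name are assumptions, and the class ConfigPropertySketch is invented for this example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Illustrative only: uses the JDK XML parser, not sphinx4's ConfigurationManager.
final class ConfigPropertySketch {

    // Hypothetical fragment; the element and property names are assumptions.
    static final String SAMPLE_CONFIG =
          "<config>"
        + "<component name=\"microphone\" type=\"edu.cmu.sphinx.frontend.util.Microphone\">"
        + "<property name=\"sampleRate\" value=\"16000\"/>"
        + "</component>"
        + "</config>";

    /** Returns the value of a property of a named component, or null if it is absent. */
    static String readProperty(String xml, String component, String property) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList components = doc.getElementsByTagName("component");
            for (int i = 0; i < components.getLength(); i++) {
                Element c = (Element) components.item(i);
                if (!component.equals(c.getAttribute("name"))) {
                    continue;
                }
                NodeList props = c.getElementsByTagName("property");
                for (int j = 0; j < props.getLength(); j++) {
                    Element p = (Element) props.item(j);
                    if (property.equals(p.getAttribute("name"))) {
                        return p.getAttribute("value");
                    }
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse configuration", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readProperty(SAMPLE_CONFIG, "microphone", "sampleRate"));
    }
}
```

Changing the value in the XML string changes what the program reads without recompiling anything, which is the property of the real configuration file that the paragraph above praises.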

META-INF: indicates where the library files are

Manifest-Version: 1.0
Ant-Version: Apache Ant 1.8.0
Created-By: 1.6.0_20-b02 (Sun Microsystems Inc.)
Main-Class: edu.cmu.sphinx.demo.helloworld.HelloWorld
Class-Path: ../lib/sphinx4.jar ../lib/WSJ_8gau_13dCep_16k_40mel_130Hz_
 6800Hz.jar

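This is the content of the manifest file in that folder. It is easy to see that it names the main class and gives the locations of the library jars relative to HelloWorld.jar.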

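If you move HelloWorld.jar to another location and run it again, it fails with a long list of class-not-found errors, because the relative Class-Path entries no longer point at the lib folder.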
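
The effect of the relative paths can be sketched with the JDK's own java.util.jar.Manifest class. The example parses a shortened copy of the manifest above and resolves each Class-Path entry against the folder that holds the jar. The class name ClassPathSketch and the directories /opt/sphinx4/bin and /tmp are hypothetical, chosen only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

// Illustrative only: shows how relative Class-Path entries depend on the jar's location.
final class ClassPathSketch {

    // Shortened copy of the manifest; a continuation line starts with a single space.
    static final String MANIFEST_TEXT =
          "Manifest-Version: 1.0\n"
        + "Main-Class: edu.cmu.sphinx.demo.helloworld.HelloWorld\n"
        + "Class-Path: ../lib/sphinx4.jar ../lib/WSJ_8gau_13dCep_16k_40mel_130Hz_\n"
        + " 6800Hz.jar\n";

    /** Resolves every Class-Path entry against the directory that contains the jar. */
    static List<Path> resolve(String manifestText, Path jarDir) {
        try {
            Manifest manifest = new Manifest(
                new ByteArrayInputStream(manifestText.getBytes(StandardCharsets.UTF_8)));
            String classPath = manifest.getMainAttributes().getValue(Attributes.Name.CLASS_PATH);
            List<Path> resolved = new ArrayList<>();
            for (String entry : classPath.trim().split("\\s+")) {
                resolved.add(jarDir.resolve(entry).normalize());
            }
            return resolved;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Jar left in its original bin folder: entries land in the sibling lib folder.
        System.out.println(resolve(MANIFEST_TEXT, Paths.get("/opt/sphinx4/bin")));
        // Jar moved elsewhere: the same entries now point somewhere unrelated.
        System.out.println(resolve(MANIFEST_TEXT, Paths.get("/tmp")));
    }
}
```

Keeping the bin and lib folders side by side, or moving them together, leaves the relative entries valid.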
