java - Unable to load en-us-semi model in sphinx4

Question

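I was recently tasked with rewriting a C server in Java, which means migrating its speech recognition functionality from the Pocketsphinx C API to the Sphinx4 Java API. I am using the same dictionary and language model files that were used with Pocketsphinx, plus the default en-us-semi acoustic model that CMU Sphinx provides on its website. Note: with Pocketsphinx I did not have to specify an acoustic model, so I chose the en-us-semi model thinking it would fit my needs. In doing so, I get an error when initializing a StreamSpeechRecognizer as a Spring bean using the following code: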

@Bean
@Autowired
public StreamSpeechRecognizer streamSpeechRecognizer(SphinxProperties sphinxProperties)
        throws IOException {
    edu.cmu.sphinx.api.Configuration sphinxConfiguration = new edu.cmu.sphinx.api.Configuration();
    sphinxConfiguration.setAcousticModelPath("resource:/" + sphinxProperties.getAcousticModelPath());
    sphinxConfiguration.setDictionaryPath("resource:/" + sphinxProperties.getDictionaryPath());
    sphinxConfiguration.setLanguageModelPath("resource:/" + sphinxProperties.getLanguageModelPath());

    return new StreamSpeechRecognizer(sphinxConfiguration);
}

The error I get is as follows:

Caused by: java.lang.AssertionError
at edu.cmu.sphinx.linguist.acoustic.tiedstate.Sphinx3Loader.createSenonePool(Sphinx3Loader.java:484)
at edu.cmu.sphinx.linguist.acoustic.tiedstate.Sphinx3Loader.loadModelFiles(Sphinx3Loader.java:386)
at edu.cmu.sphinx.linguist.acoustic.tiedstate.Sphinx3Loader.load(Sphinx3Loader.java:315)
at edu.cmu.sphinx.frontend.AutoCepstrum.newProperties(AutoCepstrum.java:118)
at edu.cmu.sphinx.util.props.PropertySheet.getOwner(PropertySheet.java:508)

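It is thrown by the constructor of StreamSpeechRecognizer.

The failing assertion is assert numVariances == numSenones * numGaussiansPerSenone;

Also, it may be useful to know that the dictionary file I am using contains ordinary English words (e.g. potato) as well as names of Internet services (e.g. Hotmail, Facebook, Twitter, etc.).

Any help would be greatly appreciated. Thank you very much.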

score 1 · Accepted Answer

You need to use the latest version, sphinx4-5prealpha, as described at http://cmusphinx.sourceforge.net/wiki/tutorialsphinx4

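It works with the default en-us generic ptm 5.2 model, which is the most accurate model available. You need to use the default sphinx4 model, not en-us-semi. The latest pocketsphinx uses the same model.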
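As a rough sketch of what switching to the default model could look like: the three paths below are the ones I recall from the sphinx4 tutorial for the model bundled in the sphinx4-data jar, so treat them as assumptions and check them against your version. The class itself is a hypothetical helper, not part of sphinx4. It uses only the JDK and checks that each `resource:` path actually resolves on the classpath before you pass it to `setAcousticModelPath`, `setDictionaryPath` and `setLanguageModelPath`.

```java
import java.net.URL;

// Hypothetical helper (not part of sphinx4): checks that "resource:" paths
// resolve on the classpath before they are handed to the Configuration setters.
class DefaultModelCheck {
    // Assumed locations of the bundled en-us model inside the sphinx4-data jar.
    static final String ACOUSTIC_MODEL = "resource:/edu/cmu/sphinx/models/en-us/en-us";
    static final String DICTIONARY = "resource:/edu/cmu/sphinx/models/en-us/cmudict-en-us.dict";
    static final String LANGUAGE_MODEL = "resource:/edu/cmu/sphinx/models/en-us/en-us.lm.bin";

    // Strips the "resource:" scheme so the rest can be used as a classpath path.
    static String toClasspathPath(String sphinxPath) {
        String prefix = "resource:";
        return sphinxPath.startsWith(prefix) ? sphinxPath.substring(prefix.length()) : sphinxPath;
    }

    // True if the path can be found on the classpath of this class.
    static boolean isOnClasspath(String sphinxPath) {
        URL url = DefaultModelCheck.class.getResource(toClasspathPath(sphinxPath));
        return url != null;
    }

    public static void main(String[] args) {
        // The acoustic model is a directory, so probe a file inside it
        // (mdef is assumed to be present, as in other Sphinx acoustic models).
        System.out.println("acoustic model found: " + isOnClasspath(ACOUSTIC_MODEL + "/mdef"));
        System.out.println("dictionary found:     " + isOnClasspath(DICTIONARY));
        System.out.println("language model found: " + isOnClasspath(LANGUAGE_MODEL));
    }
}
```

If all three print true, the same strings can be passed to the setters in the bean from the question in place of the values read from SphinxProperties. Whether a custom dictionary keeps working with the default acoustic model depends on it using the same phone set, which I have not verified here.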

sphinx4 does not support en-us-semi.
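For what it's worth, here is one plausible reading of why the assertion from the question fails. The check `numVariances == numSenones * numGaussiansPerSenone` describes a continuous layout in which every senone owns its own Gaussians. A semi-continuous model shares one Gaussian codebook across all senones, so the variance count does not grow with the number of senones. The sketch below only reproduces the arithmetic of that check; the class name and all the numbers are hypothetical, not read from the real model files, and feature streams are ignored for simplicity.

```java
// Hypothetical illustration of the invariant behind the failing assertion.
// None of the numbers below are taken from an actual acoustic model.
class SenonePoolCheck {
    // Same condition as the assertion quoted in the question.
    static boolean matchesContinuousLayout(int numVariances, int numSenones,
                                           int numGaussiansPerSenone) {
        return numVariances == numSenones * numGaussiansPerSenone;
    }

    public static void main(String[] args) {
        int numSenones = 5000; // made-up senone count

        // Continuous layout: every senone has its own 8 Gaussians.
        int continuousVariances = numSenones * 8;
        System.out.println("continuous layout passes: "
                + matchesContinuousLayout(continuousVariances, numSenones, 8));

        // Semi-continuous layout: a single shared codebook of 256 Gaussians,
        // so there are only 256 variance vectors regardless of senone count.
        int sharedCodebookVariances = 256;
        System.out.println("semi-continuous layout passes: "
                + matchesContinuousLayout(sharedCodebookVariances, numSenones, 256));
    }
}
```

The second call fails the check for any model with more than one senone, which is consistent with the loader rejecting en-us-semi rather than with a problem in the dictionary or language model.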
