android - CMUSphinx PocketSphinx - Recognize all (or large amount) of words

Question

Before I tried PocketSphinx for Android, I used Google's voice recognition API. I didn't need to set a search name or a dictionary file. It just recognized every word that was spoken.

Now, in PocketSphinx, I need to set these. But I can only find how to set up recognition for a single word, or how to set a dictionary (the ones available in the demo project contain only a few words). The recognizer then assumes these are the only words that exist, which means that if someone says something similar, the recognizer thinks it is one of the words listed in the dictionary.

I just want to ask: how can I set several search names, or how can I set it to recognize all available words (or at least a large number of them)? Maybe someone has a dictionary file with a large number of words?

score 17 · Accepted Answer

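> Before I tried PocketSphinx for Android, I used Google's voice recognition API. I didn't need to set a search name or a dictionary file. It just recognized every word that was spoken.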

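The Google API also recognizes a large but still limited set of words. For a long time it failed to recognize "Spotify". Google's offline speech recognizer uses about 50k words, as described in their publication.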

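> I just want to ask: how can I set several search names, or how can I set it to recognize all available words (or at least a large number of them)? Maybe someone has a dictionary file with a large number of words?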

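The demo includes large-vocabulary speech recognition with a language model (the forecast part). Larger English language models are available for download, for example the En-US generic language model.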

Simple code to run the recognition looks like this:

    // defaultSetup() is a static import from edu.cmu.pocketsphinx.SpeechRecognizerSetup.
    recognizer = defaultSetup()
        .setAcousticModel(new File(assetsDir, "en-us-ptm"))
        .setDictionary(new File(assetsDir, "cmudict-en-us.dict"))
        .getRecognizer();
    recognizer.addListener(this);

    // Create a large-vocabulary search backed by an n-gram language model.
    recognizer.addNgramSearch(NGRAM_SEARCH, new File(assetsDir, "en-us.lm.bin"));

    // Start listening with that search.
    recognizer.startListening(NGRAM_SEARCH);

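However, these larger models do not easily fit on a device or decode in real time. If you want to decode speech in real time with a large vocabulary, you need to stream the audio to a server. Otherwise, you need to restrict the vocabulary and language to a small subset of generic English. You can learn more about speech recognition in CMUSphinx in the tutorial.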
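As a rough illustration of restricting the vocabulary, the sketch below filters a pronunciation dictionary down to a small word list. It assumes the plain-text dictionary layout used by files such as cmudict-en-us.dict (one entry per line, the word followed by its phones, alternate pronunciations written as `word(2)`). The class `DictionarySubset` and its `filter` method are hypothetical helpers written for this example, not part of PocketSphinx, and the sample entries are only illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper (not a PocketSphinx class): keeps only the dictionary
// entries whose word belongs to a small target vocabulary.
final class DictionarySubset {

    // Assumed line format: "word PH1 PH2 ..." with alternate pronunciations
    // written as "word(2) PH1 PH2 ...".
    static List<String> filter(List<String> dictLines, Set<String> vocabulary) {
        // Normalize the vocabulary to lower case for case-insensitive lookup.
        Set<String> wanted = new HashSet<>();
        for (String w : vocabulary) {
            wanted.add(w.toLowerCase(Locale.ROOT));
        }

        List<String> kept = new ArrayList<>();
        for (String line : dictLines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            // The first whitespace-separated token is the word itself.
            String head = trimmed.split("\\s+", 2)[0];
            // Strip an alternate-pronunciation suffix such as "(2)".
            int paren = head.indexOf('(');
            String word = paren > 0 ? head.substring(0, paren) : head;
            if (wanted.contains(word.toLowerCase(Locale.ROOT))) {
                kept.add(trimmed);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Sample entries in the assumed dictionary format.
        List<String> dict = Arrays.asList(
            "hello HH AH L OW",
            "hello(2) HH EH L OW",
            "weather W EH DH ER",
            "world W ER L D");
        Set<String> vocabulary = new HashSet<>(Arrays.asList("hello", "world"));
        for (String entry : filter(dict, vocabulary)) {
            System.out.println(entry);
        }
    }
}
```

The filtered lines could be written to a file and passed to `setDictionary(...)` in place of the full dictionary. A smaller dictionary alone does not restrict the language; the language model or grammar used for the search would presumably need to be limited to the same words.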
