api - Keyword spotting in speech

Question

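Does anyone know of a freely available keyword spotting (KWS) system, possibly one that provides an API?

CMU Sphinx 4 and MS Speech API are speech recognition engines and cannot be used for KWS.

SRI has a keyword spotting system, but there is no download link, not even for evaluation. (I could not even find a link anywhere to contact them about getting their software.)

I found one here, but it is a demo and limited.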

score 4 · Accepted Answer

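CMUSphinx implements keyword spotting in the pocketsphinx engine; see the FAQ entry for details.

To recognize a single keyphrase, you can run the decoder in "keyphrase search" mode.

From the command line, try: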

pocketsphinx_continuous -infile file.wav -keyphrase "oh mighty computer" -kws_threshold 1e-20

From code:

 ps_set_keyphrase(ps, "keyphrase_search", "oh mighty computer");
 ps_set_search(ps, "keyphrase_search");
 ps_start_utt(ps);
 /* process data */

You can also find examples for Python and Android/Java in our sources. The Python code looks like this excerpt from the full example:

# Process audio chunk by chunk. On keyphrase detected perform action and restart search
# `config` and `stream` are assumed to be set up beforehand (decoder configuration and an open audio stream)
decoder = Decoder(config)
decoder.start_utt()
while True:
    buf = stream.read(1024)
    if buf:
        decoder.process_raw(buf, False, False)
    else:
        break
    if decoder.hyp() is not None:
        print([(seg.word, seg.prob, seg.start_frame, seg.end_frame) for seg in decoder.seg()])
        print("Detected keyphrase, restarting search")
        decoder.end_utt()
        decoder.start_utt()

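The threshold must be tuned for every keyphrase on test data to get the right balance between missed detections and false alarms. You can try values from 1e-5 to 1e-50.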
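The tuning loop can be sketched in Python as below. This is only an illustration: `sweep_thresholds`, `pick_threshold` and the `detect(path, threshold)` callable are hypothetical names, not part of pocketsphinx. `detect` is assumed to run the keyphrase search on one recording and return True if the keyphrase was reported. Recordings that contain the keyphrase go in `positives`, recordings that do not go in `negatives`.

```python
def sweep_thresholds(detect, positives, negatives, thresholds):
    """Count misses and false alarms for each candidate threshold.

    detect(path, threshold) -> bool is a hypothetical callable supplied by
    the caller; it is not a pocketsphinx function.
    """
    results = []
    for threshold in thresholds:
        # A miss: the keyphrase is in the recording but was not reported.
        misses = sum(1 for path in positives if not detect(path, threshold))
        # A false alarm: the keyphrase was reported but is not in the recording.
        false_alarms = sum(1 for path in negatives if detect(path, threshold))
        results.append((threshold, misses, false_alarms))
    return results


def pick_threshold(results):
    """Return the threshold with the fewest total errors (first one wins ties)."""
    return min(results, key=lambda row: row[1] + row[2])[0]


# Candidate values spanning the suggested range 1e-5 ... 1e-50.
CANDIDATES = [float(f"1e-{exp}") for exp in range(5, 55, 5)]
```

Summing misses and false alarms treats both errors as equally bad; if one matters more in your application, weight the two counts differently in `pick_threshold`.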

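For the best accuracy, it is better to use a keyphrase of 3-4 syllables. Phrases that are too short are easily confused.

You can also search for multiple keyphrases. Create a file keyphrase.list like this: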

  oh mighty computer /1e-40/
  hello world /1e-30/
  other_phrase /other_phrase_threshold/

and use it in the decoder with the -kws configuration option:

  pocketsphinx_continuous -inmic yes -kws keyphrase.list

This feature is not yet implemented in the sphinx4 decoder.
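If the keyphrases and their thresholds live in your own code, the keyphrase.list file can be generated instead of edited by hand. A minimal Python sketch, assuming the one-phrase-per-line `phrase /threshold/` layout shown above; the helper names `format_keyphrase_list` and `write_keyphrase_list` are made up for illustration and are not part of pocketsphinx.

```python
def format_keyphrase_list(phrases):
    """Render a {phrase: threshold} mapping in the keyphrase list layout."""
    lines = []
    for phrase, threshold in phrases.items():
        if threshold <= 0:
            raise ValueError(f"threshold for {phrase!r} must be positive")
        # The 'g' format prints small floats compactly, e.g. 1e-40.
        lines.append(f"{phrase} /{threshold:g}/")
    return "\n".join(lines) + "\n"


def write_keyphrase_list(path, phrases):
    """Write the rendered list to a file that can be passed to -kws."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(format_keyphrase_list(phrases))
```

For example, `write_keyphrase_list("keyphrase.list", {"oh mighty computer": 1e-40, "hello world": 1e-30})` produces the first two lines of the sample file.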
