
I downloaded CMU SphinxBase (sphinxbase-5prealpha.tar.gz) and PocketSphinx (pocketsphinx-5prealpha.tar.gz), installed all the required packages (sudo apt-get libtool bison python-dev autotools swig), and followed all the steps in the tutorial (http://cmusphinx.sourceforge.net/wiki/tutorialpocketsphinx).

On my RPi, I ran > pocketsphinx_continuous -inmic yes. I have a USB Logitech webcam as the microphone, and it works well with Google API V2.

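I said every English word I know to pocketsphinx_continuous, and it gave me the output below. I was hoping it would recognize at least something so that I could start improving it, but there is zero recognition and I don't know how to improve it.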

READY....
Listening...
INFO: cmn_prior.c(131): cmn_prior_update: from < 40.00  3.00 -1.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00 >
INFO: cmn_prior.c(149): cmn_prior_update: to   < 34.68 -4.34  8.66 -9.45 -0.21 -2.80  2.86  1.73  6.98  5.36  4.14  0.69  1.67 >
INFO: ngram_search_fwdtree.c(1553):      961 words recognized (7/fr)
INFO: ngram_search_fwdtree.c(1555):   497161 senones evaluated (3551/fr)
INFO: ngram_search_fwdtree.c(1559):  1453632 channels searched (10383/fr), 98192 1st, 13846 last
INFO: ngram_search_fwdtree.c(1562):     2097 words for which last channels evaluated (14/fr)
INFO: ngram_search_fwdtree.c(1564):    40961 candidate words for entering last phone (292/fr)
INFO: ngram_search_fwdtree.c(1567): fwdtree 11.18 CPU 7.986 xRT
INFO: ngram_search_fwdtree.c(1570): fwdtree 24.17 wall 17.265 xRT
INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 6 words
INFO: ngram_search_fwdflat.c(948):      696 words recognized (5/fr)
INFO: ngram_search_fwdflat.c(950):     8170 senones evaluated (58/fr)
INFO: ngram_search_fwdflat.c(952):     4239 channels searched (30/fr)
INFO: ngram_search_fwdflat.c(954):      940 words searched (6/fr)
INFO: ngram_search_fwdflat.c(957):      276 word transitions (1/fr)
INFO: ngram_search_fwdflat.c(960): fwdflat 0.86 CPU 0.614 xRT
INFO: ngram_search_fwdflat.c(963): fwdflat 1.77 wall 1.265 xRT
INFO: ngram_search.c(1253): lattice start node <s>.0 end node </s>.47
INFO: ngram_search.c(1279): Eliminated 2 nodes before end node
INFO: ngram_search.c(1384): Lattice has 243 nodes, 194 links
INFO: ps_lattice.c(1380): Bestpath score: -1185
INFO: ps_lattice.c(1384): Normalizer P(O) = alpha(</s>:47:138) = -75028
INFO: ps_lattice.c(1441): Joint P(O,S) = -97858 P(S|O) = -22830
INFO: ngram_search.c(875): bestpath 0.01 CPU 0.007 xRT
INFO: ngram_search.c(878): bestpath 0.02 wall 0.015 xRT
READY....
Listening...
Input overrun, read calls are too rare (non-fatal)
INFO: ngram_search.c(467): Resized score stack to 200000 entries
INFO: ngram_search_fwdtree.c(952): cand_sf[] increased to 64 entries
INFO: ngram_search.c(459): Resized backpointer table to 10000 entries
INFO: ngram_search.c(467): Resized score stack to 400000 entries
Input overrun, read calls are too rare (non-fatal)
INFO: ngram_search.c(459): Resized backpointer table to 20000 entries
Input overrun, read calls are too rare (non-fatal)
Input overrun, read calls are too rare (non-fatal)

1 Answer


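Large-vocabulary speech recognition is not practical on a Raspberry Pi; it is too slow. You can see in your log that it runs about 17 times slower than real time (fwdtree 24.17 wall 17.265 xRT).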
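To make the slowdown easy to spot, here is a minimal sketch that pulls the xRT figures (processing time divided by audio duration) out of the timing lines in a log like the one in the question. The helper name real_time_factors and the regular expression are my own, not part of PocketSphinx, and they assume the timing lines keep the format shown above.

```python
import re

# Matches timing lines such as:
#   INFO: ngram_search_fwdtree.c(1570): fwdtree 24.17 wall 17.265 xRT
# The pattern is an assumption based on the log format in the question.
XRT_LINE = re.compile(r"(\w+) ([\d.]+) (CPU|wall) ([\d.]+) xRT")


def real_time_factors(log_text):
    """Return {(pass_name, clock): xRT} for every timing line in the log."""
    factors = {}
    for match in XRT_LINE.finditer(log_text):
        pass_name, _seconds, clock, xrt = match.groups()
        factors[(pass_name, clock)] = float(xrt)
    return factors


# Timing lines copied from the log in the question.
log = """\
INFO: ngram_search_fwdtree.c(1567): fwdtree 11.18 CPU 7.986 xRT
INFO: ngram_search_fwdtree.c(1570): fwdtree 24.17 wall 17.265 xRT
INFO: ngram_search_fwdflat.c(960): fwdflat 0.86 CPU 0.614 xRT
INFO: ngram_search_fwdflat.c(963): fwdflat 1.77 wall 1.265 xRT
"""

factors = real_time_factors(log)
slowest = max(factors, key=factors.get)
print(slowest, factors[slowest])
```

Any value well above 1.0 means the decoder cannot keep up with the microphone, which is consistent with the "Input overrun" messages later in the log.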

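If you still want to do recognition on the device, you can either stream the audio to a server or configure a small grammar for recognition.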
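As a rough illustration of the small-grammar option, the sketch below builds a tiny JSGF grammar and the command line that would load it. The phrases, the file name commands.gram, and the helper build_jsgf are hypothetical; I am also assuming that your build of pocketsphinx_continuous accepts the -jsgf option and that every word in the grammar is present in the pronunciation dictionary.

```python
def build_jsgf(name, phrases):
    """Build a one-rule JSGF grammar that accepts any one of the phrases."""
    alternatives = " | ".join(phrases)
    return "#JSGF V1.0;\ngrammar {0};\npublic <{0}> = {1};\n".format(
        name, alternatives
    )


# Hypothetical command set; replace it with the phrases you need.
phrases = ["turn on the light", "turn off the light", "stop"]
grammar = build_jsgf("commands", phrases)

# Save `grammar` to this (hypothetical) file before running the decoder.
grammar_path = "commands.gram"

# Assumes the -jsgf option is available in your pocketsphinx_continuous.
command = ["pocketsphinx_continuous", "-inmic", "yes", "-jsgf", grammar_path]

print(grammar)
print(" ".join(command))
```

With only a handful of phrases the search space is far smaller than with the default language model, so the decoder has much less work to do per frame.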

Answered 2015-11-18T20:21:20.720