
I'm looking for an API on iOS that converts spoken input to text, mainly for digits and letters such as 1, 2, 3, 4 and a, b, c, d.

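I have tried OpenEars, as many people suggest, but it seems to support only certain commands, such as "go forward backward left right start stop turn". Can it be used to recognize general words or spoken digits?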

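I also tried the iSpeech API, but when I say a string of digits such as 12345, it returns only the text "one two three four five". It also gives me a single recognition result rather than a list of guesses (as the Google speech recognition API on Android does).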

How can I use these APIs (or other alternatives) to recognize spoken digits or letters?


2 Answers


To learn how to create custom language models and how to dynamically create language models with OpenEars (a language model is your custom set of words), read the OpenEars docs here:

http://www.politepix.com/openears/yourapp

To learn how to use an acoustic model with OpenEars which is oriented towards recognizing spoken digits, read this discussion in the OpenEars forum:

http://www.politepix.com/forums/topic/way-to-see-phonemes-openears-heard

You can also look at the code in the OpenEars sample app, which is heavily commented and shows an example of changing the app's "vocabulary" inline. If you have more questions about implementing OpenEars, I recommend making an account on the OpenEars forums (I'm the OpenEars developer).
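
To make the idea of a custom vocabulary concrete, here is a minimal sketch of a digits-and-letters word list and of mapping a recognized word string back to characters. It is plain Python for illustration only and does not use the OpenEars API; the names `DIGIT_WORDS`, `LETTER_WORDS`, `build_vocabulary` and `hypothesis_to_text` are hypothetical, and it assumes the recognizer returns a space-separated string of words.

```python
import string

# Hypothetical vocabulary: spoken digit words plus the letters A-Z.
# "OH" is included as a common spoken form of zero.
DIGIT_WORDS = {
    "ZERO": "0", "OH": "0", "ONE": "1", "TWO": "2", "THREE": "3",
    "FOUR": "4", "FIVE": "5", "SIX": "6", "SEVEN": "7", "EIGHT": "8",
    "NINE": "9",
}
LETTER_WORDS = {letter: letter.lower() for letter in string.ascii_uppercase}


def build_vocabulary():
    """Return the sorted word list a custom language model could be built from."""
    return sorted(set(DIGIT_WORDS) | set(LETTER_WORDS))


def hypothesis_to_text(hypothesis):
    """Map a recognized word string such as 'ONE TWO A B' to '12ab'.

    Assumes a space-separated hypothesis; unknown words raise ValueError.
    """
    lookup = {**DIGIT_WORDS, **LETTER_WORDS}
    out = []
    for word in hypothesis.upper().split():
        if word not in lookup:
            raise ValueError(f"unexpected word: {word!r}")
        out.append(lookup[word])
    return "".join(out)
```

With a mapping like this, a result such as "one two three four five" can be turned back into the digit string the speaker intended.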

Answered 2011-09-30T19:04:24.983

I used the following JSGF, which is based on the Sphinx unit tests.

#JSGF V1.0;
// The header above and the grammar name below are assumptions added so the
// rules form a complete JSGF file; rename the grammar as needed.
grammar numbers;

<int0> = (ZERO | OH);
<int10> = TEN;
<int100> = HUNDRED;
<int1to9> = ONE | TWO | THREE | FOUR | FIVE | SIX | SEVEN | EIGHT | NINE;
<int0to9> = ( <int0> | <int1to9> );
<int01to09> = <int0> <int1to9>;
<int11to19> = ELEVEN | TWELVE | THIRTEEN | FOURTEEN | FIFTEEN | SIXTEEN | SEVENTEEN | EIGHTEEN | NINETEEN;
<tens> = TWENTY | THIRTY | FORTY | FIFTY | SIXTY | SEVENTY | EIGHTY | NINETY;
<int20to99> = ( <tens> [<int1to9>] );
<int10to99> = ( <int10> | <int11to19> | <int20to99> );
<int1to99> = ( <int1to9> | <int10to99> );
<int0to99> = ( <int0> | <int1to99> );
<int01to99> = ( <int01to09> | <int10to99> );
<int1to9hundreds> = ((A | <int1to9>) <int100>);
<int101to999> = (<int1to9> (<int01to09> | <int10> | <int11to19> | <int20to99> ));
<int100to999> = (<int1to9hundreds> [[AND] <int1to99> ]);
<int1to999> = ( <int1to99> | <int100to999> | <int101to999> );

// Assumed entry point (hypothetical rule name): a recognizer needs a public rule.
public <number> = <int1to999>;
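
Once the recognizer returns a phrase that matches this grammar, it still has to be turned into a number. The following is a rough Python sketch of that step, written by following the `<int1to99>`, `<int100to999>` and `<int101to999>` rules by hand. The function names are hypothetical, and it assumes the phrase arrives as space-separated words.

```python
ONES = {"ONE": 1, "TWO": 2, "THREE": 3, "FOUR": 4, "FIVE": 5,
        "SIX": 6, "SEVEN": 7, "EIGHT": 8, "NINE": 9}
TEENS = {"ELEVEN": 11, "TWELVE": 12, "THIRTEEN": 13, "FOURTEEN": 14,
         "FIFTEEN": 15, "SIXTEEN": 16, "SEVENTEEN": 17, "EIGHTEEN": 18,
         "NINETEEN": 19}
TENS = {"TWENTY": 20, "THIRTY": 30, "FORTY": 40, "FIFTY": 50,
        "SIXTY": 60, "SEVENTY": 70, "EIGHTY": 80, "NINETY": 90}
ZERO = {"ZERO", "OH"}


def parse_1to99(tokens):
    """Value of a token list matching <int1to99>, or None if it does not match."""
    if len(tokens) == 1:
        word = tokens[0]
        if word in ONES:
            return ONES[word]
        if word == "TEN":
            return 10
        if word in TEENS:
            return TEENS[word]
        if word in TENS:
            return TENS[word]
    elif len(tokens) == 2 and tokens[0] in TENS and tokens[1] in ONES:
        return TENS[tokens[0]] + ONES[tokens[1]]
    return None


def parse_1to999(phrase):
    """Value of a phrase matching <int1to999>, or None if it does not match."""
    tokens = phrase.upper().split()

    # <int1to99>
    value = parse_1to99(tokens)
    if value is not None:
        return value

    # <int100to999>: (A | <int1to9>) HUNDRED [[AND] <int1to99>]
    if len(tokens) >= 2 and tokens[1] == "HUNDRED" and (
            tokens[0] == "A" or tokens[0] in ONES):
        hundreds = 100 * ONES.get(tokens[0], 1)  # "A HUNDRED" counts as 100
        rest = tokens[2:]
        if not rest:
            return hundreds
        if rest[0] == "AND":
            rest = rest[1:]
        tail = parse_1to99(rest)
        return None if tail is None else hundreds + tail

    # <int101to999>: <int1to9> followed by "OH FIVE", "TEN", a teen, or 20-99
    if len(tokens) >= 2 and tokens[0] in ONES:
        rest = tokens[1:]
        tail = None
        if len(rest) == 2 and rest[0] in ZERO and rest[1] in ONES:
            tail = ONES[rest[1]]
        else:
            candidate = parse_1to99(rest)
            if candidate is not None and candidate >= 10:
                tail = candidate
        if tail is not None:
            return 100 * ONES[tokens[0]] + tail

    return None
```

For example, `parse_1to999("TWO HUNDRED AND FIFTY SIX")` yields 256 and `parse_1to999("ONE OH FIVE")` yields 105, while a phrase outside the grammar such as "ONE FIVE" yields `None`.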
Answered 2016-01-04T20:21:03.887