c# - How to programmatically train the SpeechRecognitionEngine and convert audio files to text in C# or VB.NET

Question

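Is it possible to train the recognizer programmatically by supplying .wav files instead of speaking into a microphone?

If so, how? Currently I have code that performs recognition on the audio in the file 0.wav and writes the recognized text to the console.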

Imports System.IO
Imports System.Speech.Recognition
Imports System.Speech.AudioFormat

Namespace SampleRecognition
    Class Program
        Shared completed As Boolean

        Public Shared Sub Main(ByVal args As String())
            Using recognizer As New SpeechRecognitionEngine()
                Dim dictation As Grammar = New DictationGrammar()
                dictation.Name = "Dictation Grammar"
                recognizer.LoadGrammar(dictation)
                ' Configure the input to the recognizer.
                recognizer.SetInputToWaveFile("C:\Users\ME\v02\0.wav")

                ' Attach event handlers for the results of recognition.
                AddHandler recognizer.SpeechRecognized, AddressOf recognizer_SpeechRecognized
                AddHandler recognizer.RecognizeCompleted, AddressOf recognizer_RecognizeCompleted

                ' Perform recognition on the entire file.
                Console.WriteLine("Starting asynchronous recognition...")
                completed = False
                recognizer.RecognizeAsync()
                ' Keep the console window open.
                While Not completed
                    Console.ReadLine()
                End While
                Console.WriteLine("Done.")
            End Using

            Console.WriteLine()
            Console.WriteLine("Press any key to exit...")
            Console.ReadKey()
        End Sub

        ' Handle the SpeechRecognized event.
        Private Shared Sub recognizer_SpeechRecognized(ByVal sender As Object, ByVal e As SpeechRecognizedEventArgs)
            If e.Result IsNot Nothing AndAlso e.Result.Text IsNot Nothing Then
                Console.WriteLine("  Recognized text =  {0}", e.Result.Text)
            Else
                Console.WriteLine("  Recognized text not available.")
            End If
        End Sub

        ' Handle the RecognizeCompleted event.
        Private Shared Sub recognizer_RecognizeCompleted(ByVal sender As Object, ByVal e As RecognizeCompletedEventArgs)
            If e.[Error] IsNot Nothing Then
                Console.WriteLine("  Error encountered, {0}: {1}", e.[Error].[GetType]().Name, e.[Error].Message)
            End If
            If e.Cancelled Then
                Console.WriteLine("  Operation cancelled.")
            End If
            If e.InputStreamEnded Then
                Console.WriteLine("  End of stream encountered.")
            End If
            completed = True
        End Sub
    End Class
End Namespace

Edit

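I understand that the training wizard is useful for this.

It is reached by opening Speech Recognition: click the Start button -> Control Panel -> Ease of Access -> Speech Recognition.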

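How can I train speech recognition with custom wav or even mp3 files?

When the training wizard (the Control Panel training UI) is used, the training files are stored in {AppData}\Local\Microsoft\Speech\Files\TrainingAudio.

How can I perform custom training myself instead of using the training wizard?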

The Speech Control Panel creates registry entries for the training audio files under the key HKCU\Software\Microsoft\Speech\RecoProfiles\Tokens\{ProfileGUID}\{00000000-0000-0000-0000-0000000000000000}\Files.

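Do registry entries created by code have to be placed there as well?

The reason for doing this is that I want to customize training with my own wav files and a list of words and phrases, and then transfer everything to other systems.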

score 5 · Accepted Answer

Training SAPI from C# is certainly possible. You can use the SpeechLib wrappers around SAPI to access the training-mode APIs from C#. @Eric Brown described the procedure:

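Create an inproc recognizer and bind the appropriate audio input.
Make sure you are retaining the audio for your recognitions; you will need it later.
Create a grammar containing the text to train.
Set the grammar's state so that it pauses the recognizer when a recognition occurs. (This also helps with training from an audio file.)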

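When a recognition occurs:
Get the recognized text and the retained audio.
Create a stream object using CoCreateInstance(CLSID_SpStream).
Create a training audio file using ISpRecognizer::GetObjectToken and ISpObjectToken::GetStorageFileName, and bind it to the stream (using ISpStream::BindToFile).
Copy the retained audio into the stream object.
QI the stream object for the ISpTranscript interface, and use ISpTranscript::AppendTranscript to add the recognized text to the stream.
Update the grammar for the next utterance, resume the recognizer, and repeat until you run out of training text.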
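As a rough illustration of the first half of that procedure (bind a wav file as input, load a grammar that contains only the training text, then collect the recognized text and the retained audio), here is a minimal VB.NET sketch. It uses the managed System.Speech API rather than the native SAPI interfaces, so it does not register anything as training data; the ISpStream / ISpTranscript steps are not covered and would still need the native API. The file paths and the training phrase are made-up placeholders, and the project is assumed to reference the System.Speech assembly.

```vb
Imports System.IO
Imports System.Speech.Recognition

Module TrainingCaptureSketch
    ' Hypothetical inputs: one recorded utterance and the phrase spoken in it.
    Private Const UtterancePath As String = "C:\Training\utterance01.wav"
    Private Const TrainingPhrase As String = "open the quarterly report"
    ' Hypothetical output file for the audio retained by the recognizer.
    Private Const RetainedAudioPath As String = "C:\Training\utterance01.retained.wav"

    Sub Main()
        Using recognizer As New SpeechRecognitionEngine()
            ' A grammar that contains only the text to train.
            Dim trainingGrammar As New Grammar(New GrammarBuilder(TrainingPhrase))
            trainingGrammar.Name = "Training phrase"
            recognizer.LoadGrammar(trainingGrammar)

            ' Bind the audio input to the recording instead of a microphone.
            recognizer.SetInputToWaveFile(UtterancePath)

            ' Recognize() performs a single synchronous recognition, so no
            ' further audio is processed while this result is being handled.
            Dim result As RecognitionResult = recognizer.Recognize()
            If result Is Nothing Then
                Console.WriteLine("No recognition for {0}", UtterancePath)
                Return
            End If

            Console.WriteLine("Recognized text = {0}", result.Text)

            ' Save the audio that produced this result, if it is available.
            If result.Audio IsNot Nothing Then
                Using output As New FileStream(RetainedAudioPath, FileMode.Create)
                    result.Audio.WriteToWaveStream(output)
                End Using
            End If
        End Using
    End Sub
End Module
```

For a list of phrases, the same pattern could be repeated per utterance file, loading a new grammar for each phrase before recognizing the next recording.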

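Another option is to train SAPI once with the desired output, then use code to retrieve the profiles and transfer them to other systems. The code further down calls GetProfiles, which returns an ISpeechObjectTokens object: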

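The GetProfiles method returns a selection of the available user speech profiles. Profiles are stored in the speech configuration database as a series of tokens, with each token representing one profile. GetProfiles retrieves all available profile tokens. The returned list is an ISpeechObjectTokens object. Additional or more detailed information about the tokens is available through the methods associated with ISpeechObjectTokens. The token search can be refined further with the RequiredAttributes and OptionalAttributes search attributes. Only tokens that match the specified RequiredAttributes are returned. Among the tokens that match RequiredAttributes, the results are listed in the order in which they match OptionalAttributes. If no search attributes are supplied, all tokens are returned. See the Object Tokens and Registry Settings white paper for a list of SAPI 5-defined attributes.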

Public SharedRecognizer As SpSharedRecognizer
Public theRecognizers As ISpeechObjectTokens

Private Sub Command1_Click()
    On Error GoTo EH

    Dim currentProfile As SpObjectToken
    Dim i As Integer
    Dim T As String
    Dim TokenObject As ISpeechObjectToken
    Set currentProfile = SharedRecognizer.Profile

    For i = 0 To theRecognizers.Count - 1
        Set TokenObject = theRecognizers.Item(i)

        If tokenObject.Id <> currentProfile.Id Then
            Set SharedRecognizer.Profile = TokenObject
            T = "New Profile installed: "
            T = T & SharedRecognizer.Profile.GetDescription
            Exit For
        Else
            T = "No new profile has been installed."
        End If
    Next i

    MsgBox T, vbInformation

EH:
    If Err.Number Then ShowErrMsg
End Sub

Private Sub Form_Load()
    On Error GoTo EH

    Const NL = vbNewLine
    Dim i As Long, idPosition As Long
    Dim T As String
    Dim TokenObject As SpObjectToken

    Set SharedRecognizer = CreateObject("SAPI.SpSharedRecognizer")
    Set theRecognizers = SharedRecognizer.GetProfiles

    For i = 0 To theRecognizers.Count - 1
        Set TokenObject = theRecognizers.Item(i)
        T = T & TokenObject.GetDescription & "--" & NL & NL
        idPosition = InStrRev(TokenObject.Id, "\")
        T = T & Mid(TokenObject.Id, idPosition + 1) & NL
    Next i

    MsgBox T, vbInformation

EH:
    If Err.Number Then ShowErrMsg
End Sub

Private Sub ShowErrMsg()

    ' Declare identifiers:
    Dim T As String

    T = "Desc: " & Err.Description & vbNewLine
    T = T & "Err #: " & Err.Number
    MsgBox T, vbExclamation, "Run-Time Error"
    End

End Sub
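Since the question asks about C# or VB.NET and the listing above is classic VB, here is a hedged VB.NET sketch of the same profile enumeration. It assumes the project has a COM reference to the Microsoft Speech Object Library (the SpeechLib interop assembly); the module name is made up for illustration. It only lists the profiles and marks the active one. It does not copy a profile to another machine.

```vb
Imports SpeechLib

Module ListProfilesSketch
    Sub Main()
        Dim recognizer As New SpSharedRecognizer()

        ' The profile the shared recognizer is currently using.
        Dim current As SpObjectToken = recognizer.Profile

        ' With no search attributes, all profile tokens are returned.
        Dim profiles As ISpeechObjectTokens = recognizer.GetProfiles()

        For i As Integer = 0 To profiles.Count - 1
            Dim token As SpObjectToken = profiles.Item(i)

            ' The last segment of the token id is the profile GUID.
            Dim id As String = token.Id
            Dim profileGuid As String = id.Substring(id.LastIndexOf("\"c) + 1)

            Dim marker As String = If(id = current.Id, " (current)", "")
            Console.WriteLine("{0}  {1}{2}", token.GetDescription(), profileGuid, marker)
        Next
    End Sub
End Module
```

The GUID printed for each profile is the same {ProfileGUID} that appears in the registry path mentioned in the question, which is presumably the key that would have to be carried over along with the training audio files.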

score 2 · Accepted Answer

You can generate custom training using the SAPI engine (not the managed API).

Here is a link on how to do it (it is a bit vague, though).
