c# - C#: Transcribing a WAV file to text (speech-to-text) using the System.Speech namespace

Question

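How can I use the .NET speech namespace classes to convert the audio in a WAV file into text that I can display on screen or save to a file?

I am looking for some tutorial samples.

Update

I found a code sample here, but when I tried it, it gave incorrect results. Below is the VB code I adapted from it. (I don't mind the language, as long as it is VB or C#.) My assumption was that if we supply the right grammar, that is, the words we expect to hear in the recording, we should get those words back as text. First, I tried a sample word that does occur in the call. It sometimes printed only that one word and nothing else. Then I tried a word that we definitely do not expect in the recording... Unfortunately it printed that one as well... :(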

Imports System
Imports System.Speech.Recognition

Public Class Form1

    Dim WithEvents sre As SpeechRecognitionEngine

    Private Sub btnLiterate_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles btnLiterate.Click
        If TextBox1.Text.Trim.Length = 0 Then Exit Sub
        sre.SetInputToWaveFile(TextBox1.Text)
        Dim r As RecognitionResult
        r = sre.Recognize()
        If r Is Nothing Then
            TextBox2.Text = "Could not fetch result"
            Return
        End If
        TextBox2.Text = r.Text
    End Sub

    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
        TextBox1.Text = String.Empty
        Dim dr As DialogResult
        dr = OpenFileDialog1.ShowDialog()
        If dr = Windows.Forms.DialogResult.OK Then
            If Not OpenFileDialog1.FileName.Contains("wav") Then
                MessageBox.Show("Incorrect file")
            Else
                TextBox1.Text = OpenFileDialog1.FileName
            End If
        End If
    End Sub

    Public Sub New()

        ' This call is required by the Windows Form Designer.
        InitializeComponent()

        sre = New SpeechRecognitionEngine()

    End Sub

    Private Sub sre_LoadGrammarCompleted(ByVal sender As Object, ByVal e As System.Speech.Recognition.LoadGrammarCompletedEventArgs) Handles sre.LoadGrammarCompleted

    End Sub

    Private Sub sre_SpeechHypothesized(ByVal sender As Object, ByVal e As System.Speech.Recognition.SpeechHypothesizedEventArgs) Handles sre.SpeechHypothesized
        System.Diagnostics.Debug.Print(e.Result.Text)
    End Sub

    Private Sub sre_SpeechRecognitionRejected(ByVal sender As Object, ByVal e As System.Speech.Recognition.SpeechRecognitionRejectedEventArgs) Handles sre.SpeechRecognitionRejected
        System.Diagnostics.Debug.Print("Rejected: " & e.Result.Text)
    End Sub

    Private Sub sre_SpeechRecognized(ByVal sender As Object, ByVal e As System.Speech.Recognition.SpeechRecognizedEventArgs) Handles sre.SpeechRecognized
        System.Diagnostics.Debug.Print(e.Result.Text)
    End Sub

    Private Sub Form1_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load
        Dim words As String() = New String() {"triskaidekaphobia"}
        Dim c As New Choices(words)
        Dim grmb As New GrammarBuilder(c)
        Dim grm As Grammar = New Grammar(grmb)
        sre.LoadGrammar(grm)
    End Sub

End Class

Update (after November 28)

I found a way to load the default grammar. It goes like this:

sre.LoadGrammar(New DictationGrammar)

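There are still problems. The recognition is not accurate and the output is garbage. For a 6-minute file, it yields maybe 5-6 words of text that are totally unrelated to the voice file.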

score 8 · Accepted Answer

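Many of the classes in System.Speech (the System.Speech.Synthesis namespace) are for text-to-speech, which is mostly an accessibility feature.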

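You are looking for speech recognition. The System.Speech.Recognition namespace has been available since .NET 3.0. It uses the Windows desktop speech engine. This might get you started, but I guess there are better engines out there.

Speech recognition is very complex and hard to get right, and there are also some commercial products available.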

score 1 · Accepted Answer

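I realize this is an old question, but there is better information in later questions and answers. For example, see "What is the best option for transcribing speech-to-text in an asp.net web app?"

You can call SetInputToWaveFile() to read from an audio file instead of calling SetInputToDefaultAudioDevice().

The desktop recognition engine in Windows Vista and Windows 7 includes a dictation grammar, as shown in the referenced answer.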
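As a rough illustration of the two points above, here is a minimal C# sketch that feeds a WAV file to the engine with the dictation grammar loaded and collects every recognized phrase until the file ends. It assumes a Windows machine with a desktop speech recognizer installed and a project reference to System.Speech; the class name WavTranscriber and the method name Transcribe are made up for this example. It is not expected to fix accuracy on poor-quality or telephone audio.

```csharp
using System;
using System.Speech.Recognition;
using System.Text;
using System.Threading;

static class WavTranscriber   // hypothetical helper, not part of System.Speech
{
    public static string Transcribe(string wavPath)
    {
        var transcript = new StringBuilder();

        using (var done = new ManualResetEvent(false))
        using (var sre = new SpeechRecognitionEngine())
        {
            // Free-text dictation instead of a fixed word list.
            sre.LoadGrammar(new DictationGrammar());

            // Read audio from the file rather than the default microphone.
            sre.SetInputToWaveFile(wavPath);

            // Each recognized phrase is appended to the transcript.
            sre.SpeechRecognized += (s, e) => transcript.Append(e.Result.Text).Append(' ');

            // Raised when recognition stops, for example at the end of the file.
            sre.RecognizeCompleted += (s, e) => done.Set();

            // Multiple mode keeps recognizing phrase after phrase.
            sre.RecognizeAsync(RecognizeMode.Multiple);
            done.WaitOne();
        }

        return transcript.ToString().Trim();
    }

    static void Main(string[] args)
    {
        // Usage: WavTranscriber.exe path\to\recording.wav
        Console.WriteLine(Transcribe(args[0]));
    }
}
```

A single call to Recognize() returns at most one phrase, which may be part of why the code in the question prints so little for a long file; the asynchronous Multiple mode used here continues until the input stream is exhausted.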

score 0 · Accepted Answer

What you actually need is a natural language toolkit. In Python, I have used NLTK http://www.nltk.org/

In .NET, I just found Antelope https://stackoverflow.com/questions/1762040/natural-language-toolkit-equivalent-in-c

See also the article http://en.wikipedia.org/wiki/Speech_recognition

score 0 · Accepted Answer

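You should use SpeechRecognitionEngine. To use a wave file, call SetInputToWaveFile. I wish I could help you more, but I'm no expert.

Oh, and if your word really is triskaidekaphobia, I don't think even a human speech recognition engine would recognize it...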

score 0 · Accepted Answer

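I have tested your code, but it does not pick up the wave file correctly. In this block

If Not OpenFileDialog1.FileName.Contains("wav") Then
    MessageBox.Show("Incorrect file")
Else
    TextBox1.Text = OpenFileDialog1.FileName
End If

it takes the If branch rather than the Else branch. I also tried using .wav in the string.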
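One possible cause, offered only as a guess, is that String.Contains is case-sensitive, so a file named Recording.WAV fails the check, while any path that merely contains "wav" somewhere passes it. A small C# sketch of a stricter check on the extension follows; the class name, method name and sample paths are invented for illustration.

```csharp
using System;
using System.IO;

static class WavFileCheck   // hypothetical helper for illustration
{
    // Returns true when the path's extension is ".wav", ignoring case.
    public static bool HasWavExtension(string path)
    {
        return string.Equals(Path.GetExtension(path), ".wav",
                             StringComparison.OrdinalIgnoreCase);
    }

    static void Main()
    {
        // Upper-case extension: rejected by Contains("wav"), accepted here.
        Console.WriteLine(HasWavExtension(@"C:\calls\Recording.WAV"));   // True

        // "wav" appears in the folder name only: accepted by Contains("wav"), rejected here.
        Console.WriteLine(HasWavExtension(@"C:\wave-notes\readme.txt")); // False
    }
}
```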

I also need sample code for transcribing a WAV file, rather than microphone input, to text. If you find a good solution, please post it here.
