c# - Ultra Fast Text to Speech (WAV -> MP3) in ASP.NET MVC

Question

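This question is essentially about the suitability of Microsoft's Speech API (SAPI) for server workloads, and whether it can be used reliably inside w3wp for speech synthesis. We have an asynchronous controller that uses the native System.Speech assembly in .NET 4 (not the Microsoft.Speech assembly that ships as part of the Microsoft Speech Platform - Runtime Version 11) and lame.exe to generate MP3s as follows: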

        [CacheFilter]
        public void ListenAsync(string url)
        {
            string fileName = string.Format(@"C:\test\{0}.wav", Guid.NewGuid());

            try
            {
                var t = new System.Threading.Thread(() =>
                {
                    using (SpeechSynthesizer ss = new SpeechSynthesizer())
                    {
                        ss.SetOutputToWaveFile(fileName, new SpeechAudioFormatInfo(22050, AudioBitsPerSample.Eight, AudioChannel.Mono));
                        ss.Speak("Here is a test sentence...");
                        ss.SetOutputToNull();
                        ss.Dispose();
                    }

                    var process = new Process() { EnableRaisingEvents = true };
                    process.StartInfo.FileName = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, @"bin\lame.exe");
                    process.StartInfo.Arguments = string.Format("-V2 {0} {1}", fileName, fileName.Replace(".wav", ".mp3"));
                    process.StartInfo.UseShellExecute = false;
                    process.StartInfo.RedirectStandardOutput = false;
                    process.StartInfo.RedirectStandardError = false;
                    process.Exited += (sender, e) =>
                    {
                        System.IO.File.Delete(fileName);

                        AsyncManager.OutstandingOperations.Decrement();
                    };

                    AsyncManager.OutstandingOperations.Increment();
                    process.Start();
                });

                t.Start();
                t.Join();
            }
            catch { }

            AsyncManager.Parameters["fileName"] = fileName;
        }

        public FileResult ListenCompleted(string fileName)
        {
            return base.File(fileName.Replace(".wav", ".mp3"), "audio/mp3");
        }

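The question is: why does SpeechSynthesizer need to run on a separate thread like this in order to return (this has been reported elsewhere on SO), and would implementing an STAThreadRouteHandler for this request be more efficient or scalable than the approach above?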

Second, what are the options for running SpeakAsync in an ASP.NET (MVC or WebForms) context? None of the options I have tried seem to work (see the update below).

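Any other suggestions for improving this pattern (that is, two dependencies that must execute serially, each of which supports async on its own) are welcome. I don't feel this scheme is sustainable under load, especially given the SpeechSynthesizer behavior described above. I am considering running this service on a different stack altogether.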

Update: neither the Speak nor the SpeakAsync option appears to work under the STAThreadRouteHandler. The former produces:

    System.InvalidOperationException: Asynchronous operations are not allowed in this context. Page starting an asynchronous operation has to have the Async attribute set to true and an asynchronous operation can only be started on a page prior to PreRenderComplete event.
       at System.Web.LegacyAspNetSynchronizationContext.OperationStarted()
       at System.ComponentModel.AsyncOperationManager.CreateOperation(Object userSuppliedState)
       at System.Speech.Internal.Synthesis.VoiceSynthesis..ctor(WeakReference speechSynthesizer)
       at System.Speech.Synthesis.SpeechSynthesizer.get_VoiceSynthesizer()
       at System.Speech.Synthesis.SpeechSynthesizer.SetOutputToWaveFile(String path, SpeechAudioFormatInfo formatInfo)

The latter results in:

    System.InvalidOperationException: The asynchronous action method 'Listen' cannot be executed synchronously.
       at System.Web.Mvc.Async.AsyncActionDescriptor.Execute(ControllerContext controllerContext, IDictionary`2 parameters)

It seems that a custom STA thread pool (with ThreadStatic instances of the COM object) is a better approach: http://marcinbudny.blogspot.ca/2012/04/dealing-with-sta-coms-in-web.html

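Update #2: it seems that System.Speech.Synthesis.SpeechSynthesizer does not need STA treatment and runs fine on MTA threads, as long as you follow the Start/Join pattern. Here is a new version that uses SpeakAsync correctly (the problem was disposing of the synthesizer prematurely!) and splits WAV generation and MP3 generation into two separate requests: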

[CacheFilter]
[ActionName("listen-to-text")]
public void ListenToTextAsync(string text)
{
    AsyncManager.OutstandingOperations.Increment();   

    var t = new Thread(() =>
    {
        SpeechSynthesizer ss = new SpeechSynthesizer();
        string fileName = string.Format(@"C:\test\{0}.wav", Guid.NewGuid());

        ss.SetOutputToWaveFile(fileName, new SpeechAudioFormatInfo(22050,
                                                                   AudioBitsPerSample.Eight,
                                                                   AudioChannel.Mono));
        ss.SpeakCompleted += (sender, e) =>
        {
            ss.SetOutputToNull();
            ss.Dispose();

            AsyncManager.Parameters["fileName"] = fileName;
            AsyncManager.OutstandingOperations.Decrement();
        };

        CustomPromptBuilder pb = new CustomPromptBuilder(settings.DefaultVoiceName);
        pb.AppendParagraphText(text);
        ss.SpeakAsync(pb);               
    });

    t.Start();
    t.Join();                    
}

[CacheFilter]
public ActionResult ListenToTextCompleted(string fileName)
{
    return RedirectToAction("mp3", new { fileName = fileName });
}

[CacheFilter]
[ActionName("mp3")]
public void Mp3Async(string fileName) 
{
    var process = new Process()
    {
        EnableRaisingEvents = true,
        StartInfo = new ProcessStartInfo()
        {
            FileName = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, @"bin\lame.exe"),
            Arguments = string.Format("-V2 {0} {1}", fileName, fileName.Replace(".wav", ".mp3")),
            UseShellExecute = false,
            RedirectStandardOutput = false,
            RedirectStandardError = false
        }
    };

    process.Exited += (sender, e) =>
    {
        System.IO.File.Delete(fileName);
        AsyncManager.Parameters["fileName"] = fileName;
        AsyncManager.OutstandingOperations.Decrement();
    };

    AsyncManager.OutstandingOperations.Increment();
    process.Start();
}

[CacheFilter]
public ActionResult Mp3Completed(string fileName) 
{
    return base.File(fileName.Replace(".wav", ".mp3"), "audio/mp3");
}
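
The stack trace in the first update suggests that the synthesizer registers an operation with whatever SynchronizationContext is current when it is constructed, and a freshly created thread has none, which would explain why the Start/Join pattern gets past the exception. The downside is that Join blocks the request thread for the whole synthesis. A minimal sketch of a helper that runs work on a dedicated thread and reports back through a Task instead of blocking; the names `DedicatedThread` and `Run` are hypothetical and not part of the question's code:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: runs a delegate on its own thread (no ASP.NET
// SynchronizationContext) and exposes the outcome as a Task<T>.
public static class DedicatedThread
{
    public static Task<T> Run<T>(Func<T> work, ApartmentState apartment = ApartmentState.MTA)
    {
        var completion = new TaskCompletionSource<T>();

        var thread = new Thread(() =>
        {
            try
            {
                completion.SetResult(work());
            }
            catch (Exception ex)
            {
                // Surface the failure through the task instead of
                // letting it take down the worker process.
                completion.SetException(ex);
            }
        });

        thread.IsBackground = true;
        thread.SetApartmentState(apartment); // must be set before Start()
        thread.Start();

        return completion.Task;
    }
}
```

In an async action, one would call AsyncManager.OutstandingOperations.Increment() first, then attach a ContinueWith continuation to the returned task that stores the result in AsyncManager.Parameters and calls Decrement() on both the success and the failure path, so a failed synthesis does not leave the request hanging. Passing ApartmentState.STA covers the case where STA treatment turns out to be needed after all. One thread per request is still unlikely to scale well under load; a small fixed pool of such threads is the usual next step.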

score 4 · Accepted Answer

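I/O is very expensive on a server. How many concurrent WAV write streams do you think a server's hard drive can sustain? Why not do everything in memory and write out only the MP3 once it is fully processed? MP3s are much smaller, so the I/O is tied up for a much shorter time. If you like, you can even change the code to return the stream directly to the user instead of saving an MP3 file.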

See: How do I use LAME to encode a wav to an mp3 (C#)
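
As a rough illustration of the in-memory idea, the sketch below synthesizes the WAV into a MemoryStream and pipes it through lame.exe using standard input and output, so no WAV file ever touches the disk. It assumes that the LAME build in use accepts `-` for both the input and the output file (the command-line encoder documents this for stdin/stdout) and that the WAV header produced by SetOutputToWaveStream is acceptable to it. The class and method names and the `lamePath` parameter are hypothetical.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Speech.Synthesis;
using System.Threading.Tasks;

// Hypothetical helper, not taken from the question or the answers.
public static class InMemoryTts
{
    // Renders the text to WAV bytes entirely in memory.
    public static byte[] SynthesizeWav(string text)
    {
        using (var synthesizer = new SpeechSynthesizer())
        using (var wav = new MemoryStream())
        {
            synthesizer.SetOutputToWaveStream(wav);
            synthesizer.Speak(text);
            synthesizer.SetOutputToNull(); // detach from the stream before copying it
            return wav.ToArray();
        }
    }

    // Pipes WAV bytes through lame.exe (stdin -> stdout) and returns MP3 bytes.
    public static byte[] EncodeMp3(byte[] wavBytes, string lamePath)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = lamePath,
            Arguments = "--quiet -V2 - -", // assumption: "-" means stdin/stdout
            UseShellExecute = false,
            CreateNoWindow = true,
            RedirectStandardInput = true,
            RedirectStandardOutput = true
        };

        using (var process = Process.Start(startInfo))
        using (var mp3 = new MemoryStream())
        {
            // Drain stdout on another thread while stdin is being written;
            // otherwise both pipes can fill up and deadlock.
            Stream stdout = process.StandardOutput.BaseStream;
            Task reader = Task.Factory.StartNew(() => stdout.CopyTo(mp3));

            Stream stdin = process.StandardInput.BaseStream;
            stdin.Write(wavBytes, 0, wavBytes.Length);
            stdin.Close(); // signals end of input to the encoder

            reader.Wait();
            process.WaitForExit();

            if (process.ExitCode != 0)
            {
                throw new InvalidOperationException("lame.exe exited with code " + process.ExitCode);
            }

            return mp3.ToArray();
        }
    }
}
```

The resulting bytes can be returned with File(bytes, "audio/mpeg") or written to disk once for caching. SynthesizeWav would still need to run off the request thread, as discussed in the question, and starting one encoder process per request has its own cost, so this only removes the disk I/O, not the process overhead.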

score 0 · Accepted Answer

This question is a bit old now, but this is what I am doing, and it has worked well so far:

    public Task<FileStreamResult> Speak(string text)
    {
        return Task.Factory.StartNew(() =>
        {
            using (var synthesizer = new SpeechSynthesizer())
            {
                var ms = new MemoryStream();
                synthesizer.SetOutputToWaveStream(ms);
                synthesizer.Speak(text);

                ms.Position = 0;
                return new FileStreamResult(ms, "audio/wav");
            }
        });
    }

Might help someone...
