SSML volume attribute has no effect on the output audio


    <prosody volume = "+0dB"> This is a sentence with volume 10 For GOOGLE. </prosody>
    <s><prosody volume = "+6dB"> This is a sentence with volume 6 For GOOGLE. </prosody></s>
    <s><prosody volume = "+24dB"> This is a sentence with volume +24 For GOOGLE. </prosody></s>
    <s><prosody volume = "+48dB"> This is a sentence with volume +48 For GOOGLE.</prosody></s>
    <s><prosody volume = "+196dB"> This is a sentence with volume +196 For GOOGLE.</prosody></s>


    // Plain string literals are enough here; nothing is interpolated.
    string ssml = "<speak><prosody volume = \"+0dB\"> This is a sentence with volume 10 For GOOGLE.</prosody>" +
                  " <s><prosody volume = \"+6dB\"> This is a sentence with volume 6 For GOOGLE.</prosody></s>" +
                  " <s><prosody volume = \"+24dB\"> This is a sentence with volume +24 For GOOGLE.</prosody></s>" +
                  " <s><prosody volume = \"+48dB\"> This is a sentence with volume +48 For GOOGLE.</prosody></s>" +
                  " <s><prosody volume = \"+196dB\"> This is a sentence with volume +196 For GOOGLE.</prosody></s>" +
                  "</speak>";
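Hand-escaping quotes in a concatenated string makes it easy to lose a closing tag. As a minimal sketch, assuming only `System.Xml.Linq` from the base class library, the same kind of SSML can be built as an XML tree so that every element is closed automatically. `SsmlBuilder` and `Build` are hypothetical names introduced for illustration, and unlike the string above this helper wraps every sentence (including the first) in `<s>`.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class SsmlBuilder
{
    // Hypothetical helper: wraps each (volume, text) pair in
    // <s><prosody volume="...">text</prosody></s> under a single <speak> root.
    public static string Build(params (string Volume, string Text)[] sentences)
    {
        var speak = new XElement("speak",
            sentences.Select(s =>
                new XElement("s",
                    new XElement("prosody",
                        new XAttribute("volume", s.Volume),
                        s.Text))));

        // DisableFormatting serializes the tree without adding indentation or newlines.
        return speak.ToString(SaveOptions.DisableFormatting);
    }
}
```

For example, `SsmlBuilder.Build(("+0dB", "First sentence."), ("+6dB", "Second sentence."))` returns a single-line string that can be passed to `Dubb`. This only guarantees well-formed markup; it does not by itself change how the service treats the `volume` attribute.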


    // Requires: using System.IO; using Google.Cloud.TextToSpeech.V1;
    public static void Dubb(string ssml)
    {
        var client = TextToSpeechClient.Create();

        // The input to be synthesized, can be provided as text or SSML.
        var input = new SynthesisInput
        {
            Ssml = ssml
        };

        // Build the voice request.
        var voiceSelection = new VoiceSelectionParams
        {
            LanguageCode = "en-US",
            SsmlGender = SsmlVoiceGender.Female
        };

        // Specify the type of audio file.
        var audioConfig = new AudioConfig
        {
            AudioEncoding = AudioEncoding.Linear16
        };

        // Perform the text-to-speech request.
        var response = client.SynthesizeSpeech(input, voiceSelection, audioConfig);

        // Write the response to the output file.
        using (var output = File.Create("output.wav"))
        {
            response.AudioContent.WriteTo(output);
        }
    }



1 Answer



    <prosody volume = "+0dB"> This is a sentence with volume 10 For GOOGLE. </prosody>
    <s><prosody volume = "+6dB"> This is a sentence with volume 6 For GOOGLE. </prosody></s>
    <s><prosody volume = "+24dB"> This is a sentence with volume +24 For GOOGLE. </prosody></s>
    <s><prosody volume = "+48dB"> This is a sentence with volume +48 For GOOGLE.</prosody></s>
    <s><prosody volume = "+196dB"> This is a sentence with volume +196 For GOOGLE.</prosody></s>

In the TTS UI, this SSML does work as expected.


From there you can export the request as JSON (maybe it will help you).

    {
      "audioConfig": {
        "audioEncoding": "LINEAR16",
        "pitch": 0,
        "speakingRate": 1
      },
      "input": {
        "ssml": "<speak> <prosody volume = \"+0dB\"> This is a sentence with volume 10 For GOOGLE. </prosody> <s><prosody volume = \"+6dB\"> This is a sentence with volume 6 For GOOGLE. </prosody></s> <s><prosody volume = \"+24dB\"> This is a sentence with volume +24 For GOOGLE. </prosody></s> <s><prosody volume = \"+48dB\"> This is a sentence with volume +48 For GOOGLE.</prosody></s> <s><prosody volume = \"+196dB\"> This is a sentence with volume +196 For GOOGLE.</prosody></s> </speak>"
      },
      "voice": {
        "languageCode": "en-US",
        "name": "en-US-Standard-A"
      }
    }
Answered 2021-12-26T21:31:36.707
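The JSON exported in the answer can be mirrored field by field in the C# client. The following is a rough sketch, assuming the `Google.Cloud.TextToSpeech.V1` package used in the question; `RequestFromJson` and `Build` are hypothetical names. One visible difference from the question's code is that the JSON selects a voice by name (`en-US-Standard-A`), whereas the question only sets `SsmlGender`. Whether that difference explains the missing volume change is not established here.

```csharp
using Google.Cloud.TextToSpeech.V1;

public static class RequestFromJson
{
    // Hypothetical helper: builds a request whose fields match the exported JSON.
    public static SynthesizeSpeechRequest Build(string ssml)
    {
        return new SynthesizeSpeechRequest
        {
            Input = new SynthesisInput { Ssml = ssml },
            Voice = new VoiceSelectionParams
            {
                LanguageCode = "en-US",
                Name = "en-US-Standard-A" // "voice.name" in the JSON
            },
            AudioConfig = new AudioConfig
            {
                AudioEncoding = AudioEncoding.Linear16, // "LINEAR16"
                Pitch = 0,                              // "pitch": 0
                SpeakingRate = 1                        // "speakingRate": 1
            }
        };
    }
}
```

The request can then be sent with `TextToSpeechClient.Create().SynthesizeSpeech(RequestFromJson.Build(ssml))`, and the resulting `AudioContent` written to a file as in the question's `Dubb` method.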