java - How do I detect intent with Dialogflow CX via the Java API and get the agent response as both audio and text?

Question

I am trying to build a simple voice bot with Dialogflow CX using the Java API.

These are the dependencies in my Spring Boot 2.4.3 project:

...
        <dependency>
            <groupId>com.google.cloud</groupId>
            <artifactId>spring-cloud-gcp-starter</artifactId>
        </dependency>
        <dependency>
            <groupId>com.google.cloud</groupId>
            <artifactId>google-cloud-dialogflow-cx</artifactId>
            <version>0.6.1</version>
        </dependency>
...

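I used https://github.com/googleapis/java-dialogflow-cx as a starting point, and so far everything seems to work well, except for the most important part.

When I send text or an event to my agent, it detects the intent and returns a response, but there is no audio output. It looks as if text-to-speech is never performed.

This is how I make the request, following the documentation example: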

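QueryInput queryInput = QueryInput
                .newBuilder()
                .setLanguageCode("es-ES")
                // In the CX client, setText takes a TextInput message, not a String
                .setText(TextInput.newBuilder().setText("hola").build())
                .build();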

DetectIntentRequest request = DetectIntentRequest.newBuilder()
                .setSession(sessionName.toString())
                .setQueryInput(queryInput)
                .build();

DetectIntentResponse response = sessionsClient.detectIntent(request);

Response:

{
  "detectIntentResponse": {
    "text": "hola",
    "languageCode": "es",
    "responseMessages": [
      {
        "text": {
          "text": [
            "¡Buenos días!"
          ]
        }
      },
      {
        
      }
    ],
    "currentPage": {
      "name": "projects/test-project/locations/global/agents/9effb8aa-6b62-4fe6-9fd5-2f5e87265ee7/flows/00000000-0000-0000-0000-000000000000/pages/START_PAGE",
      "displayName": "Start Page"
    },
    "intent": {
      "name": "projects/test-project/locations/global/agents/9effb8aa-6b62-4fe6-9fd5-2f5e87265ee7/intents/00000000-0000-0000-0000-000000000000",
      "displayName": "Default Welcome Intent"
    },
    "intentDetectionConfidence": 1.0,
    "diagnosticInfo": {
      "Execution Sequence": [
        {
          "Step 1": {
            "InitialState": {
              "FlowState": {
                "Version": 0.0,
                "PageState": {
                  "Status": "ENTERING_PAGE",
                  "Name": "Start Page"
                },
                "Name": "Default Start Flow"
              },
              "MatchedIntent": {
                "Score": 1.0,
                "Type": "NLU",
                "Active": true,
                "DisplayName": "Default Welcome Intent",
                "Id": "00000000-0000-0000-0000-000000000000"
              }
            },
            "Type": "INITIAL_STATE"
          }
        },
        {
          "Step 2": {
            "Type": "STATE_MACHINE",
            "StateMachine": {
              "FlowState": {
                "Version": 0.0,
                "Name": "Default Start Flow",
                "PageState": {
                  "Name": "Start Page",
                  "Status": "TRANSITION_ROUTING"
                }
              },
              "TriggeredIntent": "Default Welcome Intent"
            }
          }
        },
        {
          "Step 3": {
            "FunctionExecution": {
              "Responses": [
                {
                  "text": {
                    "redactedText": [
                      "¡Buenos días!"
                    ],
                    "text": [
                      "¡Buenos días!"
                    ]
                  },
                  "responseType": "HANDLER_PROMPT",
                  "source": "VIRTUAL_AGENT"
                }
              ]
            },
            "Type": "FUNCTION_EXECUTION"
          }
        },
        {
          "Step 4": {
            "Type": "STATE_MACHINE",
            "StateMachine": {
              "FlowState": {
                "Version": 0.0,
                "PageState": {
                  "Name": "Start Page",
                  "Status": "TRANSITION_ROUTING"
                },
                "Name": "Default Start Flow"
              }
            }
          }
        }
      ],
      "Alternative Matched Intents": [
        {
          "Active": true,
          "Type": "NLU",
          "Id": "00000000-0000-0000-0000-000000000000",
          "DisplayName": "Default Welcome Intent",
          "Score": 1.0
        }
      ],
      "Transition Targets Chain": [
        
      ],
      "Triggered Transition Names": [
        "9db835de-3e94-4a2a-9b8d-4eda03039e5a"
      ]
    },
    "match": {
      "intent": {
        "name": "projects/test-project/locations/global/agents/9effb8aa-6b62-4fe6-9fd5-2f5e87265ee7/intents/00000000-0000-0000-0000-000000000000",
        "displayName": "Default Welcome Intent"
      },
      "resolvedInput": "hola",
      "matchType": "INTENT",
      "confidence": 1.0
    }
  }
}

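Dialogflow ES has an option to enable automatic text-to-speech, so the output audio is included in the DetectIntentResponse, but I cannot see any similar option in CX.

I searched Google several times but could not find anything useful.

So the question is: how do I detect intent with Dialogflow CX via the Java API and get the agent response as both audio and text?

Sample code would be great!

Thanks in advance!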

score 2 · Accepted Answer

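According to the documentation:

"If the client wants to receive an audio response, it should also contain output_audio_config."

Even though I am not using StreamingDetectIntent, an OutputAudioConfig must also be added to the request in order to receive audio in the response.

So the code should look like this: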

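DetectIntentRequest request = DetectIntentRequest.newBuilder()
                .setSession(sessionName.toString())
                .setQueryInput(queryInput)
                // Ask the agent to synthesize the reply as MP3 audio
                .setOutputAudioConfig(
                    OutputAudioConfig.newBuilder()
                        .setAudioEncoding(
                            OutputAudioEncoding.OUTPUT_AUDIO_ENCODING_MP3)
                        .build())
                .build();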

DetectIntentResponse response = sessionsClient.detectIntent(request);

The response will then also contain the outputAudio I was looking for.

{
  "outputAudio": "//NExAAAAANIAAAAALYwEAA......THE AUDIO ...ngAUYYAP/",
  "outputAudioConfig": {
    "audioEncoding": "OUTPUT_AUDIO_ENCODING_MP3"
  }
}
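
To play the audio, the bytes still need to be written out somewhere. With the Java client, response.getOutputAudio() returns a protobuf ByteString, so toByteArray() yields the MP3 bytes directly. If you only have the JSON form shown above, the outputAudio field is base64 text and must be decoded first. Below is a minimal sketch of that second case using only the JDK; the class name OutputAudioSaver, its method names, and the temp-file target are illustrative assumptions, not part of the Dialogflow API.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

// Hypothetical helper (not part of the Dialogflow client library).
class OutputAudioSaver {

    // Decodes the base64 text of the JSON "outputAudio" field into raw audio bytes.
    public static byte[] decodeOutputAudio(String base64Audio) {
        return Base64.getDecoder().decode(base64Audio);
    }

    // Writes the audio bytes to the target file and returns the resulting file size.
    public static long saveAudio(byte[] audio, Path target) throws IOException {
        Files.write(target, audio);
        return Files.size(target);
    }

    public static void main(String[] args) throws IOException {
        // Placeholder payload; a real value would come from the response's outputAudio field.
        String base64Audio = Base64.getEncoder().encodeToString(new byte[] {1, 2, 3, 4});

        Path target = Files.createTempFile("agent-reply", ".mp3");
        long written = saveAudio(decodeOutputAudio(base64Audio), target);
        System.out.println("Wrote " + written + " bytes to " + target);
    }
}
```

When working with the DetectIntentResponse object itself, the decode step can be skipped by passing response.getOutputAudio().toByteArray() straight to saveAudio.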

I hope this is useful to someone.

Thanks!
