actions-on-google - How do I get Actions on Google to stream audio?

Question

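I am writing an app that works with Google Actions. The only frustrating part is that I can't find any information on how to form my response so that Google will stream audio from a given URL. Does Google even support this?

I have already written the same app for Alexa, and on Alexa all you have to do is return an audio item (token, URL, play command) and Alexa starts playing it.

I should mention that I am NOT using API.AI; I am using only the Actions SDK and hosting my web service on Azure, written in C#.

So, bottom line: how do I format a response via the Actions SDK to stream an MP3 file to Google Home?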

score 6 · Accepted Answer

UPDATE: The first answer works only with V1 of Dialogflow. For V2, you can create the mediaResponse this way (from Google's documentation):

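// MediaObject and Image are exported by the actions-on-google v2 client library.
const { MediaObject, Image } = require('actions-on-google');

// Call this inside an intent handler, where conv is the conversation object.
// Note: send a simple response (e.g. conv.ask('...')) before the media response.
conv.ask(new MediaObject({
  name: 'Jazz in Paris',
  url: 'http://storage.googleapis.com/automotive-media/Jazz_In_Paris.mp3',
  description: 'A funky Jazz tune',
  icon: new Image({
    url: 'http://storage.googleapis.com/automotive-media/album_art.jpg',
    alt: 'Media icon',
  }),
}));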
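
Since the question uses the Actions SDK directly, without the Node.js client library, it may help to see roughly what the same response looks like as raw JSON, which a C# service could serialize itself. This is only a sketch: the field names follow the Actions SDK conversation webhook format as I recall it and should be checked against the current reference. `buildMediaResponse` and its parameters are hypothetical names made up for this example.

```javascript
// Sketch only: field names are my recollection of the Actions SDK
// conversation webhook format; verify them against the official reference.
// buildMediaResponse is a hypothetical helper, not part of any library.
function buildMediaResponse(name, contentUrl, description, iconUrl) {
  return {
    expectUserResponse: true,
    expectedInputs: [{
      possibleIntents: [{ intent: 'actions.intent.TEXT' }],
      inputPrompt: {
        richInitialPrompt: {
          items: [
            // As far as I know, a simple response must precede the media response.
            { simpleResponse: { textToSpeech: `Here is ${name}.` } },
            {
              mediaResponse: {
                mediaType: 'AUDIO',
                mediaObjects: [{
                  name,
                  description,
                  contentUrl,
                  icon: { url: iconUrl, accessibilityText: 'Media icon' },
                }],
              },
            },
          ],
          // Suggestion chips appear to be expected when the conversation stays open.
          suggestions: [{ title: 'other songs' }],
        },
      },
    }],
  };
}

const mediaJson = buildMediaResponse(
  'Jazz in Paris',
  'https://storage.googleapis.com/automotive-media/Jazz_In_Paris.mp3',
  'A funky Jazz tune',
  'https://storage.googleapis.com/automotive-media/album_art.jpg'
);
console.log(JSON.stringify(mediaJson, null, 2));
```

The printed object is what the web service would return as the HTTP response body. In C# the same shape could be produced with any JSON serializer.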

===========================================================================

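I posted an answer here.

Basically, you can create a mediaResponse object to play your audio file. I was able to play a 50-minute audio file just fine.

A code example in Node.js could be (following the current documentation):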

const richResponse = app.buildRichResponse()
  .addSimpleResponse("Here's song one.")
  .addMediaResponse(app.buildMediaResponse()
    .addMediaObjects([
      app.buildMediaObject("Song One", "https://....mp3")
        .setDescription("Song One with description and large image.") // Optional
        // Optional. Use app.Media.ImageType.ICON if displaying icon.
        .setImage("https://....jpg", app.Media.ImageType.LARGE)
    ])
  )
  .addSuggestions(["other songs"]);

score 5 · Accepted Answer

According to the documentation, you can embed <audio> elements in SSML. https://developers.google.com/actions/reference/ssml includes the following example:

<speak>
  Here are <say-as interpret-as="characters">SSML</say-as> samples.
  I can pause <break time="3s"/>.
  I can play a sound
  <audio src="https://www.example.com/MY_MP3_FILE.mp3">didn't get your MP3 audio file</audio>.
  I can speak in cardinals. Your number is <say-as interpret-as="cardinal">10</say-as>.
  Or I can speak in ordinals. You are <say-as interpret-as="ordinal">10</say-as> in line.
  Or I can even speak in digits. The digits for ten are <say-as interpret-as="characters">10</say-as>.
  I can also substitute phrases, like the <sub alias="World Wide Web Consortium">W3C</sub>.
  Finally, I can speak a paragraph with two sentences.
  <p><s>This is sentence one.</s><s>This is sentence two.</s></p>
</speak>

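EDIT

P.S.: According to the documentation, audio in SSML has the following limitations:

Mono is preferred, but stereo is acceptable.
Maximum duration of 120 seconds. If you want to play audio with a longer duration, consider implementing a media response.
5 MB file size limit.
Source URL must use the HTTPS protocol.
Our UserAgent when fetching the audio is "Google-Speech-Actions".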
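
For short clips that fit within these limits, the SSML string can be built by hand. Below is a minimal sketch that wraps a URL in an <audio> element and rejects non-HTTPS sources up front. `escapeXml` and `buildAudioSsml` are hypothetical helper names for this example, not part of any Google library, and the duration and file size limits are not checked here.

```javascript
// Hypothetical helpers for illustration; not part of any Google library.

// Escape characters that would otherwise break the XML markup.
function escapeXml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Build an SSML document that plays one audio clip, with fallback text
// that is spoken if the file cannot be fetched.
function buildAudioSsml(src, fallbackText) {
  // The limitations above say the source URL must use HTTPS.
  if (new URL(src).protocol !== 'https:') {
    throw new Error('Audio source must use HTTPS: ' + src);
  }
  return `<speak><audio src="${escapeXml(src)}">${escapeXml(fallbackText)}</audio></speak>`;
}

console.log(buildAudioSsml(
  'https://www.example.com/MY_MP3_FILE.mp3',
  "didn't get your MP3 audio file"
));
```

The resulting string goes wherever the response expects SSML. For anything longer than the duration limit, a media response is the better fit.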
