There are many machine learning APIs for scanning images, but they only return a bunch of tags.
https://azure.microsoft.com/en-gb/services/cognitive-services/computer-vision/
{ "tags": [ "train", "platform", "station", "building", "indoor", "subway", "track", "walking", "waiting", "pulling", "board", "people", "man", "luggage", "standing", "holding", "large", "woman", "yellow", "suitcase" ], "confidence": 0.833099365 } ] }
Is there an API that combines these tags into a sentence? MS Cognitive Vision is the only one I have found that produces a complete caption:
"captions": [ { "text": "people waiting at a train station",
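For anyone who wants to work with both fields, here is a minimal sketch of pulling the tags and the highest-confidence caption out of such a response. The nesting under a `description` key and the pairing of `confidence` with the caption are assumptions pieced together from the fragments above, not a confirmed schema, and the helper names are made up for illustration.

```python
import json

# Assumed response shape, reconstructed from the fragments quoted above.
# The "description" wrapper is an assumption; the tag list is shortened.
SAMPLE_RESPONSE = """
{
  "description": {
    "tags": ["train", "platform", "station"],
    "captions": [
      {"text": "people waiting at a train station", "confidence": 0.833099365}
    ]
  }
}
"""


def extract_tags(response):
    """Return the list of tags, or an empty list if none are present."""
    return response.get("description", {}).get("tags", [])


def best_caption(response):
    """Return the text of the highest-confidence caption, or None."""
    captions = response.get("description", {}).get("captions", [])
    if not captions:
        return None
    best = max(captions, key=lambda c: c.get("confidence", 0.0))
    return best["text"]


response = json.loads(SAMPLE_RESPONSE)
tags = extract_tags(response)
caption = best_caption(response)
```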
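Google's Natural Language API, which offers sentiment analysis, can also break a sentence into its grammatical parts, but is there an API that does the reverse?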
https://cloud.google.com/natural-language/docs/basics
INPUT:
"train", "platform", "station", "building", "indoor", "subway", "track", "walking", "waiting", "pulling", "board", "people",
"man", "luggage", "standing", "holding", "large", "woman", "yellow", "suitcase"
OUTPUT:
"people waiting at a train station"
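To make the contract I am after concrete, here is a deliberately naive sketch of a tags-to-sentence function. It is not an existing API: the word lists and the `tags_to_sentence` name are hypothetical, and the fixed template only works for tags it already knows. It picks the first subject, action, and place in tag order, so it does not reproduce the caption above. That gap is exactly what I am hoping a real service would fill.

```python
# Hypothetical hand-written vocabularies; a real service would need to
# learn these roles rather than rely on fixed lists.
SUBJECTS = ("people", "man", "woman")
ACTIONS = ("walking", "waiting", "standing", "holding", "pulling")
PLACES = ("station", "platform", "subway", "building")


def first_match(tags, vocabulary):
    """Return the first tag (in tag order) found in vocabulary, else None."""
    for tag in tags:
        if tag in vocabulary:
            return tag
    return None


def tags_to_sentence(tags):
    """Fill the template '<subject> <action> at a <place>' from the tags."""
    subject = first_match(tags, SUBJECTS)
    action = first_match(tags, ACTIONS)
    place = first_match(tags, PLACES)
    parts = [word for word in (subject, action) if word]
    if place:
        parts.append(f"at a {place}")
    return " ".join(parts)


tags = [
    "train", "platform", "station", "building", "indoor", "subway", "track",
    "walking", "waiting", "pulling", "board", "people", "man", "luggage",
    "standing", "holding", "large", "woman", "yellow", "suitcase",
]
sentence = tags_to_sentence(tags)  # "people walking at a platform"
```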