I have been testing Google's Vision API to attach labels to different images.

For a given image, I get a result like this:

"google_labels": {
            "responses": [{
                "labelAnnotations": [{
                    "score": 0.8966763,
                    "description": "food",
                    "mid": "/m/02wbm"
                }, {
                    "score": 0.80512983,
                    "description": "produce",
                    "mid": "/m/036qh8"
                }, {
                    "score": 0.73635191,
                    "description": "juice",
                    "mid": "/m/01z1kdw"
                }, {
                    "score": 0.69849229,
                    "description": "meal",
                    "mid": "/m/0krfg"
                }, {
                    "score": 0.53875387,
                    "description": "fruit",
                    "mid": "/m/02xwb"
                }]
            }]
        }

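--> My questions:

  1. Does anyone know whether Google has published its full list of labels (['produce', 'meal', ...]), and where I could find it?
  2. Is there any structure to these labels? For example, is it known that "food" is a superset of "produce"?

My guess is "no" and "no", since I couldn't find anything, but maybe I'm wrong. Thanks!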

2 Answers

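While I can't verify that the list is complete, the Google Open Images project has a list of roughly 20k classes.

If you browse to the download page, you can download the list, with descriptions, as a CSV.

I checked a few reference images in CloudVision, with the following results: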

ID / CloudVision Classification / OpenImages Classification
1. 01ssh5 / Shoulder / Shoulder (Body Part)
2. 09cx8 / Finger / Finger
3. 068jd / Photograph / Photograph
4. 01k74n / Facial expression / Facial expression
5. 04hgtk / Head / Human Head

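I was able to find all of these IDs in the CSV with the same meaning, so as a base list this should be sufficient. Note that you should always match by ID rather than by classification name, since the names differ slightly.

If you find any ID in CloudVision that is not in the list, I'd love to hear about it in the comments!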
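
Matching by ID can be sketched roughly as follows. This is a minimal illustration, not the exact Open Images tooling: it assumes the downloaded CSV has two columns (mid, description) with no header row, and that the bare IDs in the comparison above correspond to mids with a `/m/` prefix. The function names are hypothetical, and the sample rows stand in for the real file.

```python
import csv
import io


def normalize_mid(mid: str) -> str:
    """Return the mid with a leading '/m/' if it was given as a bare ID (assumption)."""
    return mid if mid.startswith("/") else "/m/" + mid


def load_descriptions(csv_file) -> dict:
    """Read rows of (mid, description) into a dict keyed by mid."""
    return {row[0]: row[1] for row in csv.reader(csv_file) if len(row) >= 2}


def match_labels(annotations, descriptions):
    """Pair each Vision label with the Open Images name for the same mid, or None."""
    return [
        (a["mid"], a["description"], descriptions.get(normalize_mid(a["mid"])))
        for a in annotations
    ]


# Stand-in for the downloaded CSV; the rows are taken from the comparison above.
sample_csv = io.StringIO(
    "/m/01ssh5,Shoulder (Body Part)\n"
    "/m/04hgtk,Human Head\n"
)
descriptions = load_descriptions(sample_csv)

annotations = [
    {"mid": "/m/04hgtk", "description": "Head"},
    {"mid": "/m/02wbm", "description": "food"},
]
for mid, vision_name, open_images_name in match_labels(annotations, descriptions):
    print(mid, vision_name, open_images_name or "(not in list)")
```

With the real file, `load_descriptions(open(path, newline="", encoding="utf-8"))` would replace the in-memory sample, where `path` is wherever you saved the download.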

Answered 2019-08-25T19:31:43.833

There is an API for looking them up, the Google Knowledge Graph Search API:

https://developers.google.com/knowledge-graph/reference/rest/v1/

It is linked at the bottom of the Google Vision API documentation:

https://cloud.google.com/vision/docs/labels
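
A lookup by mid might look roughly like the sketch below. It assumes the `entities:search` endpoint with a repeated `ids` parameter and a response shaped as `itemListElement[].result`, as I understand the reference above; `YOUR_API_KEY` is a placeholder and the function names are hypothetical. Only the URL builder runs without network access.

```python
import json
import urllib.parse
import urllib.request

KG_ENDPOINT = "https://kgsearch.googleapis.com/v1/entities:search"


def build_kg_url(mids, api_key):
    """Build an entities:search URL that looks up the given mids by ID."""
    params = [("ids", mid) for mid in mids]  # 'ids' is repeated once per mid
    params += [("key", api_key), ("limit", str(len(mids)))]
    return KG_ENDPOINT + "?" + urllib.parse.urlencode(params)


def lookup_mids(mids, api_key):
    """Fetch the entities and return {entity @id: name}; needs network and a valid key."""
    with urllib.request.urlopen(build_kg_url(mids, api_key)) as resp:
        data = json.load(resp)
    return {
        item["result"].get("@id"): item["result"].get("name")
        for item in data.get("itemListElement", [])
    }


# Mids taken from the question's labelAnnotations.
print(build_kg_url(["/m/02wbm", "/m/02xwb"], "YOUR_API_KEY"))
```

An empty result from `lookup_mids` for a given mid should not be read as "Vision never uses this label"; it may only mean the Knowledge Graph API has no entry to return for it.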


Edit: more info

OK, mids starting with /g/ are Google entities and mids starting with /m/ are Freebase identifiers, but the Google Knowledge Graph API doesn't always return them.

This data is public and can be downloaded, but there are too many records in the database, and Google hasn't published which of them it uses.
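
As a small illustration of the prefix rule, a helper like this could sort mids by origin before deciding where to look them up. The function name and the returned strings are made up for the example.

```python
def mid_namespace(mid: str) -> str:
    """Classify a mid by its prefix: '/m/' is Freebase, '/g/' is a Google entity."""
    if mid.startswith("/m/"):
        return "freebase"
    if mid.startswith("/g/"):
        return "google"
    return "unknown"


print(mid_namespace("/m/01r28c"))
```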

Example of a MID returned by the Vision API and the corresponding record in Wikidata:

{
    desc: "institution",
    mid: "/m/01r28c",
    score: 72.29216694831848,
    confidence: 0,
    locations: [ ],
    properties: [ ]
},

https://www.wikidata.org/wiki/Q178706


The last Freebase dump can be downloaded here:

https://developers.google.com/freebase/

Answered 2017-07-17T18:26:31.570