I have an index with the following mapping:

{"tagged_index":{"mappings":{"tagged":{"properties":{"tags":{"properties":{"resources":{"properties":{"tagName":{"type":"string"},"type":{"type":"string"}}}}},"content":{"type":"string"}}}}}}

Here resources is an array that can hold multiple tags. For example:

{"_id":"82906194","_source":{"tags":{"resources":[{"type":"Person","tagName":"Kim_Kardashian",},{"type":"Person","tagName":"Kanye_West",},{"type":"City","tagName":"New_York",},...},"content":" Popular NEWS ..."}} , {"_id":"82906195","_source":{"tags":{"resources":[{"type":"City","tagName":"London",},{"type":"Country","tagName":"USA",},{"type":"Music","tagName":"Hello",},...},"content":" Adele's Hello..."}}, ...

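I know how to extract the most frequent terms [tagName] with the query below, but I don't want the tag names of every type. How can I extract only the terms of one type, e.g. cities only [type:City]? (I want the list of tagName values whose type is City, i.e. London, New_York, Berlin, ...)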

{"size":0,"query":{"filtered":{"query":{"query_string":{"query":"*","analyze_wildcard":true}}}},"aggs":{"Cities":{"terms":{"field":"tags.resources.tagName","size":10,"order":{"_count":"desc"}}}}}

This is what the desired output looks like:

{"took":1200,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":5179261,"max_score":0.0,"hits":[]},"aggregations":{"Cities":{"doc_count_error_upper_bound":46737,"sum_other_doc_count":36037440,"buckets":[{"key":"London","doc_count":332820},{"key":"New_York","doc_count":211274},{"key":"Berlin","doc_count":156954},{"key":"Amsterdam","doc_count":132173},...

1 Answer

You can try this:

{
  "_source": ["tags.resources.tagName"],
  "query": {
    "term": {
      "tags.resources.type": {
        "value": "City"
      }
    }
  }
}

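The query above fetches the documents that contain a resource of type City, as long as resources is mapped as an object type. Note that a term query is not analyzed, so if the field uses the default analyzer the value may need to be lowercase ("city").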

After the edit

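The problem is to group the tag names that belong to the city type. That is not possible with your current mapping, because an array of plain objects does not keep each type paired with its own tagName. You have to change the resources field to the nested type.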
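
To illustrate why the object mapping cannot do this, here is a minimal Python sketch (not Elasticsearch code) that mimics how an array of plain objects ends up as one multi-valued field per leaf. The helper name flatten_object_array and the three-item sample document are assumptions made for illustration only.

```python
# Sample document shaped like the first one in the question (the elided "..." items are left out).
doc = {
    "tags": {
        "resources": [
            {"type": "Person", "tagName": "Kim_Kardashian"},
            {"type": "Person", "tagName": "Kanye_West"},
            {"type": "City", "tagName": "New_York"},
        ]
    }
}


def flatten_object_array(resources):
    """Hypothetical helper: collapse an array of objects into one value list per leaf field."""
    flat = {}
    for item in resources:
        for key, value in item.items():
            flat.setdefault("tags.resources." + key, []).append(value)
    return flat


flat = flatten_object_array(doc["tags"]["resources"])

# The pairing between "City" and "New_York" is gone: an aggregation on
# tags.resources.tagName sees every tag name of any document that matched.
print(flat["tags.resources.type"])     # ['Person', 'Person', 'City']
print(flat["tags.resources.tagName"])  # ['Kim_Kardashian', 'Kanye_West', 'New_York']
```

With the nested type, each element of resources is kept as its own hidden document, so the type and tagName of one element stay together.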

The mapping would look like this:

 "mappings": {
     "resource": {
        "properties": {
           "tags": {
              "properties": {
                 "content": {
                    "type": "string"
                 },
                 "resources": {
                    "type": "nested",
                    "properties": {
                       "tagName": {
                          "type": "string"
                       },
                       "type": {
                          "type": "string"
                       }
                    }
                 }
              }
           }
        }
     }
  }

The final query would be:

{
 "size": 0,
 "query": {
  "nested": {
     "path": "tags.resources",
     "query": {
        "match": {
           "tags.resources.type": "city"
         }
      }
    }
  },
  "aggs": {
  "resources Nested path": {
     "nested": {
        "path": "tags.resources"
     },
     "aggs": {
        "city type": {
           "filter": {
              "term": {
                 "tags.resources.type": "city"
              }
           },
           "aggs": {
              "group By tagName": {
                 "terms": {
                    "field": "tags.resources.tagName"
                 }
              }
           }
         }
       }
     }
   }
 }  
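
The three aggregation levels in the query (nested, then filter, then terms) can be sketched in plain Python to show what each step counts. This is only a simulation under assumptions: the two sample documents are trimmed versions of the ones in the question with lowercased values (as an analyzed string field would index them), and city_tag_counts is a hypothetical helper, not an Elasticsearch API.

```python
from collections import Counter

# Two sample documents, trimmed to three resources each (assumption).
docs = [
    {"tags": {"resources": [
        {"type": "person", "tagName": "kim_kardashian"},
        {"type": "person", "tagName": "kanye_west"},
        {"type": "city", "tagName": "new_york"},
    ]}},
    {"tags": {"resources": [
        {"type": "city", "tagName": "london"},
        {"type": "country", "tagName": "usa"},
        {"type": "music", "tagName": "hello"},
    ]}},
]


def city_tag_counts(documents, wanted_type="city"):
    """Hypothetical helper mirroring the nested -> filter -> terms steps."""
    # "nested": every element of tags.resources is treated as its own document.
    nested = [r for d in documents for r in d["tags"]["resources"]]
    # "filter": keep only the nested documents of the wanted type.
    filtered = [r for r in nested if r["type"] == wanted_type]
    # "terms": count the tagName values of what is left.
    counts = Counter(r["tagName"] for r in filtered)
    return len(nested), len(filtered), counts


nested_count, city_count, buckets = city_tag_counts(docs)
print(nested_count, city_count, dict(buckets))
```

The first two numbers correspond to the doc_count values of the nested and filter levels, and the counter corresponds to the buckets of the terms level.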

The output would be:

"aggregations": {
  "resources Nested path": {
     "doc_count": 6,
     "city type": {
        "doc_count": 2,
        "group By tagName": {
           "doc_count_error_upper_bound": 0,
           "sum_other_doc_count": 0,
           "buckets": [
              {
                 "key": "london",
                 "doc_count": 1
              },
              {
                 "key": "new_york",
                 "doc_count": 1
              }
           ]
         }
       }
     } 
   }
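
To get the plain list of city names out of this response, one option is to walk the aggregation names used in the query. The sketch below assumes the response has already been parsed into a Python dict; city_names is a hypothetical helper, and the dict simply repeats the output shown above.

```python
# The aggregations part of the response, as a parsed dict.
response = {
    "aggregations": {
        "resources Nested path": {
            "doc_count": 6,
            "city type": {
                "doc_count": 2,
                "group By tagName": {
                    "doc_count_error_upper_bound": 0,
                    "sum_other_doc_count": 0,
                    "buckets": [
                        {"key": "london", "doc_count": 1},
                        {"key": "new_york", "doc_count": 1},
                    ],
                },
            },
        }
    }
}


def city_names(resp):
    """Hypothetical helper: return the bucket keys of the innermost terms aggregation."""
    terms = resp["aggregations"]["resources Nested path"]["city type"]["group By tagName"]
    return [bucket["key"] for bucket in terms["buckets"]]


print(city_names(response))  # ['london', 'new_york']
```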
answered 2017-05-11T15:43:01.830