
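I'm trying to run a terms aggregation in Elasticsearch on the data below, using the following query, but the output breaks the names into tokens (see the output below). So I tried mapping os_name as a multi_field, and now I can't query by it. Is it possible to have an index without tokens, e.g. "Fedora Core"?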

Query:

GET /temp/example/_search
{
  "size": 0,
  "aggs": {
     "OS": {
       "terms": {
           "field": "os_name"
       }
     }
  }
}

Data:

...
    {
        "_index": "temp",
        "_type": "example",
        "_id": "3",
        "_score": 1,
        "_source": {
           "title": "system3",
           "os_name": "Fedora Core",
           "os_version": 18
        }
     },
     {
        "_index": "temp",
        "_type": "example",
        "_id": "1",
        "_score": 1,
        "_source": {
           "title": "system1",
           "os_name": "Fedora Core",
           "os_version": 20
        }
     },
     {
        "_index": "temp",
        "_type": "example",
        "_id": "2",
        "_score": 1,
        "_source": {
           "title": "backup",
           "os_name": "Yellow Dog",
           "os_version": 6
        }
     }
...

Output:

       ...
        {
           "key": "core",
           "doc_count": 2
        },
        {
           "key": "fedora",
           "doc_count": 2
        },
        {
           "key": "dog",
           "doc_count": 1
        },
        {
           "key": "yellow",
           "doc_count": 1
        }
       ...

Mapping:

PUT /temp
{
  "mappings": {
    "example": {
      "properties": {
        "os_name": {
          "type": "string"
        },
        "os_version": {
          "type": "long"
        },
        "title": {
          "type": "string"
        }
      }
    }
  }
}

2 Answers


You should actually change your mapping like this:

"os_name": {
  "type": "string",
  "fields": {
     "raw": {
        "type": "string",
        "index": "not_analyzed"
     }
  }
},

and your aggs should be changed to:

GET /temp/example/_search
{
  "size": 0,
  "aggs": {
     "OS": {
       "terms": {
           "field": "os_name.raw"
       }
     }
  }
}
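
With the multi-field mapping, the analyzed `os_name` field remains searchable while `os_name.raw` supplies whole-value buckets, which addresses the "can't query by it" part of the question. A minimal sketch, assuming the node listens on `localhost:9200` and that the documents were reindexed after the mapping change (documents indexed earlier will not have the `raw` sub-field populated):

```shell
# Full-text match on the analyzed field, terms aggregation on the
# not_analyzed sub-field. Host and port are assumptions.
curl -XGET 'localhost:9200/temp/example/_search?pretty=true' -d '
{
  "size": 0,
  "query": {
    "match": { "os_name": "fedora" }
  },
  "aggs": {
    "OS": {
      "terms": { "field": "os_name.raw" }
    }
  }
}'
```

For the sample data in the question, this should narrow the aggregation to the documents matching "fedora" and report them under a single "Fedora Core" bucket.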
answered 2014-11-13T11:27:33.803

One working solution is to set the field to not_analyzed (read more about it in the documentation for the "index" attribute).

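This solution does not analyze the input at all. Depending on your requirements, you may want to set up a custom analyzer instead, e.g. one that does not split words but lowercases them, to get case-insensitive results.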

curl -XDELETE localhost:9200/temp
curl -XPUT localhost:9200/temp -d '
{
  "mappings": {
    "example": {
      "properties": {
        "os_name": {
          "type": "string",
          "index" : "not_analyzed"
        },
        "os_version": {
          "type": "long"
        },
        "title": {
          "type": "string"
        }
      }
    }
  }
}'

curl -XPUT localhost:9200/temp/example/1 -d '
{
    "title": "system3",
    "os_name": "Fedora Core",
    "os_version": 18
}'

curl -XPUT localhost:9200/temp/example/2 -d '
{
    "title": "system1",
    "os_name": "Fedora Core",
    "os_version": 20
}'

curl -XPUT localhost:9200/temp/example/3 -d '
{
    "title": "backup",
    "os_name": "Yellow Dog",
    "os_version": 6
}'

curl -XGET 'localhost:9200/temp/example/_search?pretty=true' -d '
{
  "size": 0,
  "aggs": {
     "OS": {
       "terms": {
           "field": "os_name"
       }
     }
  }
}'

Output:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "OS" : {
      "buckets" : [ {
        "key" : "Fedora Core",
        "doc_count" : 2
      }, {
        "key" : "Yellow Dog",
        "doc_count" : 1
      } ]
    }
  }
}
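
The custom-analyzer variant mentioned earlier in this answer could look roughly like the following. This is a sketch, not a tested configuration: the analyzer name `lowercase_keyword` is made up for illustration, and it combines the built-in `keyword` tokenizer (which emits the whole input as one token) with the `lowercase` token filter. Only the `os_name` property is shown; the other fields would be mapped as above.

```shell
# Recreate the index with a custom analyzer that keeps the value as a
# single token but lowercases it. "lowercase_keyword" is a hypothetical name.
curl -XDELETE localhost:9200/temp
curl -XPUT localhost:9200/temp -d '
{
  "settings": {
    "analysis": {
      "analyzer": {
        "lowercase_keyword": {
          "type": "custom",
          "tokenizer": "keyword",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "example": {
      "properties": {
        "os_name": {
          "type": "string",
          "analyzer": "lowercase_keyword"
        }
      }
    }
  }
}'
```

With this mapping the terms aggregation on `os_name` would be expected to return lowercased keys such as "fedora core", so "Fedora Core" and "fedora core" fall into the same bucket.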
answered 2014-05-21T08:20:58.917