I'm trying to build a tag cloud of words and phrases using Elasticsearch's facets feature.

My mapping:

curl -XPOST http://localhost:9200/myIndex/ -d '{

  ...

  "analysis":{  
    "filter":{ 
      "myCustomShingle":{
        "type":"shingle",
        "max_shingle_size":3,
        "output_unigrams":true
      }
    },
    "analyzer":{ //making a custom analyzer
      "myAnalyzer":{
        "type":"custom",
        "tokenizer":"standard",
        "filter":[
          "lowercase",
          "myCustomShingle",
          "stop"
        ]
      } 
    }
  }

  ...
},
"mappings":{

   ...


   "description":{ //the field to be analyzed for making the tag cloud
     "type":"string",
     "analyzer":"myAnalyzer",
     "null_value" : "null"
   },


   ...



}

The query that generates the facets:

# "exclude" lists the terms to drop from the facet
curl -X POST "http://localhost:9200/myIndex/myType/_search?pretty=true" -d '
{
  "size": 0,
  "query": {
    "match_all": {}
  },
  "facets": {
    "blah": {
      "terms": {
        "fields": ["description"],
        "exclude": ["evil"],
        "size": 50
      }
    }
  }
}'

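My problem is that when I put a word, say "evil", into the facet's "exclude" option, it successfully removes the facet entry for the single word (the one-word shingle) "evil". But it does not remove the 2- and 3-word shingles such as "resident evil", "evil computer", or "my evil cat". How can I remove the facet entries for phrases that contain an excluded word?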

1 Answer

It isn't completely clear what you want to achieve. You usually wouldn't build facets on analyzed fields. Maybe you could explain why you're producing shingles, so that we can help you achieve what you want in a better way.

With the exclude facet parameter you can exclude specific entries, but "evil" is not the same as "resident evil". If you want to exclude the latter, you need to specify it explicitly. Facets are built from indexed terms, and because of the shingle filter "resident evil" is in fact a single term in the index, distinct from the term "evil".
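
To make the point above concrete, here is a minimal Python sketch that imitates what a shingle filter with max_shingle_size 3 and output_unigrams true produces, and then applies an exact-match exclusion. It is only an illustration of the idea, not Elasticsearch's implementation; the function names `shingles` and `exclude_exact` and the sample sentence are made up for this example.

```python
def shingles(tokens, max_size=3, output_unigrams=True):
    """Return word n-grams of up to max_size tokens, optionally with unigrams."""
    min_size = 1 if output_unigrams else 2
    terms = []
    for start in range(len(tokens)):
        for size in range(min_size, max_size + 1):
            if start + size <= len(tokens):
                terms.append(" ".join(tokens[start:start + size]))
    return terms


def exclude_exact(terms, excluded):
    """Drop only the terms that are exactly equal to an excluded entry."""
    excluded = set(excluded)
    return [t for t in terms if t not in excluded]


# Hypothetical field value, already lowercased and split into tokens.
terms = shingles("my resident evil game".split())
kept = exclude_exact(terms, ["evil"])

for term in kept:
    print(term)
```

The lone term "evil" disappears, while "resident evil", "my resident evil" and the other multi-word shingles survive, because each of them is a different term from the excluded one.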

Given the choice that you already made for indexing and faceting, there is a way to achieve what you want. Elasticsearch has a really powerful scripting module. You can use a script to decide whether each entry should be included in the facet or not like this:

{
  "query": {
    "match_all" : {}
  },
  "facets": {
    "tags": {
      "terms": {
        "field" : "tags",
        "script" : "term.contains('evil') ? false : true"
      }
    }
  }
}
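
If scripting is not available, a different option (not the script approach above) is to filter the facet entries on the client after the response arrives. The sketch below assumes the terms facet response carries a "terms" list of objects with "term" and "count" keys; the helper name `drop_entries_containing` and the sample data are hypothetical.

```python
def drop_entries_containing(facet, banned_words):
    """Keep only facet entries whose term shares no word with banned_words."""
    banned = {word.lower() for word in banned_words}
    return [
        entry
        for entry in facet["terms"]
        if not banned.intersection(entry["term"].lower().split())
    ]


# Hypothetical facet section taken from a search response.
facet = {
    "terms": [
        {"term": "evil", "count": 7},
        {"term": "cat", "count": 5},
        {"term": "resident evil", "count": 4},
        {"term": "my evil cat", "count": 3},
        {"term": "devil", "count": 2},
    ]
}

filtered = drop_entries_containing(facet, ["evil"])
for entry in filtered:
    print(entry["term"], entry["count"])
```

Note that this matches whole words, so "devil" is kept, whereas a substring check such as contains('evil') would drop it as well. Because entries are removed after the fact, the request would probably need a larger facet size than the number of tags finally displayed.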
answered 2012-10-08T10:25:50.473