I'm trying to set up a mapping for my Elasticsearch instance that supports both full-name matching and partial-name matching:

curl -XPUT 'http://127.0.0.1:9200/test/?pretty=1'  -d '{
  "mappings": {
    "venue": {
      "properties": {
        "location": {
          "type": "geo_point"
        },
        "name": {
          "fields": {
            "name": {
              "type": "string",
              "analyzer": "full_name"
            },
            "partial": {
              "search_analyzer": "full_name",
              "index_analyzer": "partial_name",
              "type": "string"
            }
          },
          "type": "multi_field"
        }
      }
    }
  },
  "settings": {
    "analysis": {
      "filter": {
        "swedish_snow": {
          "type": "snowball",
          "language": "Swedish"
        },
        "name_synonyms": {
          "type": "synonym",
          "synonyms_path": "name_synonyms.txt"
        },
        "name_ngrams": {
          "side": "front",
          "min_gram": 2,
          "max_gram": 50,
          "type": "edgeNGram"
        }
      },
      "analyzer": {
        "full_name": {
          "filter": [
            "standard",
            "lowercase"
          ],
          "type": "custom",
          "tokenizer": "standard"
        },
        "partial_name": {
          "filter": [
            "swedish_snow",
            "lowercase",
            "name_synonyms",
            "name_ngrams",
            "standard"
          ],
          "type": "custom",
          "tokenizer": "standard"
        }
      }
    }
  }
}'

I populate it with some data:

curl -XPOST 'http://127.0.0.1:9200/_bulk?pretty=1'  -d '
{"index" : {"_index" : "test", "_type" : "venue"}}
{"location" : [59.3366, 18.0315], "name" : "johnssons"}
{"index" : {"_index" : "test", "_type" : "venue"}}
{"location" : [59.3366, 18.0315], "name" : "johnsson"}
{"index" : {"_index" : "test", "_type" : "venue"}}
{"location" : [59.3366, 18.0315], "name" : "jöhnsson"}
'

Then I run some searches to test it. Full name:

curl -XGET 'http://127.0.0.1:9200/test/venue/_search?pretty=1' -d '{
  "query": {
    "bool": {
      "should": [
        {
          "text": {
            "name": {
              "boost": 1,
              "query": "johnsson"
            }
          }
        },
        {
          "text": {
            "name.partial": "johnsson"
          }
        }
      ]
    }
  }
}'

Result:

{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0.29834434,
    "hits": [
      {
        "_index": "test",
        "_type": "venue",
        "_id": "CAO-dDr2TFOuCM4pFfNDSw",
        "_score": 0.29834434,
        "_source": {
          "location": [
            59.3366,
            18.0315
          ],
          "name": "johnsson"
        }
      },
      {
        "_index": "test",
        "_type": "venue",
        "_id": "UQWGn8L9Squ5RYDMd4jqKA",
        "_score": 0.14663845,
        "_source": {
          "location": [
            59.3366,
            18.0315
          ],
          "name": "johnssons"
        }
      }
    ]
  }
}

Partial name:

curl -XGET 'http://127.0.0.1:9200/test/venue/_search?pretty=1' -d '{
  "query": {
    "bool": {
      "should": [
        {
          "text": {
            "name": {
              "boost": 1,
              "query": "johns"
            }
          }
        },
        {
          "text": {
            "name.partial": "johns"
          }
        }
      ]
    }
  }
}'

Result:

{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0.14663845,
    "hits": [
      {
        "_index": "test",
        "_type": "venue",
        "_id": "UQWGn8L9Squ5RYDMd4jqKA",
        "_score": 0.14663845,
        "_source": {
          "location": [
            59.3366,
            18.0315
          ],
          "name": "johnssons"
        }
      },
      {
        "_index": "test",
        "_type": "venue",
        "_id": "CAO-dDr2TFOuCM4pFfNDSw",
        "_score": 0.016878016,
        "_source": {
          "location": [
            59.3366,
            18.0315
          ],
          "name": "johnsson"
        }
      }
    ]
  }
}

Name within name:

curl -XGET 'http://127.0.0.1:9200/test/venue/_search?pretty=1' -d '{
  "query": {
    "bool": {
      "should": [
        {
          "text": {
            "name": {
              "boost": 1,
              "query": "johnssons"
            }
          }
        },
        {
          "text": {
            "name.partial": "johnssons"
          }
        }
      ]
    }
  }
}'

Result:

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.39103588,
    "hits": [
      {
        "_index": "test",
        "_type": "venue",
        "_id": "UQWGn8L9Squ5RYDMd4jqKA",
        "_score": 0.39103588,
        "_source": {
          "location": [
            59.3366,
            18.0315
          ],
          "name": "johnssons"
        }
      }
    ]
  }
}

As you can see, I only get one venue back, which is johnssons. Shouldn't I get both johnssons and johnsson back? What am I doing wrong in my settings?

1 Answer

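You are using the full_name analyzer as the search analyzer for the name.partial field. As a result, your query is translated into a query for the term johnssons, which doesn't match anything.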

You can use the Analyze API to see how your records are indexed. For example, this command

curl -XGET 'http://127.0.0.1:9200/test/_analyze?analyzer=partial_name&pretty=1' -d 'johnssons'

will show you that during indexing the string "johnssons" is translated into the following terms: "jo", "joh", "john", "johns", "johnss", "johnsso", "johnsson". While this command

curl -XGET 'http://127.0.0.1:9200/test/_analyze?analyzer=full_name&pretty=1' -d 'johnssons'

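will show you that during searching the string "johnssons" is translated into the term "johnssons". As you can see, there is no match between your search term and your data here.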
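
To make the mismatch concrete, here is a rough Python sketch that imitates the two analyzers without talking to Elasticsearch. It is only an approximation under stated assumptions: the Swedish snowball stemmer is replaced by a hypothetical lookup table (`ASSUMED_STEMS`) containing the single stem implied by the `_analyze` output above, the synonym filter is ignored because the contents of name_synonyms.txt are unknown, and the helper names (`edge_ngrams`, `full_name_terms`, `partial_name_terms`, `search`) are made up for illustration.

```python
# Hypothetical stand-in for the swedish_snow filter: only the stem implied by
# the _analyze output is hard-coded; every other token passes through as-is.
ASSUMED_STEMS = {"johnssons": "johnsson"}


def edge_ngrams(token, min_gram=2, max_gram=50):
    """Front edge n-grams, mirroring the name_ngrams filter settings."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


def full_name_terms(text):
    """Approximates full_name for single-word input: lowercase, keep whole."""
    return {text.lower()}


def partial_name_terms(text):
    """Approximates partial_name at index time: stem, lowercase, n-gram."""
    stem = ASSUMED_STEMS.get(text, text).lower()
    return set(edge_ngrams(stem))


def search(query, names):
    """Names whose `name` or `name.partial` terms contain the query term."""
    # Both fields are searched with full_name, so there is one query term.
    term = next(iter(full_name_terms(query)))
    return [
        n for n in names
        if term in full_name_terms(n) or term in partial_name_terms(n)
    ]


venues = ["johnssons", "johnsson", "jöhnsson"]
for q in ("johnsson", "johns", "johnssons"):
    print(q, "->", search(q, venues))
```

Under these assumptions the sketch returns the same venues as the three searches in the question: "johnsson" and "johns" find both johnssons and johnsson, while "johnssons" finds only johnssons, because that exact term exists only in the `name` field and never among the n-grams stored in `name.partial`.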

Answered 2012-12-07T12:01:47.690