在处理用户的一个查询时,最初我认为他使用的是最新版本,而当他显示分析 API时,这令人惊讶。
需要检查令牌的自定义分析器
{
"settings": {
"analysis": {
"filter": {
"splcharfilter": {
"type": "pattern_capture",
"preserve_original": true,
"patterns": [
"([?/])"
]
}
},
"analyzer": {
"splcharanalyzer": {
"tokenizer": "keyword",
"filter": [
"splcharfilter",
"lowercase"
]
}
}
}
}
}
分析 API
POST /_analyze
{
"analyzer": "splcharanalyzer",
"text" : "And/or"
}
Output
{
"tokens": [
{
"token": "analyzer", --> why this token
"start_offset": 7,
"end_offset": 15,
"type": "<ALPHANUM>",
"position": 1
},
{
"token": "splcharanalyzer", --> why this token
"start_offset": 19,
"end_offset": 34,
"type": "<ALPHANUM>",
"position": 2
},
{
"token": "text", --> why this token
"start_offset": 42,
"end_offset": 46,
"type": "<ALPHANUM>",
"position": 3
},
{
"token": "and",
"start_offset": 51,
"end_offset": 54,
"type": "<ALPHANUM>",
"position": 4
},
{
"token": "or",
"start_offset": 58,
"end_offset": 60,
"type": "<ALPHANUM>",
"position": 5
}
]
}
正如上面清楚显示的那样,它生成了这么多不正确的令牌,当检查时用户提到他使用的是 1.7 版本并遵循最新版本的 elasticsearch 中提供的语法。