I have an Elasticsearch index on which I set "max_ngram_diff": 50, but somehow it only seems to take effect for the edge_ngram tokenizer, not for the ngram tokenizer.

I sent these two requests to the same URL, http://localhost:9201/index-name/_analyze

Request 1

{
    "tokenizer":
    {
        "type": "edge_ngram",
        "min_gram": 3,
        "max_gram": 20,
        "token_chars": [
            "letter",
            "digit"
        ]
    },
    "text": "1234567890;abcdefghijklmn;"
}

Request 2

{
    "tokenizer": {
        "type": "ngram",
        "min_gram": 3,
        "max_gram": 20,
        "token_chars": [
            "letter",
            "digit"
        ]
    },
    "text": "1234567890;abcdefghijklmn;"
}

The first request returns the expected result:

{
    "tokens": [
        {
            "token": "123",
            "start_offset": 0,
            "end_offset": 3,
            "type": "word",
            "position": 0
        },
        {
            "token": "1234",
            "start_offset": 0,
            "end_offset": 4,
            "type": "word",
            "position": 1
        },
        {
            "token": "12345",
            "start_offset": 0,
            "end_offset": 5,
            "type": "word",
            "position": 2
        },
        {
            "token": "123456",
            "start_offset": 0,
            "end_offset": 6,
            "type": "word",
            "position": 3
        }, 
        // more tokens
    ]
}

But the second request only returns this:

{
    "error": {
        "root_cause": [
            {
                "type": "remote_transport_exception",
                "reason": "[ffe18f1a89e6][172.18.0.3:9300][indices:admin/analyze[s]]"
            }
        ],
        "type": "illegal_argument_exception",
        "reason": "The difference between max_gram and min_gram in NGram Tokenizer must be less than or equal to: [1] but was [17]. This limit can be set by changing the [index.max_ngram_diff] index level setting."
    },
    "status": 400
}
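The numbers in this error follow directly from Request 2: max_gram - min_gram = 20 - 3 = 17, and that is compared against a limit of 1, the documented default of index.max_ngram_diff, rather than the configured 50. A minimal sketch of that comparison in Python; the function name `check_ngram_diff` is hypothetical and only mirrors the arithmetic in the message, it is not Elasticsearch's actual code:

```python
def check_ngram_diff(min_gram: int, max_gram: int, max_ngram_diff: int = 1) -> bool:
    """Return True if max_gram - min_gram is within the allowed difference.

    max_ngram_diff defaults to 1, the limit quoted in the error message.
    """
    return (max_gram - min_gram) <= max_ngram_diff


# Request 2 checked against the limit the error reports (1): rejected.
print(check_ngram_diff(3, 20))       # False
# The same request checked against the configured limit (50): accepted.
print(check_ngram_diff(3, 20, 50))   # True
```

So the request itself is consistent with a limit of 50; the error suggests the analyze call was evaluated against the default instead.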

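What is going on? Why can the first request, with the edge_ngram tokenizer, have a difference between max_gram and min_gram greater than 1, while the second request, with the ngram tokenizer, cannot?

Here is my mapping: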

{
    "settings": {
        "index": {
            "max_ngram_diff": 50,
            // further settings
        }
    }
}
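To rule out that the setting was simply not applied, the value the index actually holds can be read back with the get-settings API. The following is a rough sketch, not a definitive client: the helper names `settings_url`, `extract_max_ngram_diff` and `fetch_max_ngram_diff` are hypothetical, and the response shape noted in the comments is an assumption that should be checked against a real response.

```python
import json
import urllib.request

SETTING = "index.max_ngram_diff"


def settings_url(base_url: str, index: str) -> str:
    """Build the get-settings URL for this one setting, including defaults."""
    return (f"{base_url.rstrip('/')}/{index}/_settings/{SETTING}"
            "?include_defaults=true&flat_settings=true")


def extract_max_ngram_diff(body: dict, index: str):
    """Return the explicitly set value, else the default, else None.

    Assumed response shape (flat_settings=true):
    {index: {"settings": {...}, "defaults": {...}}}
    """
    entry = body.get(index, {})
    for section in ("settings", "defaults"):
        value = entry.get(section, {}).get(SETTING)
        if value is not None:
            return value
    return None


def fetch_max_ngram_diff(base_url: str, index: str):
    """Query the cluster and return the effective value of the setting."""
    with urllib.request.urlopen(settings_url(base_url, index)) as resp:
        return extract_max_ngram_diff(json.load(resp), index)
```

With the host and index from the question, the call would be `fetch_max_ngram_diff("http://localhost:9201", "index-name")`. If it reports 50 while _analyze still complains about a limit of 1, the setting is stored correctly and the problem lies in how the analyze request is evaluated.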

The Elasticsearch version in use is 7.2.0.

Thanks for your help!

1 Answer

This behavior is specific to ES version 7.2.0. With ES version 7.4.0 everything works as expected.

Answered 2020-02-27T09:35:23.417