elasticsearch - Elasticsearch custom_score multiplication is inaccurate

Question

I have inserted some documents that are the same except for one field named a.

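When the script of the custom_score query is set to just _score, the resulting score for a particular query matching some of the fields is 0.40464813. For the same query, when the script is then changed to _score * a (mvel), where a is 9.908349251612433, the final score becomes 4.0619955.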

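Now, if I run this calculation in Chrome's JS console, I get 4.009394996051871.

4.0619955 (Elasticsearch)
4.009394996051871 (Chrome)

That is a big difference, and it produces an incorrect ordering of results. Why does this happen, and is there a way to correct it?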

score 1 · Accepted Answer

If I run a simple calculation using the numbers you provided, I get the result you expect.

curl -XPOST 'http://127.0.0.1:9200/test/test?pretty=1'  -d '
{
   "a" : 9.90834925161243
}
'

curl -XGET 'http://127.0.0.1:9200/test/test/_search?pretty=1'  -d '
{
   "query" : {
      "custom_score" : {
         "script" : "0.40464813 *doc[\u0027a\u0027].value",
         "query" : {
            "match_all" : {}
         }
      }
   }
}
'

# {
#    "hits" : {
#       "hits" : [
#          {
#             "_source" : {
#                "a" : 9.90834925161243
#             },
#             "_score" : 4.009395,
#             "_index" : "test",
#             "_id" : "lPesz0j6RT-Xt76aATcFOw",
#             "_type" : "test"
#          }
#       ],
#       "max_score" : 4.009395,
#       "total" : 1
#    },
#    "timed_out" : false,
#    "_shards" : {
#       "failed" : 0,
#       "successful" : 5,
#       "total" : 5
#    },
#    "took" : 1
# }
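
The same arithmetic can be checked outside Elasticsearch. This is only a rough sketch using the numbers quoted in the question; the variable names `expected` and `implied` are made up for illustration, and awk's double-precision result is rounded for display.

```shell
# Product the question expects: _score * a, rounded to 6 decimals.
expected=$(LC_ALL=C awk 'BEGIN { printf "%.6f", 0.40464813 * 9.908349251612433 }')

# Base _score implied by the reported result: 4.0619955 / a, rounded to 5 decimals.
implied=$(LC_ALL=C awk 'BEGIN { printf "%.5f", 4.0619955 / 9.908349251612433 }')

echo "expected product: $expected"
echo "implied _score:   $implied"
```

The product agrees with the 4.009395 returned above, while the implied base score comes out near 0.40996 rather than 0.40464813. That suggests the _score fed into the script differed between the two runs, rather than the multiplication itself being off.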

I think what you are running into here is testing with too little data spread across multiple shards.

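By default, document frequencies are calculated per shard. So if you have two identical documents on shard_1 and one document on shard_2, the documents on shard_1 will score lower than the document on shard_2.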

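With more data, document frequencies tend to even out across shards. But when testing with small amounts of data, you want to either create an index with only one shard, or add search_type=dfs_query_then_fetch to the query string parameters.

This calculates the global document frequencies across all involved shards before calculating the scores.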
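
Both options might look roughly like this. It is a sketch only, assuming the same local node and `test` index as the commands above and an Elasticsearch version of the same era; the function names are hypothetical, and the `match_all` query is a stand-in for the real query from the question.

```shell
# Option 1: recreate the index with a single shard, so there is only
# one set of document frequencies.
create_single_shard_index() {
  curl -XPUT 'http://127.0.0.1:9200/test/?pretty=1' -d '
{
   "settings" : {
      "number_of_shards" : 1
   }
}
'
}

# Option 2: keep the shards, but ask for global document frequencies
# by adding search_type=dfs_query_then_fetch to the query string.
search_with_global_df() {
  curl -XGET 'http://127.0.0.1:9200/test/test/_search?pretty=1&search_type=dfs_query_then_fetch' -d '
{
   "query" : {
      "custom_score" : {
         "script" : "_score * doc[\u0027a\u0027].value",
         "query" : {
            "match_all" : {}
         }
      }
   }
}
'
}
```

With a node running, call `create_single_shard_index` before indexing the test documents, or call `search_with_global_df` against the existing index.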

If you set explain to true in your query, you can see exactly how your score is calculated.
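
A minimal sketch of such a request, again assuming the local `test` index used above; `search_with_explain` is a hypothetical helper name, and `match_all` stands in for the real query from the question.

```shell
# Same custom_score search, with "explain" enabled so that each hit
# carries a breakdown of how its score was derived.
search_with_explain() {
  curl -XGET 'http://127.0.0.1:9200/test/test/_search?pretty=1' -d '
{
   "explain" : true,
   "query" : {
      "custom_score" : {
         "script" : "_score * doc[\u0027a\u0027].value",
         "query" : {
            "match_all" : {}
         }
      }
   }
}
'
}
```

Running `search_with_explain` against a live node should add an explanation section to each hit, which makes it easier to see which input to the score differs between documents on different shards.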
