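I'm having trouble understanding how Elasticsearch computes weights in its scoring. As I understand it, unless you use Dismax, a document's score is the sum of all field weights, not the score of the highest-scoring field. Also, the calculation I see is quite different from what the documentation describes.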

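Looking at the query and explanation I've provided below, I have three questions:

  1. How can docCount be different for Name and Description on the same document?
  2. Why is the score based on the maximum field weight?
  3. If only 5 documents in the whole index contain the search term, why is the document frequency (docFreq) 6?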

Thanks in advance.

Query

GET localhost_document/_search?explain=1&pretty=1&search_type=dfs_query_then_fetch

        {
          "query": {
            "multi_match" : {
              "query":    "lhc", 
              "fields": [  "Metadata.Name", "Metadata.Description^5" ] 
            }
          }
        }

Explanation

"_explanation": {
               "value": 28.427635,
               "description": "max of:",
               "details": [
                  {
                     "value": 28.427635,
                     "description": "weight(Metadata.Description:lhc in 0) [PerFieldSimilarity], result of:",
                     "details": [
                        {
                           "value": 28.427635,
                           "description": "score(doc=0,freq=1.0 = termFreq=1.0\n), product of:",
                           "details": [
                              {
                                 "value": 5,
                                 "description": "boost",
                                 "details": []
                              },
                              {
                                 "value": 5.3759904,
                                 "description": "idf, computed as log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)) from:",
                                 "details": [
                                    {
                                       "value": 6,
                                       "description": "docFreq",
                                       "details": []
                                    },
                                    {
                                       "value": 1404,
                                       "description": "docCount",
                                       "details": []
                                    }
                                 ]
                              },
                              {
                                 "value": 1.0575776,
                                 "description": "tfNorm, computed as (freq * (k1 + 1)) / (freq + k1 * (1 - b + b * fieldLength / avgFieldLength)) from:",
                                 "details": [
                                    {
                                       "value": 1,
                                       "description": "termFreq=1.0",
                                       "details": []
                                    },
                                    {
                                       "value": 1.2,
                                       "description": "parameter k1",
                                       "details": []
                                    },
                                    {
                                       "value": 0.75,
                                       "description": "parameter b",
                                       "details": []
                                    },
                                    {
                                       "value": 2.9529915,
                                       "description": "avgFieldLength",
                                       "details": []
                                    },
                                    {
                                       "value": 2.56,
                                       "description": "fieldLength",
                                       "details": []
                                    }
                                 ]
                              }
                           ]
                        }
                     ]
                  },
                  {
                     "value": 4.2207813,
                     "description": "weight(Metadata.Name:lhc in 0) [PerFieldSimilarity], result of:",
                     "details": [
                        {
                           "value": 4.2207813,
                           "description": "score(doc=0,freq=1.0 = termFreq=1.0\n), product of:",
                           "details": [
                              {
                                 "value": 5.7578497,
                                 "description": "idf, computed as log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)) from:",
                                 "details": [
                                    {
                                       "value": 16,
                                       "description": "docFreq",
                                       "details": []
                                    },
                                    {
                                       "value": 5224,
                                       "description": "docCount",
                                       "details": []
                                    }
                                 ]
                              },
                              {
                                 "value": 0.7330482,
                                 "description": "tfNorm, computed as (freq * (k1 + 1)) / (freq + k1 * (1 - b + b * fieldLength / avgFieldLength)) from:",
                                 "details": [
                                    {
                                       "value": 1,
                                       "description": "termFreq=1.0",
                                       "details": []
                                    },
                                    {
                                       "value": 1.2,
                                       "description": "parameter k1",
                                       "details": []
                                    },
                                    {
                                       "value": 0.75,
                                       "description": "parameter b",
                                       "details": []
                                    },
                                    {
                                       "value": 2.1161945,
                                       "description": "avgFieldLength",
                                       "details": []
                                    },
                                    {
                                       "value": 4,
                                       "description": "fieldLength",
                                       "details": []
                                    }
                                 ]
                              }
                           ]
                        }
                     ]
                  }
               ]
            }
         }
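To make the numbers easier to check, here is a minimal Python sketch that plugs the values from the explanation into the two formulas it prints (`idf` and `tfNorm`). The function names `bm25_idf` and `bm25_tf_norm` are my own and are not part of any Elasticsearch API; all inputs are copied from the output above. The results should land very close to the reported 28.427635 and 4.2207813, with the top-level value being the larger of the two.

```python
import math


def bm25_idf(doc_freq, doc_count):
    # idf as printed in the explanation:
    # log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5))
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


def bm25_tf_norm(freq, k1, b, field_length, avg_field_length):
    # tfNorm as printed in the explanation:
    # (freq * (k1 + 1)) / (freq + k1 * (1 - b + b * fieldLength / avgFieldLength))
    return (freq * (k1 + 1)) / (
        freq + k1 * (1 - b + b * field_length / avg_field_length)
    )


# Metadata.Description: boost 5, docFreq 6, docCount 1404
description_score = (
    5 * bm25_idf(6, 1404) * bm25_tf_norm(1, 1.2, 0.75, 2.56, 2.9529915)
)

# Metadata.Name: no boost, docFreq 16, docCount 5224
name_score = bm25_idf(16, 5224) * bm25_tf_norm(1, 1.2, 0.75, 4, 2.1161945)

# The explanation reports "max of:" over the two field scores
top_level_score = max(description_score, name_score)

print("Metadata.Description:", description_score)
print("Metadata.Name:", name_score)
print("max of:", top_level_score)
```

So the per-field arithmetic is internally consistent with the formulas shown; what I can't explain is where the differing docCount and docFreq inputs come from, and why the fields are combined with max rather than sum.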