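I'm having some trouble understanding how Elasticsearch computes weights in its implementation. As I understand it, unless you use Dismax, a document's score should be the sum of all the field weights, not the score of the highest-scoring field. Secondly, the calculation is completely different from what the documentation describes.
Looking at the query and explanation I've provided below, I have three questions:
- How can docCount differ between Name and Description for the same document?
- Why is the document scored by the maximum field weight?
- If only 5 documents in the entire index contain the search term, why is the document frequency 6?
Thanks in advance.
Query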
GET localhost_document/_search?explain=1&pretty=1&search_type=dfs_query_then_fetch
{
  "query": {
    "multi_match": {
      "query": "lhc",
      "fields": [ "Metadata.Name", "Metadata.Description^5" ]
    }
  }
}
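To make the second question concrete, here is a minimal sketch of the two combination rules being compared: a plain sum of per-field scores, and the max-plus-tie-breaker rule that the dis_max query documents. As far as I can tell from the multi_match documentation, its default type is best_fields, which combines fields the dis_max way with a tie_breaker of 0.0, but treat that as my reading rather than a confirmed answer. The function names are hypothetical; the two input scores are the per-field values reported in the explanation for this query.

```python
def combine_sum(field_scores):
    """Sum of all per-field scores (what I expected to see)."""
    return sum(field_scores)


def combine_dis_max(field_scores, tie_breaker=0.0):
    """Best field score plus tie_breaker times the sum of the other scores.

    This follows the rule documented for dis_max; tie_breaker=0.0 is an
    assumed default, in which case only the best field counts.
    """
    best = max(field_scores)
    return best + tie_breaker * (sum(field_scores) - best)


# Per-field scores taken from the explain output
# (Metadata.Description, Metadata.Name).
field_scores = [28.427635, 4.2207813]

print("sum of fields:       ", combine_sum(field_scores))
print("dis_max, tie 0.0:    ", combine_dis_max(field_scores))
print("dis_max, tie 0.3:    ", combine_dis_max(field_scores, tie_breaker=0.3))
```

With a tie_breaker of 0.0 the second rule reduces to a plain max, which is what the "max of:" node in the explanation shows; a tie_breaker of 1.0 would make it equal to the sum.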
Explanation
"_explanation": {
"value": 28.427635,
"description": "max of:",
"details": [
  {
    "value": 28.427635,
    "description": "weight(Metadata.Description:lhc in 0) [PerFieldSimilarity], result of:",
    "details": [
      {
        "value": 28.427635,
        "description": "score(doc=0,freq=1.0 = termFreq=1.0\n), product of:",
        "details": [
          {
            "value": 5,
            "description": "boost",
            "details": []
          },
          {
            "value": 5.3759904,
            "description": "idf, computed as log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)) from:",
            "details": [
              {
                "value": 6,
                "description": "docFreq",
                "details": []
              },
              {
                "value": 1404,
                "description": "docCount",
                "details": []
              }
            ]
          },
          {
            "value": 1.0575776,
            "description": "tfNorm, computed as (freq * (k1 + 1)) / (freq + k1 * (1 - b + b * fieldLength / avgFieldLength)) from:",
            "details": [
              {
                "value": 1,
                "description": "termFreq=1.0",
                "details": []
              },
              {
                "value": 1.2,
                "description": "parameter k1",
                "details": []
              },
              {
                "value": 0.75,
                "description": "parameter b",
                "details": []
              },
              {
                "value": 2.9529915,
                "description": "avgFieldLength",
                "details": []
              },
              {
                "value": 2.56,
                "description": "fieldLength",
                "details": []
              }
            ]
          }
        ]
      }
    ]
  },
  {
    "value": 4.2207813,
    "description": "weight(Metadata.Name:lhc in 0) [PerFieldSimilarity], result of:",
    "details": [
      {
        "value": 4.2207813,
        "description": "score(doc=0,freq=1.0 = termFreq=1.0\n), product of:",
        "details": [
          {
            "value": 5.7578497,
            "description": "idf, computed as log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)) from:",
            "details": [
              {
                "value": 16,
                "description": "docFreq",
                "details": []
              },
              {
                "value": 5224,
                "description": "docCount",
                "details": []
              }
            ]
          },
          {
            "value": 0.7330482,
            "description": "tfNorm, computed as (freq * (k1 + 1)) / (freq + k1 * (1 - b + b * fieldLength / avgFieldLength)) from:",
            "details": [
              {
                "value": 1,
                "description": "termFreq=1.0",
                "details": []
              },
              {
                "value": 1.2,
                "description": "parameter k1",
                "details": []
              },
              {
                "value": 0.75,
                "description": "parameter b",
                "details": []
              },
              {
                "value": 2.1161945,
                "description": "avgFieldLength",
                "details": []
              },
              {
                "value": 4,
                "description": "fieldLength",
                "details": []
              }
            ]
          }
        ]
      }
    ]
  }
]
}
}
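As a sanity check on the numbers above, here is a small sketch that recomputes each field's score from the idf and tfNorm formulas quoted in the explanation's own "description" strings. The function names are hypothetical, and I'm assuming a boost of 1 for Metadata.Name because the explanation lists no boost node for that field. The results should land close to the reported values; small differences are expected because this uses double-precision arithmetic.

```python
import math


def bm25_idf(doc_freq, doc_count):
    """idf as quoted in the explain output."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


def bm25_tf_norm(freq, field_length, avg_field_length, k1=1.2, b=0.75):
    """tfNorm as quoted in the explain output (k1 and b as reported)."""
    length_ratio = field_length / avg_field_length
    return (freq * (k1 + 1)) / (freq + k1 * (1 - b + b * length_ratio))


def field_score(boost, doc_freq, doc_count, freq, field_length, avg_field_length):
    """Product of boost, idf and tfNorm, mirroring the 'product of:' node."""
    idf = bm25_idf(doc_freq, doc_count)
    tf_norm = bm25_tf_norm(freq, field_length, avg_field_length)
    return boost * idf * tf_norm


# Inputs copied from the explanation for Metadata.Description (boost 5).
description_score = field_score(
    boost=5, doc_freq=6, doc_count=1404,
    freq=1, field_length=2.56, avg_field_length=2.9529915,
)

# Inputs copied from the explanation for Metadata.Name.
# boost=1 is an assumption: no boost node appears for this field.
name_score = field_score(
    boost=1, doc_freq=16, doc_count=5224,
    freq=1, field_length=4, avg_field_length=2.1161945,
)

print("Metadata.Description:", description_score)  # reported: 28.427635
print("Metadata.Name:       ", name_score)         # reported: 4.2207813
print("max of both:         ", max(description_score, name_score))
```

So the per-field arithmetic is at least internally consistent with the formulas shown; what I can't explain is where the docFreq and docCount inputs come from, or why the two fields are combined with a max.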