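I have a simple setup with IndexSearcher, QueryParser, and SimpleAnalyzer. Running some queries, I noticed that for queries with multiple terms the ScoreDoc[i].score that is returned differs from the score shown in the explain output for the same query. It appears to be the score from the explanation divided by the number of search terms. Is there an explanation for this behavior?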
Running search(TERM1 TERM2 TERM3)
line:term1 line:term2 line:term3
2.167882 = sum of:
  0.6812867 = weight(line:term1 in 6594) [DefaultSimilarity], result of:
    0.6812867 = score(doc=6594,freq=2.0), product of:
      0.5389907 = queryWeigh
totalHits 1
1678413725, TERM1 TERM2 TERM3, score: 0.72262734
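As a quick sanity check on the numbers in this output, here is a minimal sketch in plain Java with no Lucene dependency. The class and method names (ScoreRatioCheck, ratio) are made up for illustration. It only compares the total from the explain output with the score printed for the hit.

```java
public class ScoreRatioCheck {
    // Ratio between the total shown by explain() and the score in ScoreDoc.
    // Hypothetical helper, not part of Lucene.
    static double ratio(double explainTotal, double scoreDocScore) {
        return explainTotal / scoreDocScore;
    }

    public static void main(String[] args) {
        double explainTotal = 2.167882;    // "sum of" value from the explain output
        double scoreDocScore = 0.72262734; // score printed for the same hit
        int queryTerms = 3;                // TERM1 TERM2 TERM3

        // If the observation holds, the ratio is very close to the term count.
        System.out.printf("explain / score = %.4f%n", ratio(explainTotal, scoreDocScore));
        System.out.printf("explain / terms = %.6f%n", explainTotal / queryTerms);
    }
}
```

Within rounding, the ratio matches the number of query terms, which is what prompted the question.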
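I understand that coord() is used to penalize documents that contain only a subset of the supplied search terms. However, this document contains all of the terms. Any suggestions?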
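To make that expectation concrete, here is a small sketch of the coord factor as I understand it for DefaultSimilarity, namely the number of matched terms divided by the number of query terms. It is plain Java written for illustration (the class name CoordSketch is hypothetical), not a call into Lucene.

```java
public class CoordSketch {
    // Mirrors the overlap / maxOverlap ratio that DefaultSimilarity uses for coord.
    static float coord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }

    public static void main(String[] args) {
        // A document matching one of three terms is penalized...
        System.out.println("1 of 3 terms matched: " + coord(1, 3));
        // ...while one matching all three should get a neutral factor.
        System.out.println("3 of 3 terms matched: " + coord(3, 3));
    }
}
```

With all three terms matched the factor is 1, so coord alone should not shrink the score of this document.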
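Edit: It seems the division only happens when the query is configured to use OR rather than AND. So an OR query that matches all terms is still divided by the number of terms in the search query. I couldn't find this in the documentation, but at least it explains the difference.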
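However, applying a QueryWrapperFilter seems to change the scoring again, even though according to the documentation it should only filter results and not affect scoring.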
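More details
These two scores are results of the same query. Only for the second hit is the returned score divided again relative to its explain output.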
0.114700586 = product of:
  0.34410176 = sum of:
    0.34410176 = weight(line:term1 in 24) [DefaultSimilarity], result of:
      0.34410176 = score(doc=24,freq=1.0), product of:
        0.5389907 = queryWeight, product of:
          8.17176 = idf(docFreq=14, maxDocs=19532)
          0.065957725 = queryNorm
        0.63841873 = fieldWeight in 24, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.17176 = idf(docFreq=14, maxDocs=19532)
          0.078125 = fieldNorm(doc=24)
  0.33333334 = coord(1/3)
item_id: 1495958818, item_name: term 1 dolor sit met, score: 0.114700586
0.18352094 = product of:
  0.5505628 = sum of:
    0.5505628 = weight(line:term 1 in 6112) [DefaultSimilarity], result of:
      0.5505628 = score(doc=6112,freq=1.0), product of:
        0.5389907 = queryWeight, product of:
          8.17176 = idf(docFreq=14, maxDocs=19532)
          0.065957725 = queryNorm
        1.02147 = fieldWeight in 6112, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.17176 = idf(docFreq=14, maxDocs=19532)
          0.125 = fieldNorm(doc=6112)
  0.33333334 = coord(1/3)
item_id: 1677761523, item_name: some text term 1, score: 0.061173648
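To show where the discrepancy sits, the two explain trees can be recomputed by hand. The sketch below is plain Java with hypothetical names (ExplainTreeCheck, termScore, docScore). It assumes the structure printed in the explain output: queryWeight = idf * queryNorm, fieldWeight = tf * idf * fieldNorm, and the sum of term scores multiplied by coord. It does not call Lucene.

```java
public class ExplainTreeCheck {
    // One term's contribution as laid out in the explain tree:
    // queryWeight (idf * queryNorm) times fieldWeight (tf * idf * fieldNorm).
    static double termScore(double tf, double idf, double queryNorm, double fieldNorm) {
        double queryWeight = idf * queryNorm;
        double fieldWeight = tf * idf * fieldNorm;
        return queryWeight * fieldWeight;
    }

    // Top-level "product of": sum of matching term scores times coord(matched/total).
    static double docScore(double sumOfTermScores, int matched, int total) {
        return sumOfTermScores * ((double) matched / total);
    }

    public static void main(String[] args) {
        double idf = 8.17176;          // idf(docFreq=14, maxDocs=19532)
        double queryNorm = 0.065957725;

        double doc24 = docScore(termScore(1.0, idf, queryNorm, 0.078125), 1, 3);
        double doc6112 = docScore(termScore(1.0, idf, queryNorm, 0.125), 1, 3);

        System.out.println("doc 24   explain total: " + doc24);
        System.out.println("doc 6112 explain total: " + doc6112);
        // The score reported for doc 6112 corresponds to one more division by 3.
        System.out.println("doc 6112 total / 3:     " + doc6112 / 3);
    }
}
```

For doc 24 the recomputed total agrees with the reported score of 0.114700586. For doc 6112 the reported score of 0.061173648 only matches after dividing the explain total by 3 a second time, so the extra factor is applied outside of what explain shows.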