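I have a simple setup with an IndexSearcher, a QueryParser, and a SimpleAnalyzer. Running a few queries, I noticed that for queries with multiple terms, the ScoreDoc[i].score that is returned differs from the score shown in the explain output for the same query. Apparently it is the score shown in the explanation divided by the number of search terms. Is there an explanation for this behavior?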

Running search(TERM1 TERM2 TERM3)
line:term1 line:term2 line:term3
2.167882 = sum of:
  0.6812867 = weight(line:term1 in 6594) [DefaultSimilarity], result of:
    0.6812867 = score(doc=6594,freq=2.0), product of:
      0.5389907 = queryWeigh

totalHits 1
1678413725, TERM1 TERM2 TERM3, score: 0.72262734

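I understand that the coord() factor is used to penalize documents that contain only a subset of the supplied search terms. However, this document contains all of the terms. Any suggestions?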
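To make the mismatch concrete, here is a minimal sketch of the coord arithmetic. It assumes DefaultSimilarity computes coord as overlap / maxOverlap (the fraction of query clauses that matched); the class name `CoordSketch` is made up for illustration and nothing here calls Lucene. With all three terms matching, the factor should be 1.0, yet the reported score relates to the explained score as if coord(1/3) had been applied.

```java
public class CoordSketch {
    // Assumption: mirrors what DefaultSimilarity.coord does, i.e. the
    // fraction of query clauses that matched the document.
    public static float coord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }

    public static void main(String[] args) {
        float explained = 2.167882f;   // top-level value from the explain output above
        float reported = 0.72262734f;  // score printed for the same hit

        // All three terms match, so no penalty would be expected.
        System.out.println("coord(3/3) = " + coord(3, 3));
        // The observed ratio is what a single matching clause would give.
        System.out.println("coord(1/3) = " + coord(1, 3));
        System.out.println("reported / explained = " + (reported / explained));
    }
}
```

The ratio printed on the last line comes out at roughly one third, which is the "divided by the number of search terms" effect described above.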


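Edit: It seems the division only happens when the query is configured to use OR rather than AND. So with an OR query, a document that matches all terms still has its score divided by the number of terms in the search query. I could not find this in the documentation, but at least it explains the difference.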

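However, applying a QueryWrapperFilter seems to change the scoring again, even though according to the documentation it should only filter the results and not affect scoring.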


More details

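Both of the following scores come from the same query. Only for the second one is the returned score divided relative to the explain output.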

0.114700586 = product of:
  0.34410176 = sum of:
    0.34410176 = weight(line:term1 in 24) [DefaultSimilarity], result of:
      0.34410176 = score(doc=24,freq=1.0), product of:
        0.5389907 = queryWeight, product of:
          8.17176 = idf(docFreq=14, maxDocs=19532)
          0.065957725 = queryNorm
        0.63841873 = fieldWeight in 24, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.17176 = idf(docFreq=14, maxDocs=19532)
          0.078125 = fieldNorm(doc=24)
  0.33333334 = coord(1/3)

item_id: 1495958818, item_name: term 1 dolor sit met, score: 0.114700586


0.18352094 = product of:
  0.5505628 = sum of:
    0.5505628 = weight(line:term 1 in 6112) [DefaultSimilarity], result of:
      0.5505628 = score(doc=6112,freq=1.0), product of:
        0.5389907 = queryWeight, product of:
          8.17176 = idf(docFreq=14, maxDocs=19532)
          0.065957725 = queryNorm
        1.02147 = fieldWeight in 6112, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          8.17176 = idf(docFreq=14, maxDocs=19532)
          0.125 = fieldNorm(doc=6112)
  0.33333334 = coord(1/3)

item_id: 1677761523, item_name: some text term 1, score: 0.061173648
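The two explain trees can be re-derived by hand. The sketch below is plain arithmetic on the numbers pasted above, not Lucene code; the class and method names (`ExplainArithmetic`, `termScore`, `applyCoord`) are hypothetical. It multiplies queryWeight (idf * queryNorm) by fieldWeight (tf * idf * fieldNorm), applies coord(1/3) once as the explain output does, and then applies it a second time to show that this reproduces the score reported for doc 6112.

```java
public class ExplainArithmetic {
    // Per-term score as laid out in the explain tree:
    // queryWeight (idf * queryNorm) times fieldWeight (tf * idf * fieldNorm).
    public static double termScore(double idf, double queryNorm, double tf, double fieldNorm) {
        double queryWeight = idf * queryNorm;
        double fieldWeight = tf * idf * fieldNorm;
        return queryWeight * fieldWeight;
    }

    // Multiply a summed score by the coord factor overlap / maxOverlap.
    public static double applyCoord(double sum, int overlap, int maxOverlap) {
        return sum * overlap / maxOverlap;
    }

    public static void main(String[] args) {
        double idf = 8.17176;
        double queryNorm = 0.065957725;

        // doc 24: explain value and reported score agree (about 0.1147).
        double doc24 = applyCoord(termScore(idf, queryNorm, 1.0, 0.078125), 1, 3);
        System.out.println("doc 24, coord applied once:    " + doc24);

        // doc 6112: explain shows about 0.1835, reported score is about 0.0612.
        double doc6112 = applyCoord(termScore(idf, queryNorm, 1.0, 0.125), 1, 3);
        System.out.println("doc 6112, coord applied once:  " + doc6112);
        System.out.println("doc 6112, coord applied twice: " + applyCoord(doc6112, 1, 3));
    }
}
```

For doc 24 a single coord(1/3) matches both the explanation and the returned score. For doc 6112 the explanation matches a single application, while the returned score only matches if the factor is applied twice.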