
I am using Lucene 3.6 and want to get the score of each term in a document field, for later use at search time. To build the index, I create documents like this:

Document doc = new Document();
doc.add(new Field("description", entry.getDescription(), Field.Store.NO, Field.Index.ANALYZED, Field.TermVector.WITH_POSITIONS_OFFSETS));
writer.addDocument(doc);
writer.close(true);

For example, a document contains the term "football":

...
1.623904 = (MATCH) fieldWeight(description:football in 1775), product of:
1.0 = tf(termFreq(description:football )=1)
8.660821 = idf(docFreq=5, maxDocs=12741)
0.1875 = fieldNorm(field=description, doc=1775)
...
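
To make the three factors in that explain output concrete, here is a minimal sketch that recomputes the fieldWeight from them. It assumes the default scoring formulas of Lucene 3.x `DefaultSimilarity` (tf = sqrt(freq), idf = 1 + ln(numDocs / (docFreq + 1))); the class and method names are made up for illustration and are not part of Lucene. The fieldNorm value is simply copied from the output above.

```java
// Hypothetical helper, not a Lucene class: recomputes fieldWeight from
// tf, idf and fieldNorm, assuming DefaultSimilarity's formulas.
class FieldWeightSketch {

    // Assumed DefaultSimilarity tf: square root of the term frequency.
    static float tf(int freq) {
        return (float) Math.sqrt(freq);
    }

    // Assumed DefaultSimilarity idf: 1 + ln(numDocs / (docFreq + 1)).
    static float idf(int docFreq, int numDocs) {
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }

    // fieldWeight is the product of the three factors in the explain output.
    static float fieldWeight(int freq, int docFreq, int numDocs, float fieldNorm) {
        return tf(freq) * idf(docFreq, numDocs) * fieldNorm;
    }

    public static void main(String[] args) {
        // Values taken from the explain output: termFreq=1, docFreq=5,
        // maxDocs=12741, fieldNorm=0.1875.
        System.out.println(fieldWeight(1, 5, 12741, 0.1875f));
    }
}
```

With the numbers above this gives roughly 1.6239, which matches the explain output, so tf and idf can be reproduced by hand and fieldNorm is the only factor that has to be read from the index.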

I am using this code to get the tf and idf values:

TermFreqVector freqV = indexReader.getTermFreqVector(docId, "description");
for (int j = 0; j < freqV.getTerms().length; j++) {
    String term = freqV.getTerms()[j];
    int freq = freqV.getTermFrequencies()[j];
    float idf = similarity.idfExplain(new Term("description", term), searcher).getIdf();
}

But I don't understand how to get the fieldNorm at search time. Can anyone help with this?

Thanks.
