0

Is there a way with Lucene 4.4 to determine exactly which terms satisfied a query? I need to highlight only terms that caused the document to be returned, not the same term elsewhere in the document. For example, given the document:

We are going to visit the White House today. I hear it is painted white.

and the phrase query "white house", I want to highlight these terms:

We are going to visit the <b>White</b> <b>House</b> today. I hear it is painted white.

I've been using PostingsHighlighter, but it will highlight the word "white" in the second sentence as well. I don't want that because the single term "white" does not satisfy the phrase query.

It looks like the only information that comes back from a search is the document IDs and scores. I don't really care about scores for the purpose of relevancy ranking, because I'll be working with all of the documents returned. Is there something I could do with custom scoring that would preserve the information I need? Or is there a better approach that I'm missing?


1 Answer

1

This appears to be the intended behavior of PostingsHighlighter (see this discussion). You might consider using Highlighter instead.
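To make the distinction concrete, here is a minimal plain-Java sketch of position-aware phrase highlighting. It is not Lucene code and not how either highlighter is implemented; `PhraseHighlightSketch` and `highlightPhrase` are hypothetical names, and the tokenizer (lowercased runs of letters and digits) is an assumption standing in for whatever analyzer you actually use. It only marks a word when it is part of a complete, in-order occurrence of the phrase:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PhraseHighlightSketch {

    // Hypothetical helper, not a Lucene API: wraps in <b>...</b> only the
    // words that belong to a complete, in-order occurrence of the phrase.
    public static String highlightPhrase(String text, String... phraseTerms) {
        if (phraseTerms.length == 0) {
            return text;
        }

        // Assumed tokenizer: runs of letters/digits, lowercased for matching.
        // Character offsets are kept so the original text can be rebuilt.
        Matcher m = Pattern.compile("[\\p{L}\\p{N}]+").matcher(text);
        List<String> tokens = new ArrayList<>();
        List<int[]> offsets = new ArrayList<>();
        while (m.find()) {
            tokens.add(m.group().toLowerCase(Locale.ROOT));
            offsets.add(new int[] {m.start(), m.end()});
        }

        // Mark a token only if the whole phrase matches starting at some position.
        boolean[] hit = new boolean[tokens.size()];
        for (int i = 0; i + phraseTerms.length <= tokens.size(); i++) {
            boolean match = true;
            for (int j = 0; j < phraseTerms.length; j++) {
                String wanted = phraseTerms[j].toLowerCase(Locale.ROOT);
                if (!tokens.get(i + j).equals(wanted)) {
                    match = false;
                    break;
                }
            }
            if (match) {
                for (int j = 0; j < phraseTerms.length; j++) {
                    hit[i + j] = true;
                }
            }
        }

        // Rebuild the text, wrapping marked tokens in bold tags.
        StringBuilder out = new StringBuilder();
        int last = 0;
        for (int i = 0; i < tokens.size(); i++) {
            if (!hit[i]) {
                continue;
            }
            int[] o = offsets.get(i);
            out.append(text, last, o[0])
               .append("<b>").append(text, o[0], o[1]).append("</b>");
            last = o[1];
        }
        out.append(text, last, text.length());
        return out.toString();
    }

    public static void main(String[] args) {
        String doc = "We are going to visit the White House today. "
                + "I hear it is painted white.";
        System.out.println(highlightPhrase(doc, "white", "house"));
    }
}
```

Run against the document from the question, this leaves the standalone "white" in the second sentence untouched, which is the behavior being asked for. A real solution would rely on the index's term positions and your analyzer rather than re-tokenizing the text by hand.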

Answered 2013-10-04T16:14:34.517