2

What is the simplest way to query Solr for documents that contain text similar to a (longish) passage? This is similar to what ElasticSearch match queries do, or what probabilistic search engines like Indri do by default. It is something between an AND and an OR query: none of the terms is required, but you get documents that contain many of the terms. You can also just pass a passage of raw text to the engine, and it returns documents with high term overlap with the passage, without having to parse or tokenize the text in the client. The best option I can see in the Solr query reference is to tokenize the query text myself, insert an OR between each pair of terms, and return the top N results. Is there a more concise way of doing it with Solr?

4

1 Answer

3

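The answer above is correct. You can find documents in the index that are similar to another document, similar to a given external URL, or similar to some given text. You can choose which fields to target, along with various other parameters. Here is the official Solr Reference Guide page for MoreLikeThis (MLT): https://cwiki.apache.org/confluence/display/solr/MoreLikeThis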
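As a rough illustration of the "similar to some given text" case, here is a minimal Python sketch that builds a request URL for a MoreLikeThis handler. It assumes a handler is registered at `/mlt` in `solrconfig.xml`, and that the core is called `collection1` and the searchable field `text`; those names, and the helper `build_mlt_url` itself, are hypothetical. Depending on the Solr version, content streams may need to be enabled before `stream.body` is accepted.

```python
from urllib.parse import urlencode


def build_mlt_url(base_url, passage, fields, rows=10):
    """Build a MoreLikeThis request URL for a raw text passage.

    base_url: URL of the Solr core (assumed; adjust to your setup).
    passage:  raw text to find similar documents for.
    fields:   field names MLT should draw terms from.
    """
    params = {
        "stream.body": passage,      # the raw passage, sent as a content stream
        "mlt.fl": ",".join(fields),  # fields used to compute similarity
        "mlt.mintf": 1,              # min term frequency in the passage
        "mlt.mindf": 1,              # min document frequency in the index
        "rows": rows,                # top N results
        "wt": "json",
    }
    # Assumes the MLT handler is mounted at /mlt (hypothetical path).
    return base_url.rstrip("/") + "/mlt?" + urlencode(params)


if __name__ == "__main__":
    url = build_mlt_url(
        "http://localhost:8983/solr/collection1",
        "a longish passage of raw text",
        ["text"],
    )
    print(url)
```

Fetching the printed URL with any HTTP client should return the matching documents as JSON, provided the handler is configured as assumed. Solr does the tokenizing and term selection on the server, so the client never has to build an OR query by hand.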

Answered 2013-10-08T20:51:54.440