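According to SOLR-1938, if you have an ElisionFilter before the WordDelimiterFilter, then l'avion produces only a single token: avion. If the ElisionFilter is absent, then depending on your WordDelimiterFilter settings, it may produce more than one token, for example l, avion, lavion. Because avion is among the tokens that WordDelimiterFilter generates, it can look as though the ElisionFilter's behavior were already included.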
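To make the difference between the two filter chains concrete, here is a minimal Python sketch. It is only a toy simulation: `elision_filter`, `word_delimiter_filter`, and the article list are hypothetical stand-ins written for illustration, not the real Lucene/Solr implementations, and the word-delimiter part assumes settings that emit both the word parts and the catenated form.

```python
import re

# Hypothetical, simplified stand-ins for the Lucene filters (assumption:
# these only mimic the behavior described above, not the real code).
FRENCH_ARTICLES = {"l", "d", "j", "m", "t", "s", "n", "c", "qu"}  # illustrative subset


def elision_filter(tokens, articles=FRENCH_ARTICLES):
    """Strip a leading elided article such as l' from each token."""
    out = []
    for tok in tokens:
        head, sep, tail = tok.partition("'")
        if sep and tail and head.lower() in articles:
            out.append(tail)
        else:
            out.append(tok)
    return out


def word_delimiter_filter(tokens, catenate_words=True):
    """Split tokens on non-alphanumeric characters; optionally add the joined form."""
    out = []
    for tok in tokens:
        parts = [p for p in re.split(r"[^0-9A-Za-z]+", tok) if p]
        out.extend(parts)
        if catenate_words and len(parts) > 1:
            out.append("".join(parts))
    return out


with_elision = word_delimiter_filter(elision_filter(["l'avion"]))
without_elision = word_delimiter_filter(["l'avion"])
print(with_elision)     # ['avion']
print(without_elision)  # ['l', 'avion', 'lavion']
```

With the elision step first, the apostrophe is gone before the word-delimiter step sees the token, so there is nothing left to split.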
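I guess the comment about slow phrase queries means that if l'avion is searched and the ElisionFilter is absent, the search runs against multiple tokens instead of one.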
Update: this article (http://www.hathitrust.org/blogs/large-scale-search/tuning-search-performance) points out the problem. It says: "What we discovered is that the word “l’art” was being searched as a phrase query “l art”. Phrase queries are much slower than Boolean queries because the search engine has to read the positions index for the words in the phrase into memory and because there is more processing involved."
So I guess the problem is searching for "l'avion" in double quotes.
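As a rough illustration of why the quoted article calls phrase queries more expensive, the sketch below builds a tiny positional index in Python. It is an assumption-laden toy, not how Lucene stores or scores anything: `build_index`, `boolean_and`, and `phrase` are hypothetical helpers, and the two sample documents are made up. The point is only that the Boolean match looks at document ids, while the phrase match also has to read and compare position lists.

```python
from collections import defaultdict


def build_index(docs):
    """Map term -> {doc_id: [positions]} for whitespace-tokenized documents."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(text.split()):
            index[term].setdefault(doc_id, []).append(pos)
    return index


def boolean_and(index, terms):
    """Docs containing every term; only document ids are consulted."""
    doc_sets = [set(index.get(t, {})) for t in terms]
    return set.intersection(*doc_sets) if doc_sets else set()


def phrase(index, terms):
    """Docs where the terms occur at consecutive positions; position lists are read."""
    hits = set()
    for doc_id in boolean_and(index, terms):
        starts = index[terms[0]][doc_id]
        if any(
            all(s + i in index[t][doc_id] for i, t in enumerate(terms))
            for s in starts
        ):
            hits.add(doc_id)
    return hits


# Made-up sample documents, already split the way "l'art" would be without elision.
docs = {1: "l art moderne", 2: "art et l histoire"}
idx = build_index(docs)
print(boolean_and(idx, ["l", "art"]))  # {1, 2}
print(phrase(idx, ["l", "art"]))       # {1}
```

If the elision step had already reduced the query to a single term, neither the intersection nor the position check would be needed.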