我的 Solr 索引中有下一个内容:
west indian cherry
in 字段类型text_en
(字段定义见下文)。
当我cherr*
找到匹配项时。
还要在文档中搜索cherri*
匹配词。但
搜索cherry*
不匹配。
我对此表示怀疑PorterStemFilterFactory
,但我不明白为什么(查询分析器与索引分析器相同)。
样本查询
/solr/select?defType=edismax&q=cherry*
solrconfig.xml
...
<fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.EnglishPossessiveFilterFactory"/>
<filter class="solr.PorterStemFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.EnglishPossessiveFilterFactory"/>
<filter class="solr.PorterStemFilterFactory"/>
</analyzer>
</fieldType>
...
现场分析
指数
org.apache.solr.analysis.StandardTokenizerFactory: cherry
org.apache.solr.analysis.LowerCaseFilterFactory: cherry
org.apache.solr.analysis.EnglishPossessiveFilterFactory: cherry
org.apache.solr.analysis.PorterStemFilterFactory: cherri <-- note the change from cherry to cherri
询问
org.apache.solr.analysis.StandardTokenizerFactory: cherry
org.apache.solr.analysis.LowerCaseFilterFactory: cherry
org.apache.solr.analysis.EnglishPossessiveFilterFactory: cherry
org.apache.solr.analysis.PorterStemFilterFactory: cherri