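I am using Solr spellcheck for Russian. When the query is typed in Cyrillic characters, everything works, but when it is typed in Latin characters, spellcheck does not work.
I want spellcheck to work both when the input is typed in Cyrillic and when it is typed in Latin characters, and in both cases to return the corrected text in Cyrillic.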
For example, when you type:
телевидениеее or televidenieee
It should correct to:
телевидение
schema.xml:
<fieldType name="spell_text" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[,.;:]" replacement=" "/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="'s" replacement=""/>
    <filter class="solr.ShingleFilterFactory" maxShingleSize="2" outputUnigrams="true"/>
    <filter class="solr.LengthFilterFactory" min="3" max="256" />
  </analyzer>
</fieldType>
solrconfig.xml:
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">spellcheck</str>
    <str name="classname">solr.IndexBasedSpellChecker</str>
    <str name="buildOnCommit">true</str>
    <str name="buildOnOptimize">true</str>
    <str name="spellcheckIndexDir">./spellchecker</str>
    <str name="accuracy">0.75</str>
  </lst>
  <lst name="spellchecker">
    <str name="name">wordbreak</str>
    <str name="field">spellcheck</str>
    <str name="classname">solr.WordBreakSolrSpellChecker</str>
    <str name="combineWords">false</str>
    <str name="breakWords">true</str>
    <int name="maxChanges">1</int>
  </lst>
</searchComponent>
Thanks for the help.