solr - Solr - Unable to query special characters or digits

Question

In the Solr field PackageTag

<field name="PackageTag" type="text_en_splitting" indexed="true" stored="true" required="false" multiValued="true"/>

I have the following value:

"playing @@*"

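Now when I search for "playing", I get it in my results.
But when I search for @@*, I don't. It appears to be dropped by the word delimiter.

Is there a way to let users search for these special characters while still using word delimiting?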

score 1 · Accepted Answer

There are two issues here:

First off, you must create your own fieldType in Solr and configure it NOT to use "@" and "*" as stop words.

In schema.xml, do something like this:

<types>
    <fieldType name="myTextFieldType" class="solr.TextField" positionIncrementGap="100">
        <analyzer type="index">
            <tokenizer class="solr.StandardTokenizerFactory" />
            <filter class="solr.StopFilterFactory" ignoreCase="true"
                words="stopwords.txt" enablePositionIncrements="true" />
        </analyzer>
        <analyzer type="query">
            <tokenizer class="solr.StandardTokenizerFactory" />
            <filter class="solr.StopFilterFactory" ignoreCase="true"
                words="stopwords.txt" enablePositionIncrements="true" />
        </analyzer>
    </fieldType>
</types>

You must then use that fieldType for the "PackageTag" field:

<field name="PackageTag" type="myTextFieldType" indexed="true" stored="true" required="false" multiValued="true"/>

Then, in the "conf" dir (the same dir where schema.xml is located), create or edit the stopwords.txt file and add "@" and "*" to it. Just put them in there, each character on one line:

@

*

Now, since the "*" character is also a special character for Lucene queries (wildcard), you need to escape it in your queries. You can escape "*" by replacing it with "\*". Something like this:

PackageTag:bla\*

to search for fields containing "bla*".

score 0 · Accepted Answer

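I don't remember the list of Lucene special characters, but have you tried escaping the characters by putting a \ (backslash) in front of them?

If that doesn't work, you may want to look at the Analyzer you use to index your field. StandardAnalyzer may do something funny with your special characters, so you could consider using another analyzer or rolling your own.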
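
As a rough illustration of the "another analyzer" route, the sketch below defines a field type that splits only on whitespace, so a token such as "@@*" is indexed as-is. The type name text_ws_symbols is a made-up placeholder, and the trade-off is that this chain does no word splitting at all, so it may fit a separate copy of the field better than PackageTag itself.

```xml
<!-- Hypothetical field type: the name text_ws_symbols is an assumption, not from the question -->
<fieldType name="text_ws_symbols" class="solr.TextField" positionIncrementGap="100">
    <!-- A single analyzer without a type attribute is used at both index and query time -->
    <analyzer>
        <!-- Splits on whitespace only; punctuation such as "@" and "*" stays inside the token -->
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <!-- Lowercasing keeps matching case-insensitive -->
        <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
</fieldType>
```

A query against such a field would still need the "*" escaped, and existing documents would have to be reindexed after the schema change.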

score 0 · Accepted Answer

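You must add the word delimiters to the protwords.txt file, and then apply a filter that keeps the original words at both index and query time (for example, solr.WordDelimiterFilterFactory with the protected="protwords.txt" parameter).

That way they will be tokenized the way you need and will not be dropped at query time.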
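
A sketch of what that could look like, assuming the field type tokenizes on whitespace before the word delimiter filter runs. The type name text_protected_symbols is hypothetical, as are the generateWordParts and generateNumberParts settings; only the filter class and the protected parameter come from the answer above.

```xml
<!-- Hypothetical field type: the name text_protected_symbols is an assumption -->
<fieldType name="text_protected_symbols" class="solr.TextField" positionIncrementGap="100">
    <!-- One analyzer for both index and query time, so both sides see the same tokens -->
    <analyzer>
        <!-- Split on whitespace only, so "@@*" reaches the filter as one token -->
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <!-- Tokens listed in protwords.txt are passed through without splitting -->
        <filter class="solr.WordDelimiterFilterFactory" protected="protwords.txt"
                generateWordParts="1" generateNumberParts="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
</fieldType>
```

Here protwords.txt would contain a line with @@* so that the token survives, while other tokens are still split on word delimiters. Documents indexed before the change need to be reindexed for it to take effect.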
