0

In the Solr field PackageTag

<field name="PackageTag" type="text_en_splitting" indexed="true" stored="true" required="false" multiValued="true"/>

I have the following value

"playing @@*"

Now when I search for "play", I get it in my results.
But when I search for @@*, I get nothing; it is dropped by the word delimiter.

Is there a way to let users search for these special characters while still using the word delimiter?

4

3 Answers

1

There are two issues here:

  • First, you must create your own fieldType in Solr and configure it to NOT use "@" and "*" as stop words:

In schema.xml, do something like this:

<types>
    <fieldType name="myTextFieldType" class="solr.TextField" positionIncrementGap="100">
        <analyzer type="index">
            <tokenizer class="solr.StandardTokenizerFactory" />
            <filter class="solr.StopFilterFactory" ignoreCase="true"
                    words="stopwords.txt" enablePositionIncrements="true" />
        </analyzer>
        <analyzer type="query">
            <tokenizer class="solr.StandardTokenizerFactory" />
            <filter class="solr.StopFilterFactory" ignoreCase="true"
                    words="stopwords.txt" enablePositionIncrements="true" />
        </analyzer>
    </fieldType>
</types>

You must then use that fieldType for the "PackageTag" field:

<field name="PackageTag" type="myTextFieldType" indexed="true" stored="true" required="false" multiValued="true"/>

  • Then, in the "conf" dir (the same dir where schema.xml is located), create or edit the stopwords.txt file and add "@" and "*" to it, one character per line:

    @

    *

Now, since the "*" character is also a special character for Lucene queries (wildcard), you need to escape it in your queries. You can escape "*" by replacing it with "\*". Something like this:

PackageTag:bla\*

to search for fields containing "bla*".

answered 2013-10-16T15:08:02.353
0

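I don't remember the list of Lucene special characters, but have you tried escaping the character by putting a \ (backslash) in front of it?

If that doesn't work, you may want to look at the Analyzer you use to index your field. StandardAnalyzer may be doing something odd with your special characters, so you could consider using another analyzer or rolling your own.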
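
As a rough sketch of what "another analyzer" could look like in schema.xml: the standard tokenizer generally discards punctuation-only input such as @@*, whereas a whitespace tokenizer keeps it as a token. The field type name text_ws_keep_symbols is made up for illustration, and the filter chain is an assumption rather than the asker's actual configuration.

```xml
<!-- Hypothetical field type; the name and filter chain are assumptions -->
<fieldType name="text_ws_keep_symbols" class="solr.TextField" positionIncrementGap="100">
    <!-- A single analyzer element with no type attribute is used for both indexing and querying -->
    <analyzer>
        <!-- Splits on whitespace only, so "playing @@*" yields the tokens "playing" and "@@*" -->
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <!-- Lowercases tokens; symbol-only tokens pass through unchanged -->
        <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
</fieldType>
```

Note that this minimal chain does no stemming or word splitting, so "play" would no longer match "playing" unless further filters are added. The * in a query would still need to be escaped as \*.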

answered 2013-10-12T14:01:01.290
0

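You have to add the tokens made of word-delimiter characters to the protwords.txt file, and then apply a filter that keeps protected words intact at both index and query time (for example, solr.WordDelimiterFilterFactory with the protected="protwords.txt" parameter).

This way they will be tokenized the way you need and will not be dropped at query time.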
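
A minimal sketch of that setup, assuming protwords.txt sits in the conf directory and contains the single line @@*. The field type name text_protected_delims and the particular option values are illustrative assumptions, not taken from the asker's schema.

```xml
<!-- Hypothetical field type; the name and option values are assumptions -->
<fieldType name="text_protected_delims" class="solr.TextField" positionIncrementGap="100">
    <!-- No type attribute: the same chain runs at index time and at query time -->
    <analyzer>
        <!-- Whitespace tokenizer so that "@@*" reaches the filter as one token -->
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <!-- Tokens listed in protwords.txt are passed through without being split or dropped -->
        <filter class="solr.WordDelimiterFilterFactory"
                generateWordParts="1" generateNumberParts="1"
                catenateWords="1" catenateNumbers="1" catenateAll="0"
                splitOnCaseChange="1"
                protected="protwords.txt"/>
        <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
</fieldType>
```

Protected entries are matched against whole tokens, so only an exact @@* token is preserved; other combinations of these characters would each need their own line in protwords.txt. After changing the analysis chain, the existing documents have to be reindexed for the new tokens to appear in the index.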

answered 2013-10-14T07:24:52.940