I have a synonym file, used at index time, that contains this equivalence:

uc, university of california

I then looked at how indexing "uc berkeley" would look on analysis.jsp. I was surprised:

org.apache.solr.analysis.SynonymFilterFactory {synonyms=companysyns.txt, expand=true, ignoreCase=true, luceneMatchVersion=LUCENE_36}
position    1               2               3
term text   university      berkeley        california
            uc              of
type        SYNONYM         word            SYNONYM
            SYNONYM         SYNONYM
startOffset 0               3               3
            0               3
endOffset   2               11              11
            2               11

Note that "berkeley" appears in between "university" and "california". This has meant that, when I search for "university of california berkeley", I don't get a match. But "university berkeley california" works!
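To make the mismatch concrete, here is a minimal sketch in plain Python (not Solr or Lucene code) that replays the positions from the analysis output above. It assumes the search runs as an exact phrase query with no slop and that no synonym expansion happens at query time; `phrase_matches` is a hypothetical helper written only for this illustration.

```python
# Tokens as (position, term) pairs, copied from the analysis.jsp output above.
indexed = [
    (1, "university"), (1, "uc"),
    (2, "berkeley"), (2, "of"),
    (3, "california"),
]

def phrase_matches(indexed_tokens, phrase):
    """Return True if the phrase terms occur at consecutive positions."""
    positions = {}
    for pos, term in indexed_tokens:
        positions.setdefault(term, set()).add(pos)
    terms = phrase.split()
    # Try every position of the first term as a possible starting point.
    for start in positions.get(terms[0], ()):
        if all(start + i in positions.get(t, ()) for i, t in enumerate(terms)):
            return True
    return False

print(phrase_matches(indexed, "university of california berkeley"))  # False
print(phrase_matches(indexed, "university berkeley california"))     # True
print(phrase_matches(indexed, "uc berkeley"))                        # True
```

The first query fails because it needs "berkeley" at position 4, while the index has it at position 2, sharing a slot with "of".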

How can I make sure "university of california berkeley" works properly?

Thanks!

2 Answers

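I am facing a similar problem: the highlighted response highlights the wrong words. I am using Solr 3.6.

In my use case, synonyms are configured on the index side, with expand=true.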

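For example, suppose I have the following in synonyms.txt:

dns, domain name system

I indexed something like "A sample dns entry that works". When I search for "name" (without quotes), the highlighted response I get is "A sample dns entry that works". As you can see, the wrong term is highlighted.
Also, searching for "system" results in "A sample dns entry that works".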
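The character offsets in the question's analysis output suggest why the wrong word gets marked: the injected synonym tokens carry the offsets of a neighbouring word. Below is a rough sketch in plain Python, assuming a highlighter that simply wraps the matched token's character span. `highlight` is a hypothetical helper, not Solr's highlighter, and the token data is copied from the "uc berkeley" example rather than from my own index.

```python
text = "uc berkeley"

# (term, start_offset, end_offset), copied from the analysis.jsp output in the question.
tokens = [
    ("university", 0, 2), ("uc", 0, 2),
    ("berkeley", 3, 11), ("of", 3, 11),
    ("california", 3, 11),
]

def highlight(text, tokens, query_term):
    """Wrap the character span of the first token matching query_term in <em> tags."""
    for term, start, end in tokens:
        if term == query_term:
            return text[:start] + "<em>" + text[start:end] + "</em>" + text[end:]
    return text  # no matching token, nothing to highlight

print(highlight(text, tokens, "california"))  # uc <em>berkeley</em>
print(highlight(text, tokens, "university"))  # <em>uc</em> berkeley
```

A match on "california" ends up marking "berkeley" because both tokens span offsets 3 to 11.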

answered 2012-04-19T18:03:00.993

This looks like a known issue. A fix has been mentioned (setting luceneMatchVersion to LUCENE_33). I'm not sure whether it will work for you. Let's hope they fix it soon.

answered 2012-04-27T17:09:41.100