我正在尝试使用 solr 的 langid UpdateRequestProcessor。这是配置:
<updateRequestProcessorChain name="languages">
<processor class="solr.LangDetectLanguageIdentifierUpdateProcessorFactory">
<lst name="invariants">
<str name="langid.fl">focus, expertise, platforms, partners, participation, additional</str>
<str name="langid.whitelist">en,fr</str>
<str name="langid.fallback">en</str>
<str name="langid.langField">detectedlang</str>
<bool name="langid.map">true</bool>
<bool name="langid.map.keepOrig">false</bool>
</lst>
</processor>
<processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
我的字段如下所示:
<fields>
<field name="_root_" type="string" indexed="true" stored="false"/>
<field name="_version_" type="long" indexed="true" stored="true" multiValued="false"/>
<field name="id" type="string" indexed="true" stored="true" required="true" />
<!-- raw fields from sql db -->
<field name="expertise_id" type="int" indexed="true" stored="true" />
<field name="person_id" type="int" indexed="true" stored="true" />
<field name="mod_date" type="date" indexed="true" stored="true" />
<field name="lang" type="string" indexed="true" stored="true" />
<field name="focus" type="text_general" indexed="true" stored="true" />
<field name="expertise" type="text_general" indexed="true" stored="true" />
<field name="platforms" type="text_general" indexed="true" stored="true" />
<field name="partners" type="text_general" indexed="true" stored="true" />
<field name="participation" type="text_general" indexed="true" stored="true" />
<field name="additional" type="text_general" indexed="true" stored="true" />
<field name="tag" type="text_general" termVectors="true" multiValued="true" />
<field name="facet_tag" type="string" stored="false" indexed="false" docValues="true" multiValued="true" default=""/>
<!-- language detected by solr -->
<field name="detectedlang" type="string" indexed="true" stored="true" />
<!-- defined locale fields -->
<dynamicField name="*_en" type="text_en" indexed="true" stored="true" />
<dynamicField name="*_fr" type="text_fr" indexed="true" stored="true" />
<copyField source="tag" target="facet_tag"/>
</fields>
当我运行更新或数据导入时,我知道使用了“语言”更新链,因为focus
它已映射到focus_en
并设置了检测语言。但是,没有映射中的其他字段。langid.fl
为什么?
更新查询示例:
{
"additional": "here is some other information about me.",
"expertise_id": "10000",
"id": "foo_10000",
"focus": "this is my new focus. It is very exciting. When I am done I expect to be super experienced."
}
这是查询的结果expertise_id=10000
。请注意,additional
尚未移至additional_en
:
"response":{"numFound":1,"start":0,"docs":[
{
"additional":"here is some other information about me.",
"expertise_id":10000,
"id":"foo_10000",
"detectedlang":"en",
"focus_en":"this is my new focus. It is very exciting. When I am done I expect to be super experienced.",
"_version_":1447088846110982144}]
}