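While exploring the example for indexing Wikipedia data in Solr, how can we get the expected result (i.e. a response with the same nested structure as the imported data)?
Is there a way to achieve this through configuration rather than a group query? My data has many nested inner tags.
I looked into the XSLT result transformation, but I need a JSON response.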
Imported file:
<page>
<title>AccessibleComputing</title>
<ns>0</ns>
<id>10</id>
<redirect title="Computer accessibility" />
<revision>
<id>381202555</id>
<parentid>381200179</parentid>
<timestamp>2010-08-26T22:38:36Z</timestamp>
<contributor>
<username>OlEnglish</username>
<id>7181920</id>
</contributor>
</revision>
</page>
solrConfig.xml:
<dataConfig>
<dataSource type="FileDataSource" encoding="UTF-8" />
<document>
<entity name="page"
processor="XPathEntityProcessor"
stream="true"
forEach="/mediawiki/page/"
url="data/enwiki-20130102-pages-articles.xml"
transformer="RegexTransformer,DateFormatTransformer"
>
<field column="id" xpath="/mediawiki/page/id" />
<field column="title" xpath="/mediawiki/page/title" />
<field column="revision" xpath="/mediawiki/page/revision/id" />
<field column="user" xpath="/mediawiki/page/revision/contributor/username" />
<field column="userId" xpath="/mediawiki/page/revision/contributor/id" />
<field column="text" xpath="/mediawiki/page/revision/text" />
<field column="timestamp" xpath="/mediawiki/page/revision/timestamp" dateTimeFormat="yyyy-MM-dd'T'hh:mm:ss'Z'" />
<field column="$skipDoc" regex="^#REDIRECT .*" replaceWith="true" sourceColName="text"/>
</entity>
</document>
</dataConfig>
Response from the Solr query:
"response": {
"numFound": 1,
"start": 0,
"docs": [
{
"id": "10",
"timestamp": "2010-08-26T17:08:36Z",
"revision": 381202555,
"titleText": "AccessibleComputing",
"userId": 7181920,
"user": "OlEnglish"
}
]
}
Expected response:
"response": {
"numFound": 1,
"start": 0,
"docs": [
{
"id": "10",
"timestamp": "2010-08-26T17:08:36Z",
"revision": 381202555,
"titleText": "AccessibleComputing",
"contributor": [{
"userId": 7181920,
"user": "OlEnglish"
}]
}
]
}
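To make the difference between the two responses concrete, the reshaping I am after can be sketched as client-side post-processing in Python. This only illustrates the target structure and is not the configuration-based solution I am asking for. The helper `nest_fields` and the `NESTED_FIELDS` mapping are hypothetical names introduced for this sketch; the field names come from the responses above.

```python
import json

# Hypothetical mapping: parent key -> flat field names to group beneath it.
# The field names are taken from the Solr response shown above.
NESTED_FIELDS = {"contributor": ["userId", "user"]}


def nest_fields(doc, nested=NESTED_FIELDS):
    """Return a copy of a flat Solr doc with selected fields grouped under a parent key."""
    grouped_names = {name for names in nested.values() for name in names}
    # Keep every field that is not being moved under a parent.
    result = {key: value for key, value in doc.items() if key not in grouped_names}
    for parent, names in nested.items():
        child = {name: doc[name] for name in names if name in doc}
        if child:
            # Wrapped in a list to match the expected response shape.
            result[parent] = [child]
    return result


flat_doc = {
    "id": "10",
    "timestamp": "2010-08-26T17:08:36Z",
    "revision": 381202555,
    "titleText": "AccessibleComputing",
    "userId": 7181920,
    "user": "OlEnglish",
}

print(json.dumps(nest_fields(flat_doc), indent=2))
```

With many inner tags, this mapping would have to be maintained by hand for every nested element, which is why I would prefer to express the nesting in the import configuration.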