I can upload PDF files to Solr and search them. But what does indexing mean in Solr? When I upload a PDF file, how does it get indexed?
This is the code I use to upload the PDF file:
ContentStreamUpdateRequest up = new ContentStreamUpdateRequest("/update/extract");
up.addFile(fileName);
up.setParam("literal.id", solrId);
up.setParam("literal.first_name", "apachesolr");
up.setParam("literal.last_name", "cookbook");
up.setParam("literal.age", "30");
up.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
solrServer.request(up);
Below is my schema.xml:
<field name="first_name" type="string" indexed="true" stored="true" required="true"/>
<field name="last_name" type="string" indexed="true" stored="true" required="true"/>
<field name="age" type="int" indexed="true" stored="true" required="true"/>
<field name="created_at" type="date" indexed="true" stored="true"/>
<field name="updated_at" type="date" indexed="true" stored="true"/>
<field name="id" type="string" indexed="true" stored="true" required="true"/>
When I search for anything contained in the PDF, the result looks like this:
SolrDocument[{
last_modified=Fri Oct 17 08:17:38 IST 2003,
author=Mark Roth, Eduardo Pelegri-Llopart,
title=[JSP 2.0 Specification, Final Release],
content_type=[application/pdf],
keywords=JSP,
age=30,
last_name=cookbook,
first_name=apachesolr,
id=jsp-2_0-fr-spec.pdf
}]
How does it get the title, author, keywords, and so on?