apache - How do I define a field in Apache Solr's schema.xml to get the file name of a document

Question

I started a Solr server using Solr 5.3.1:

D:\solr\solr-5.3.1\bin>solr start

Then I created a core in standalone mode:

D:\solr\solr-5.3.1\bin>solr create -c mycore

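I need to index files from the file system (Word and PDF). The schema managed through the Schema API has no "name" field for the documents, so I added this field using curl: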

curl -X POST -H 'Content-type:application/json' --data-binary '{
  "add-field":{
     "name":"name",
     "type":"text_general",
     "stored":true,
     "indexed":true }
}' http://localhost:8983/solr/mycore/schema
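
For reference, the same Schema API request can be sketched in Python, which avoids the quoting problems that single-quoted JSON tends to cause in the Windows console. This is only a minimal sketch: it assumes the host, port and core name from the curl command above, and the helper names `build_add_field_payload` and `add_field` are hypothetical, not part of Solr.

```python
import json
import urllib.request

# Assumption: same host, port and core name as in the curl command above.
SCHEMA_URL = "http://localhost:8983/solr/mycore/schema"


def build_add_field_payload(name, field_type="text_general", stored=True, indexed=True):
    """Return the JSON body for a Schema API "add-field" command."""
    return {
        "add-field": {
            "name": name,
            "type": field_type,
            "stored": stored,
            "indexed": indexed,
        }
    }


def add_field(payload, url=SCHEMA_URL):
    """POST the payload to the Schema API and return the decoded JSON response."""
    body = json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Show the request body; sending it requires a running Solr instance.
    print(json.dumps(build_add_field_payload("name"), indent=2))
    # add_field(build_add_field_payload("name"))
```

Building the body with `json.dumps` guarantees plain ASCII double quotes around every key, which the Schema API needs in order to parse the request.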

I then reindexed all the documents with SimplePostTool on Windows:

D:\solr\solr-5.3.1>java -classpath example\exampledocs\post.jar -Dauto=yes -Dc=mycore -Ddata=files -Drecursive=yes org.apache.solr.util.SimplePostTool D:\Lucene\document

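But even though the "name" field was added successfully, it stays empty. The "title" field only receives the document name for PDF files, not for MS Word files (.doc and .docx).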

I then chose to index with the techproducts example instead, because it does not use the Schema API, so I can edit its schema.xml directly:

D:\solr\solr-5.3.1>solr -e techproducts

With techproducts, the name is returned for all of the indexed .xml files.

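Next, I created a new core named demo in the Solr home example/techproducts/solr, using the schema.xml (which contains the "name" field) and solrconfig.xml from techproducts. When I index all my documents into it, the "name" field exists but is still empty for every indexed file.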

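My question is: how can I get the name of each document (MS Word and PDF), rather than a path like the one stored in the "id" or "ressource_name" field? Do I have to create a new field type, or is there another way?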
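
To make the goal concrete, here is a rough sketch of one possible direction: derive the bare file name from the path on the client side and pass it to the extracting request handler as a `literal.<field>` parameter. It assumes the core exposes that handler at `/update/extract` and that the target field is the "name" field described above. The helper names are hypothetical, and whether this fits depends on the core's solrconfig.xml.

```python
import ntpath
from urllib.parse import urlencode, parse_qs, urlsplit

# Assumption: the core exposes the extracting handler at /update/extract.
EXTRACT_URL = "http://localhost:8983/solr/mycore/update/extract"


def file_name_from_path(path):
    """Return only the last component of a Windows-style path."""
    return ntpath.basename(path)


def build_extract_url(path, name_field="name", base_url=EXTRACT_URL):
    """Build an extract URL that sets the id to the path and the name field to the file name."""
    params = {
        "literal.id": path,
        "literal." + name_field: file_name_from_path(path),
        "commit": "true",
    }
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical file under the indexed directory, for illustration only.
    print(build_extract_url(r"D:\Lucene\document\report.docx"))
```

The file content itself would still have to be sent as the body of a POST request to that URL. The sketch only shows how the "name" value differs from the path stored in "id".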
