I am trying to index my Nutch crawl into Solr, but from within source code rather than from the command line.
I created the following function:
    public static int runInjectSolr(String[] args, Properties prop) throws Exception {
        String solrUrl = "http://ec2-X-X-X-X.compute-1.amazonaws.com/solr/collection1";
        String crawldb = JobBase.getParam(args, "crawldb", null, true);
        String segments = JobBase.getParam(args, "segments", null, true);
        String args2[] = {crawldb, segments};
        Configuration conf = new Configuration();
        conf.set("-D solr.server.url", solrUrl);
        int code = ToolRunner.run(NutchConfiguration.create(),
                new IndexingJob(conf), args2);
        return code;
    }
But I get the following error:
2013-08-07 19:37:13,338 ERROR org.apache.nutch.indexwriter.solr.SolrIndexWriter (main): Missing SOLR URL. Should be set via -D solr.server.url
SOLRIndexWriter
solr.server.url : URL of the SOLR instance (mandatory)
solr.commit.size : buffer size when sending to SOLR (default 1000)
solr.mapping.file : name of the mapping file for fields (default solrindex-mapping.xml)
solr.auth : use authentication (default false)
solr.auth.username : use authentication (default false)
solr.auth : username for authentication
solr.auth.password : password for authentication
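So I assume I am not creating my configuration correctly. Any suggestions?
Or should I be passing my configuration to run in a different way? Perhaps without using NutchConfiguration.create()?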
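
One thing that may be worth checking is the key itself. `-D` is command-line syntax, so a key/value configuration would most likely store `"-D solr.server.url"` as a literal key, and a lookup under `solr.server.url` would then find nothing. The sketch below illustrates only that effect and uses `java.util.Properties` as a stand-in for the Hadoop `Configuration`. The class name, the helper methods, and the placeholder URL are hypothetical and not part of Nutch. Separately, and as an assumption to verify, `ToolRunner.run(conf, tool, args)` hands its first argument to the tool, so the value probably needs to be set on that same object rather than on a second `Configuration`.

```java
import java.util.Properties;

public class SolrUrlKeyDemo {
    // The property name the error message says the indexer looks up.
    static final String SOLR_URL_KEY = "solr.server.url";

    // Builds a key/value store holding a single entry under the given key.
    // Properties is only a stand-in here for a Hadoop-style Configuration.
    static Properties configWith(String key, String value) {
        Properties conf = new Properties();
        conf.setProperty(key, value);
        return conf;
    }

    // Mimics the consumer side: look the URL up under the plain property name.
    static String readSolrUrl(Properties conf) {
        return conf.getProperty(SOLR_URL_KEY);
    }

    public static void main(String[] args) {
        String url = "http://localhost:8983/solr/collection1"; // placeholder URL

        // Key copied from the command-line syntax, prefix included.
        Properties prefixed = configWith("-D " + SOLR_URL_KEY, url);
        // Key without the command-line prefix.
        Properties plain = configWith(SOLR_URL_KEY, url);

        System.out.println("prefixed key found: " + (readSolrUrl(prefixed) != null));
        System.out.println("plain key found: " + (readSolrUrl(plain) != null));
    }
}
```

In this stand-in, the lookup on the prefixed store comes back empty while the lookup on the plain store returns the URL, which would be consistent with the "Missing SOLR URL" message above.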