1

我已经将 nutch 与 hadoop 集成并用它进行了爬行。但是当我想为数据创建 solrindex 时,我得到了 Job Failed。这是我的命令:

$bin/nutch solrindex http://namenode:8983/solr/ crawl/crawldb -linkdb crawl/linkdb crawl/segments/*


'crawl' 目录存储数据,它位于 HDFS 中。我只是在 solr 版本中使用了这个例子。在单节点模式(不是分布式模式)下运行命令时它工作正常。这是输出的片段,请原谅我以如此混乱的方式发布它:

12/04/28 13:51:22 INFO mapred.JobClient:  map 100% reduce 23%
12/04/28 13:51:30 INFO mapred.JobClient: Task Id : attempt_201204212112_0076_r_000000_0, Status : FAILED
java.io.IOException
    at org.apache.nutch.indexer.solr.SolrWriter.makeIOException(SolrWriter.java:103)
    at org.apache.nutch.indexer.solr.SolrWriter.close(SolrWriter.java:98)
    at org.apache.nutch.indexer.IndexerOutputFormat$1.close(IndexerOutputFormat.java:48)
    at org.apache.hadoop.mapred.ReduceTask$OldTrackingRecordWriter.close(ReduceTask.java:466)
    at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:530)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:420)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Unknown Source)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: org.apache.solr.client.solrj.SolrServerException: java.net.ConnectException: Connection refused
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:478)
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
    at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
    at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:49)
    at org.apache.nutch.indexer.solr.SolrWriter.close(SolrWriter.java:93)
    ... 9 more
Caused by: java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.PlainSocketImpl.doConnect(Unknown Source)
    at java.net.PlainSocketImpl.connectToAddress(Unknown Source)
    at java.net.PlainSocketImpl.connect(Unknown Source)
    at java.net.SocksSocketImpl.connect(Unknown Source)
    at java.net.Socket.connect(Unknown Source)
    at java.net.Socket.connect(Unknown Source)
    at java.net.Socket.<init>(Unknown Source)
    at java.net.Socket.<init>(Unknown Source)
    at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:79)
    at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:121)
    at org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:706)
    at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:386)
    at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:170)
    at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:396)
    at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:324)
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:422)
    ... 13 more
4

0 回答 0