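I have successfully integrated Nutch with HBase by following the Nutch 2 tutorial. My problem is that when I run the following command in the runtime/local/bin directory to crawl URLs:
./nutch crawl urls/seed.txt abc -depth 50 -topN 50
the following error occurs: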
Exception in thread "main" java.lang.RuntimeException: job failed: name=generate: null, jobid=job_local1552667151_0002
at org.apache.nutch.util.NutchJob.waitForCompletion(NutchJob.java:54)
at org.apache.nutch.crawl.GeneratorJob.run(GeneratorJob.java:199)
at org.apache.nutch.crawl.Crawler.runTool(Crawler.java:68)
at org.apache.nutch.crawl.Crawler.run(Crawler.java:152)
at org.apache.nutch.crawl.Crawler.run(Crawler.java:250)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.nutch.crawl.Crawler.main(Crawler.java:257)
Please suggest a solution. Any help would be greatly appreciated.