I set up my Nutch code in Eclipse following http://wiki.apache.org/nutch/RunNutchInEclipse, but got the error "java.io.IOException: Failed to set permissions of path: \tmp\hadoop", so I moved to Hadoop 0.20.2. After changing the Hadoop jar, I now get this error:

2013-05-17 00:44:17,742 WARN  crawl.Crawl (Crawl.java:run(97)) - solrUrl is not set, indexing will be skipped...
2013-05-17 00:44:17,876 INFO  crawl.Crawl (Crawl.java:run(108)) - crawl started in: crawl
2013-05-17 00:44:17,876 INFO  crawl.Crawl (Crawl.java:run(109)) - rootUrlDir = urls
2013-05-17 00:44:17,876 INFO  crawl.Crawl (Crawl.java:run(110)) - threads = 10
2013-05-17 00:44:17,876 INFO  crawl.Crawl (Crawl.java:run(111)) - depth = 3
2013-05-17 00:44:17,877 INFO  crawl.Crawl (Crawl.java:run(112)) - solrUrl=null
2013-05-17 00:44:17,877 INFO  crawl.Crawl (Crawl.java:run(114)) - topN = 50
2013-05-17 00:44:17,888 INFO  crawl.Injector (Injector.java:inject(257)) - Injector: starting at 2013-05-17 00:44:17
2013-05-17 00:44:17,888 INFO  crawl.Injector (Injector.java:inject(258)) - Injector: crawlDb: crawl/crawldb
2013-05-17 00:44:17,888 INFO  crawl.Injector (Injector.java:inject(259)) - Injector: urlDir: urls
2013-05-17 00:44:17,936 INFO  crawl.Injector (Injector.java:inject(269)) - Injector: Converting injected urls to crawl db entries.
2013-05-17 00:44:17,961 INFO  jvm.JvmMetrics (JvmMetrics.java:init(71)) - Initializing JVM Metrics with processName=JobTracker, sessionId=
2013-05-17 00:44:18,144 WARN  mapred.JobClient (JobClient.java:configureCommandLineOptions(661)) - No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
2013-05-17 00:44:18,176 INFO  mapred.FileInputFormat (FileInputFormat.java:listStatus(192)) - Total input paths to process : 1
2013-05-17 00:44:18,519 INFO  mapred.JobClient (JobClient.java:monitorAndPrintJob(1275)) - Running job: job_local_0001
2013-05-17 00:44:18,521 INFO  mapred.FileInputFormat (FileInputFormat.java:listStatus(192)) - Total input paths to process : 1
2013-05-17 00:44:18,573 INFO  mapred.MapTask (MapTask.java:runOldMapper(347)) - numReduceTasks: 1
2013-05-17 00:44:18,578 INFO  mapred.MapTask (MapTask.java:<init>(776)) - io.sort.mb = 100
2013-05-17 00:44:18,601 INFO  mapred.MapTask (MapTask.java:<init>(788)) - data buffer = 79691776/99614720
2013-05-17 00:44:18,601 INFO  mapred.MapTask (MapTask.java:<init>(789)) - record buffer = 262144/327680
2013-05-17 00:44:18,611 WARN  plugin.PluginRepository (PluginManifestParser.java:getPluginFolder(123)) - Plugins: directory not found: plugins
2013-05-17 00:44:18,612 INFO  plugin.PluginRepository (PluginRepository.java:displayStatus(313)) - Plugin Auto-activation mode: [true]
2013-05-17 00:44:18,612 INFO  plugin.PluginRepository (PluginRepository.java:displayStatus(314)) - Registered Plugins:
2013-05-17 00:44:18,612 INFO  plugin.PluginRepository (PluginRepository.java:displayStatus(317)) -  NONE
2013-05-17 00:44:18,613 INFO  plugin.PluginRepository (PluginRepository.java:displayStatus(324)) - Registered Extension-Points:
2013-05-17 00:44:18,613 INFO  plugin.PluginRepository (PluginRepository.java:displayStatus(326)) -  NONE
2013-05-17 00:44:18,615 WARN  mapred.LocalJobRunner (LocalJobRunner.java:run(256)) - job_local_0001
java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:354)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    ... 5 more
Caused by: java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
    ... 10 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    ... 13 more
Caused by: java.lang.RuntimeException: x point org.apache.nutch.net.URLNormalizer not found.
    at org.apache.nutch.net.URLNormalizers.<init>(URLNormalizers.java:123)
    at org.apache.nutch.crawl.Injector$InjectMapper.configure(Injector.java:74)
    ... 18 more
2013-05-17 00:44:19,520 INFO  mapred.JobClient (JobClient.java:monitorAndPrintJob(1288)) -  map 0% reduce 0%
2013-05-17 00:44:19,523 INFO  mapred.JobClient (JobClient.java:monitorAndPrintJob(1343)) - Job complete: job_local_0001
2013-05-17 00:44:19,524 INFO  mapred.JobClient (Counters.java:log(514)) - Counters: 0
Exception in thread "main" java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
    at org.apache.nutch.crawl.Injector.inject(Injector.java:281)
    at org.apache.nutch.crawl.Crawl.run(Crawl.java:132)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.nutch.crawl.Crawl.main(Crawl.java:55)

I have searched a lot but have not found a working solution. Please advise.

1 Answer

2013-05-17 00:44:18,611 WARN plugin.PluginRepository (PluginManifestParser.java:getPluginFolder(123)) - Plugins: directory not found: plugins.

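Check your plugins directory. Since it was not found, no plugins were registered (the log shows "Registered Plugins: NONE"), which is why the URLNormalizer extension point is missing. Set the directory in nutch-site.xml: you have to put the plugin path inside the <value></value> tag, like this: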

<property>
    <name>plugin.folders</name>
    <value>/home/YOUR-USER/nutch/build/plugins</value>
</property>
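
Before rerunning the crawl, it may help to confirm that the path you put in <value> actually exists and holds plugins. The sketch below is only a rough check, assuming each Nutch plugin lives in its own subdirectory containing a plugin.xml manifest. The class PluginFolderCheck and its countPlugins helper are hypothetical names made up for this example, not part of Nutch, and the code only looks at the filesystem; it does not reproduce how Nutch itself resolves plugin.folders.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration; not part of Nutch.
public class PluginFolderCheck {

    /**
     * Returns the number of immediate subdirectories of pluginDir that
     * contain a plugin.xml file, or -1 if pluginDir is not a directory.
     */
    public static int countPlugins(Path pluginDir) throws IOException {
        if (!Files.isDirectory(pluginDir)) {
            return -1; // comparable to "Plugins: directory not found"
        }
        int count = 0;
        try (DirectoryStream<Path> children = Files.newDirectoryStream(pluginDir)) {
            for (Path child : children) {
                // Assumption: a plugin is a folder with a plugin.xml manifest.
                if (Files.isRegularFile(child.resolve("plugin.xml"))) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Pass the same path you put in plugin.folders; defaults to "plugins".
        Path dir = Paths.get(args.length > 0 ? args[0] : "plugins").toAbsolutePath();
        int found = countPlugins(dir);
        if (found < 0) {
            System.out.println("Not a directory: " + dir);
        } else {
            System.out.println(found + " plugin folder(s) with plugin.xml under " + dir);
        }
    }
}
```

Run it with the plugin path as the only argument. If it reports that the path is not a directory, or finds zero plugin folders, the value in nutch-site.xml most likely points to the wrong place or the plugins have not been built yet.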
answered 2013-05-20T13:41:44.060