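In my nutch-site.xml I added the following property to stop content from being truncated; however, during the fetch step I get the error shown below. I want Nutch to stop truncating and give me the full results I need, and I assumed a value of -1 would do that. I am using version 2.2.1. Any ideas?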
<property>
<name>http.content.limit</name>
<value>-1</value>
<description>The length limit for downloaded content using the http
protocol, in bytes. If this value is nonnegative (>=0), content longer
than it will be truncated; otherwise, no truncation at all. Do not
confuse this setting with the file.content.limit setting.
</description>
</property>
Exception in thread "main" java.lang.RuntimeException: job failed: name=fetch, jobid=job_local1185573074_0001
	at org.apache.nutch.util.NutchJob.waitForCompletion(NutchJob.java:55)
	at org.apache.nutch.fetcher.FetcherJob.run(FetcherJob.java:194)
	at org.apache.nutch.fetcher.FetcherJob.fetch(FetcherJob.java:219)
	at org.apache.nutch.fetcher.FetcherJob.run(FetcherJob.java:301)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
	at org.apache.nutch.fetcher.FetcherJob.main(FetcherJob.java:307)