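I have to set up a Hadoop stack with Nutch 2.3.1. The HBase version supported by Hadoop 2.7.4 is 1.2.6, which I have configured and tested successfully. But when I compile Nutch and try to crawl a sample page, I get the following error.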

/usr/local/nutch/runtime/local/bin/nutch inject urls/ -crawlId kics
InjectorJob: starting at 2017-09-21 14:20:10
InjectorJob: Injecting urlDir: urls
Exception in thread "main" java.lang.NoSuchFieldError: HBASE_CLIENT_PREFETCH_LIMIT
    at org.apache.hadoop.hbase.client.HConnectionKey.<clinit>(HConnectionKey.java:43)
    at org.apache.hadoop.hbase.client.HConnectionManager.getConnection(HConnectionManager.java:267)
    at org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:194)
    at org.apache.gora.hbase.store.HBaseStore.initialize(HBaseStore.java:115)
    at org.apache.gora.store.DataStoreFactory.initializeDataStore(DataStoreFactory.java:102)
    at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:161)
    at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:135)
    at org.apache.nutch.storage.StorageUtils.createWebStore(StorageUtils.java:78)
    at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:218)
    at org.apache.nutch.crawl.InjectorJob.inject(InjectorJob.java:252)
    at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:275)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.nutch.crawl.InjectorJob.main(InjectorJob.java:284)
Error running:

According to what I found while searching, Nutch 2.3.1 can be compiled against HBase 1.x, but I don't know how to do it. Can someone guide me through the steps?

1 Answer

Apache Gora 0.7 is the version that supports HBase 1.2.3(+): https://issues.apache.org/jira/browse/GORA-443

See https://stackoverflow.com/a/39837926/582789, where I describe how to modify Nutch 2.3.1 to use Apache Gora 0.7. In the patch from that answer (https://paste.apache.org/jjqz), use "0.7" wherever it says "0.7-SNAPSHOT".

By the way, Apache Gora 0.8 was released yesterday :) Just change 0.7 to 0.8 and it should work.

http://gora.apache.org/#20-september-2017-apache-gora-08-release
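
As a rough illustration of the version bump described above, the Python sketch below rewrites the `rev` attribute of every `org.apache.gora` dependency in the text of an Ivy file. It assumes the dependencies are declared as `<dependency org="org.apache.gora" ... rev="..."/>` tags, which is how I believe Nutch's `ivy/ivy.xml` declares them. `set_gora_version` is a hypothetical helper, not part of Nutch or Gora, and it only changes version numbers; the other changes in the linked patch are still needed.

```python
import re

# A <dependency ...> tag. Quoted attribute values are skipped as a unit,
# because Ivy values such as conf="*->default" contain a '>' character.
_DEP_TAG = re.compile(r'<dependency\b(?:[^>"]|"[^"]*")*>')
_REV_ATTR = re.compile(r'\brev="[^"]*"')


def set_gora_version(ivy_xml: str, new_version: str) -> str:
    """Return ivy_xml with rev="..." set to new_version on Gora dependencies.

    Hypothetical helper: works on plain text, so tags inside XML comments
    are rewritten too, and all other dependencies are left untouched.
    """
    def fix(match):
        tag = match.group(0)
        if 'org="org.apache.gora"' not in tag:
            return tag
        return _REV_ATTR.sub(lambda _: 'rev="%s"' % new_version, tag, count=1)

    return _DEP_TAG.sub(fix, ivy_xml)


if __name__ == "__main__":
    # Assumed sample lines, modelled on an Ivy dependency declaration.
    sample = (
        '<dependency org="org.apache.gora" name="gora-core" rev="0.6.1" conf="*->default"/>\n'
        '<dependency org="org.apache.hadoop" name="hadoop-common" rev="2.5.2" conf="*->default"/>\n'
    )
    print(set_gora_version(sample, "0.8"))
```

Working on the raw text rather than a parsed XML tree keeps comments and formatting intact, so the result can be diffed against the original file before rebuilding Nutch.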

Answered 2017-09-21T09:59:53.250