
Versions:
Accumulo 1.5
Pig 0.10

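Attempt:
Using accumulo-pig to read/write data in Accumulo from Pig.
I ran into an error; any insight into getting past it would be much appreciated.
Switching to Accumulo 1.4 is not an option, because we use the Accumulo Thrift proxy from our C# codebase.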

Impact:
This is currently a blocker for our project.

Source reference:
Source code - https://git-wip-us.apache.org/repos/asf/accumulo-pig.git

Error:
When trying to read a dataset in Accumulo from Pig, I get the following error:

org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Connector info for AccumuloInputFormat can only be set once per job

Code snippet:

DATA = LOAD 'accumulo://departments?instance=indra&user=root&password=xxxxxxx&zookeepers=cdh-dn01:2181' using org.apache.accumulo.pig.AccumuloStorage() AS (row, cf, cq, cv, ts, val);
dump DATA;

1 Answer


Try using the ACCUMULO-1783-1.5 branch from the same repository. The way that Pig sets up the InputFormat doesn't play nicely with how Accumulo sets up InputFormats (notably, Accumulo asserts that you never call the same static method more than once for a given Configuration).
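To make the failure mode concrete, here is a minimal sketch of the "set once" pattern, assuming the loader's setup hook runs more than once against the same job configuration. Everything in it is hypothetical: the class `SetOnceSketch`, the `sketch.*` keys, and the plain `Map` standing in for a Hadoop `Configuration`. It is not the code from Accumulo, Pig, or the branch above, just an illustration of why a second call fails and how a caller-side guard can avoid it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration only; not Accumulo's or Pig's code.
final class SetOnceSketch {
    // Hypothetical marker key recorded once connector info has been stored.
    static final String FLAG = "sketch.connectorInfoSet";

    // Mimics a "set once" static setter: a second call on the same
    // configuration is rejected instead of silently overwriting.
    static void setConnectorInfo(Map<String, String> conf, String user) {
        if ("true".equals(conf.get(FLAG))) {
            throw new IllegalStateException(
                "Connector info can only be set once per job");
        }
        conf.put(FLAG, "true");
        conf.put("sketch.user", user);
    }

    // Caller-side guard: skip the setter when the configuration is already
    // populated, so repeated setup calls become harmless.
    // Returns true only when this call actually stored the info.
    static boolean setConnectorInfoIfAbsent(Map<String, String> conf, String user) {
        if ("true".equals(conf.get(FLAG))) {
            return false;
        }
        setConnectorInfo(conf, user);
        return true;
    }

    public static void main(String[] args) {
        // Unguarded: the second call on the same configuration throws.
        Map<String, String> conf = new HashMap<>();
        setConnectorInfo(conf, "root");
        try {
            setConnectorInfo(conf, "root");
        } catch (IllegalStateException e) {
            System.out.println("second call rejected: " + e.getMessage());
        }

        // Guarded: the second call is skipped rather than rejected.
        Map<String, String> guarded = new HashMap<>();
        System.out.println(setConnectorInfoIfAbsent(guarded, "root"));
        System.out.println(setConnectorInfoIfAbsent(guarded, "root"));
    }
}
```

The guard lives on the caller's side because the setter itself cannot tell a legitimate repeat of the same setup from a conflicting one.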

I have been using Pig 0.12. I doubt there's a difference in how 0.10 sets up the InputFormats as opposed to 0.12, but I'm not positive, so YMMV.

I just pushed a fix to the above branch that gets rid of the previously mentioned limitation on Hadoop version.

Answered 2013-12-30T17:49:04.197