I'm running into problems with Pig streaming. When I start an interactive Pig instance (FYI, I'm doing this over SSH/PuTTY on the master node of an interactive AWS EMR Pig instance) with only one machine, my Pig streaming works perfectly (it also works on my Windows Cloudera VM image). However, when I switch to using multiple machines, it stops working and gives a variety of errors.
Notes:
- I am able to run Pig scripts without any streaming commands on the multi-machine instance with no problems.
- All of my Pig work is done in Pig's MapReduce mode, not -x local mode.
- My Python script (stream1.py) has this at the top: #!/usr/bin/env python
Below is a small sample of the options I've tried so far (all of the commands below were run from the grunt shell on the master node, which I access via SSH/PuTTY):
This is how I get the Python file onto the master node so that it can be used:
cp s3n://darin.emr-logs/stream1.py stream1.py
copyToLocal stream1.py /home/hadoop/stream1.py
chmod 755 stream1.py
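A quick sanity check of the local copy from the grunt shell (a minimal sketch, assuming Pig 0.8 or later for the sh command): the first line should show the executable bit set, and the second should print the shebang line:
sh ls -l /home/hadoop/stream1.py
sh head -1 /home/hadoop/stream1.py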
These are my various streaming attempts:
cooc = stream ct_bag_ph through `stream1.py`;
dump cooc;
ERROR 2090: Received Error while processing the reduce plan: 'stream1.py ' failed with exit status: 127
cooc = stream ct_bag_ph through `python stream1.py`;
dump cooc;
ERROR 2090: Received Error while processing the reduce plan: 'python stream1.py ' failed with exit status: 2
DEFINE X `stream1.py`;
cooc = stream ct_bag_ph through X;
dump cooc;
ERROR 2090: Received Error while processing the reduce plan: 'stream1.py ' failed with exit status: 127
DEFINE X `stream1.py`;
cooc = stream ct_bag_ph through `python X`;
dump cooc;
ERROR 2090: Received Error while processing the reduce plan: 'python X ' failed with exit status: 2
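(My reading of these errors: exit status 127 is the shell's "command not found", and Python exits with status 2 when it can't open the script it is given, so I suspect stream1.py simply is not present in the working directory on the worker nodes. That is what pushed me toward SHIP below.)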
DEFINE X `stream1.py` SHIP('stream1.py');
cooc = STREAM ct_bag_ph THROUGH X;
dump cooc;
ERROR 2017: Internal error creating job configuration.
DEFINE X `stream1.py` SHIP('/stream1.p');
cooc = STREAM ct_bag_ph THROUGH X;
dump cooc;
DEFINE X `stream1.py` SHIP('stream1.py') CACHE('stream1.py');
cooc = STREAM ct_bag_ph THROUGH X;
ERROR 2017: Internal error creating job configuration.
define X `python /home/hadoop/stream1.py` SHIP('/home/hadoop/stream1.py');
cooc = STREAM ct_bag_ph THROUGH X;
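For reference, this is the full pattern I believe the docs call for, condensed from the attempts above (a sketch only; CMD is just an illustrative alias, and the statements that produce ct_bag_ph are omitted as before):
DEFINE CMD `stream1.py` SHIP('/home/hadoop/stream1.py');
cooc = STREAM ct_bag_ph THROUGH CMD;
dump cooc;
Since SHIP copies the file into each task's working directory, invoking it by its bare name should work as long as the executable bit and the shebang line are in place.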