
I need to join multiple tables. The command I am using is:

G = JOIN aa BY f, bb by f, cc by f, dd by f;

To turn this into a full outer join, I added FULL:

G = JOIN aa BY f FULL, bb by f, cc by f, dd by f;

But it gives me a "mismatched input" error. How can I make this work?

Thanks!


2 Answers


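According to the Pig documentation:

Outer joins will only work for two-way joins; to perform a multi-way outer join, you will need to perform multiple two-way outer join statements.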
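The chaining described in that quote might look like the sketch below. It assumes aa, bb, cc and dd are already loaded and that each has the hypothetical schema (f, v); the question does not show the real schemas, so the value column v and the intermediate aliases (ab_raw, ab, abc_raw, abc, g_raw) are placeholders. After each full outer join the key can be null on either side, so it is coalesced into a single f before the next join.

```pig
-- Assumption: aa, bb, cc, dd are loaded relations with schema (f, v).
-- Step 1: full outer join of the first two relations.
ab_raw = JOIN aa BY f FULL OUTER, bb BY f;
ab = FOREACH ab_raw GENERATE
         (aa::f IS NOT NULL ? aa::f : bb::f) AS f,  -- key from whichever side is present
         aa::v AS a_v,
         bb::v AS b_v;

-- Step 2: join the intermediate result with the third relation.
abc_raw = JOIN ab BY f FULL OUTER, cc BY f;
abc = FOREACH abc_raw GENERATE
          (ab::f IS NOT NULL ? ab::f : cc::f) AS f,
          ab::a_v AS a_v,
          ab::b_v AS b_v,
          cc::v AS c_v;

-- Step 3: join with the fourth relation to get the final result.
g_raw = JOIN abc BY f FULL OUTER, dd BY f;
G = FOREACH g_raw GENERATE
        (abc::f IS NOT NULL ? abc::f : dd::f) AS f,
        abc::a_v AS a_v,
        abc::b_v AS b_v,
        abc::c_v AS c_v,
        dd::v AS d_v;
```

Without the coalescing step, a row that exists only in bb would carry a null aa::f into the next join, and null keys do not match anything in a Pig join.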

Answered 2012-10-25T21:56:21.960

You can use the COGROUP statement to simulate a full outer join. For example, cogroup the following two files:

Decimal.csv

first|1
second|2
fourth|4

Roman.csv

first|I
second|II
third|III

Pig commands:

english = LOAD 'Decimal.csv' using PigStorage('|') as (name:chararray,value:chararray);
roman = LOAD 'Roman.csv' using PigStorage('|') as (name:chararray, value:chararray);
multi = cogroup english by name, roman by name;
dump multi;

Output:

(first,{(first,1)},{(first,I)})
(third,{},{(third,III)})
(fourth,{(fourth,4)},{})
(second,{(second,2)},{(second,II)})
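The COGROUP result above keeps each side as a bag, so it is not yet shaped like join output. One way to flatten it into rows, sketched here using the IsEmpty idiom from the Pig built-in functions documentation, is to replace each empty bag with null before flattening so that keys present on only one side are not dropped. The alias `joined` is a placeholder, and the flattened fields may need to be renamed with AS before further use.

```pig
-- Continues from the 'multi' relation defined above.
-- FLATTEN on an empty bag would discard the row, so empty bags are
-- swapped for null first; this keeps keys such as 'third' and 'fourth'.
joined = FOREACH multi GENERATE
             group,
             FLATTEN((IsEmpty(english) ? null : english)),
             FLATTEN((IsEmpty(roman) ? null : roman));
dump joined;
```

The same pattern extends to more than two inputs, since COGROUP accepts several relations in one statement, which is what makes it usable for the four-table case in the question.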
Answered 2013-12-31T09:42:11.813