I need to join multiple tables. The command I am using is:
G = JOIN aa BY f, bb by f, cc by f, dd by f;
To make it a full outer join, I added FULL, which gives:
G = JOIN aa BY f FULL, bb by f, cc by f, dd by f;
But it gives me a "mismatched input" error. How can I make this work?
Thanks!
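According to the Pig documentation:
Outer joins will only work for two-way joins; to perform a multi-way outer join, you will need to perform multiple two-way outer join statements.
You can also use the COGROUP statement to simulate a full outer join. For example, cogroup the following two files: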
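As a rough sketch of what "multiple two-way outer join statements" could look like for the four relations in the question: it assumes aa, bb, cc and dd are already loaded, that each has a field f of the same type, and the names ab, ab_k, abc, abc_k, k1 and k2 are made up for illustration. After a full outer join the key of either side may be null, so the sketch coalesces the two keys into one before the next join.

```pig
-- Sketch only: aa, bb, cc, dd are assumed to be loaded relations with a field f.
-- Step 1: two-way full outer join of the first pair.
ab    = JOIN aa BY f FULL OUTER, bb BY f;

-- Either aa::f or bb::f can be null here, so build a single non-null key
-- (k1 is a hypothetical name) to use for the next join.
ab_k  = FOREACH ab GENERATE (aa::f IS NOT NULL ? aa::f : bb::f) AS k1, *;

-- Step 2: join the intermediate result with cc, then coalesce again.
abc   = JOIN ab_k BY k1 FULL OUTER, cc BY f;
abc_k = FOREACH abc GENERATE (ab_k::k1 IS NOT NULL ? ab_k::k1 : cc::f) AS k2, *;

-- Step 3: final two-way full outer join with dd.
G     = JOIN abc_k BY k2 FULL OUTER, dd BY f;
```

Each JOIN here has exactly two inputs, which is the form the outer-join keywords accept. The extra key columns k1 and k2 can be projected away in a final FOREACH if they are not wanted.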
Decimal.csv
first|1
second|2
fourth|4
Roman.csv
first|I
second|II
third|III
Pig commands:
english = LOAD 'Decimal.csv' using PigStorage('|') as (name:chararray,value:chararray);
roman = LOAD 'Roman.csv' using PigStorage('|') as (name:chararray, value:chararray);
multi = cogroup english by name, roman by name;
dump multi;
Output:
(first,{(first,1)},{(first,I)})
(third,{},{(third,III)})
(fourth,{(fourth,4)},{})
(second,{(second,2)},{(second,II)})
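The cogroup output keeps one bag per input, so one more step is needed to get join-style rows. One possible way, shown here as a sketch rather than a definitive recipe, is to FLATTEN both bags. Because flattening an empty bag drops the whole row, the sketch first swaps each empty bag for a one-tuple placeholder bag. The empty-string placeholder and the names joined, e_name, e_value, r_name and r_value are assumptions for illustration; a real full outer join would give nulls on the missing side instead.

```pig
-- Continues from the 'multi' relation produced by the cogroup.
-- IsEmpty is a Pig built-in; the {('', '')} placeholder is an assumption
-- chosen so unmatched keys survive the FLATTEN.
joined = FOREACH multi GENERATE
    group,
    FLATTEN((IsEmpty(english) ? {('', '')} : english)) AS (e_name, e_value),
    FLATTEN((IsEmpty(roman)   ? {('', '')} : roman))   AS (r_name, r_value);

dump joined;
```

With this, keys that appear in only one file (third and fourth in the sample data) still produce a row, with placeholder values on the side that had no match.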