I want to do an outer join involving 3 tables. I tried this:
features = JOIN group_event by group left outer, group_session by group, group_order by group;
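I want every row of group_event to appear in the output, even if only one, or neither, of the other 2 relations has a match for it.
The command above does not work, apparently because it is not supposed to. The documentation (http://pig.apache.org/docs/r0.7.0/piglatin_ref2.html#JOIN+%28outer%29) says: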
Outer joins will only work for two-way joins; to perform a multi-way outer join, you will need to perform multiple two-way outer join statements.
Splitting the join into two statements works, and can be done like this:
features1 = JOIN group_event by group left outer, group_session by group;
features2 = JOIN features1 by group_event::group left outer, group_order by group;
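To show how this pattern grows, here is a rough sketch of the same chain with one more relation. group_click is a hypothetical fourth relation that I made up for illustration, and the fully qualified key name in the last statement is an assumption about how the prefixes nest after the second join; running DESCRIBE features2; should show the actual field names.

```pig
-- Sketch only: group_click is a hypothetical fourth relation, not part of my data.
features1 = JOIN group_event BY group LEFT OUTER, group_session BY group;
features2 = JOIN features1 BY group_event::group LEFT OUTER, group_order BY group;
-- Assumption: the key inherited from group_event is now named
-- features1::group_event::group; verify with DESCRIBE features2.
features3 = JOIN features2 BY features1::group_event::group LEFT OUTER, group_click BY group;
```

Each additional relation adds one more statement and one more level of prefix on the join key, which is why the split form becomes awkward as the number of tables grows.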
Any ideas for doing this in a single command? (It would be useful when joining a larger number of tables.)