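I have the following:
(id: int, name: chararray)
I group by id and build a bag of names. I've noticed that the bag of names can contain a null value. How do I remove the nulls from the bag?
You can use a FILTER nested inside a FOREACH to remove tuples from the bag created by GROUP BY.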
inpt = LOAD '...' as (id: int, names: chararray);
grp = GROUP inpt BY id;
result = FOREACH grp {
    no_nulls = FILTER inpt BY names is not null;
    GENERATE group, no_nulls;
};
Or just filter out the null names before grouping:
inpt = LOAD '...' as (id: int, names: chararray);
no_nulls = FILTER inpt BY names is not null;
grp = GROUP no_nulls BY id;
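The two approaches are not quite equivalent, which may matter when choosing between them. A minimal sketch, assuming a hypothetical input file 'data.txt' (the original path is elided) in which some id has only null names: the nested FILTER still emits a row for that id with an empty bag, while filtering before the GROUP drops that id from the output. The aliases by_id, nested, pre and by_id_pre are illustrative names, not from the original.

```pig
-- 'data.txt' is a hypothetical path used only for illustration.
inpt = LOAD 'data.txt' AS (id: int, names: chararray);

-- Approach 1: nested FILTER. Every id is kept; an id whose names are
-- all null ends up with an empty bag.
by_id = GROUP inpt BY id;
nested = FOREACH by_id {
    no_nulls = FILTER inpt BY names IS NOT NULL;
    -- Project only the names column so the bag holds names, not (id, names) tuples.
    GENERATE group AS id, no_nulls.names AS names;
};

-- Approach 2: filter first. An id whose names are all null never
-- reaches the GROUP, so it does not appear in the result.
pre = FILTER inpt BY names IS NOT NULL;
by_id_pre = GROUP pre BY id;
```

If every id must appear in the output, the nested form is probably the better fit; otherwise filtering first is simpler and sends fewer rows into the GROUP.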