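I have the following relation:

(id: int, names: chararray)

I group by id, which creates a bag of names for each id. I see that the bag of names can contain a null value. How can I remove the nulls from the bag?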

1 Answer

You can remove tuples from the bag created by GROUP BY by using a FILTER nested inside a FOREACH:

inpt = LOAD '...' as (id: int, names: chararray);
grp = GROUP inpt BY id;
result = FOREACH grp {
    no_nulls = FILTER inpt BY names is not null;
    GENERATE group, no_nulls;
};

Or simply filter out the null names before grouping:

inpt = LOAD '...' as (id: int, names: chararray);
no_nulls = FILTER inpt BY names is not null;
grp = GROUP no_nulls BY id;
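
The two approaches are not quite equivalent. The nested FILTER keeps every id and gives an empty bag to any id whose names are all null, whereas filtering before the GROUP drops such ids entirely. A minimal sketch of making the nested version match, assuming a hypothetical input file 'names.tsv' and illustrative alias names (names_bag, non_empty) that are not part of the original answer:

```pig
-- 'names.tsv' is a hypothetical tab-separated file of (id, names) rows.
inpt = LOAD 'names.tsv' AS (id: int, names: chararray);

-- Nested FILTER: every id survives, possibly with an empty bag.
grp = GROUP inpt BY id;
nested = FOREACH grp {
    no_nulls = FILTER inpt BY names IS NOT NULL;
    GENERATE group AS id, no_nulls AS names_bag;
};

-- Optionally drop ids whose bag became empty, so the output matches
-- what filtering before the GROUP would produce.
non_empty = FILTER nested BY NOT IsEmpty(names_bag);
```

If ids with no non-null names should still appear in the output, skip the last step and keep the nested FILTER on its own.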
answered 2013-02-03T08:43:46.327