I have a feed in the following format:
Hour Key ID Value
1 K1 001 3
1 K1 002 2
2 K1 005 4
1 K2 002 1
2 K2 003 5
2 K2 004 6
I want to group the feed by (Hour, Key), then sum Value while keeping the IDs as a tuple:
({1, K1}, {001, 002}, 5)
({2, K1}, {005}, 4)
({1, K2}, {002}, 1)
({2, K2}, {003, 004}, 11)
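To pin down the result I am after, here is a minimal sketch in plain Python (not Pig). The names `rows` and `group_feed` are made up for illustration, and I am assuming the IDs should stay in input order within each group:

```python
from collections import defaultdict

# The sample feed from above as (Hour, Key, ID, Value) records.
rows = [
    ("1", "K1", "001", 3),
    ("1", "K1", "002", 2),
    ("2", "K1", "005", 4),
    ("1", "K2", "002", 1),
    ("2", "K2", "003", 5),
    ("2", "K2", "004", 6),
]

def group_feed(rows):
    """Group by (Hour, Key); collect the IDs and sum Value per group."""
    ids = defaultdict(list)
    totals = defaultdict(int)
    for hour, key, id_, value in rows:
        ids[(hour, key)].append(id_)
        totals[(hour, key)] += value
    # dicts preserve insertion order, so groups come out in first-seen order.
    return [(group, tuple(ids[group]), totals[group]) for group in ids]

for record in group_feed(rows):
    print(record)
```

Each printed record has the shape ((Hour, Key), (IDs...), sum of Value), which is what I would like the Pig script to produce.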
I know how to use FLATTEN on the group and generate the sum of Value, but I don't know how to output the IDs as a tuple. This is what I have so far:
A = LOAD 'data' AS (Hour:chararray, Key:chararray, ID:chararray, Value:int);
B = GROUP A BY (Hour, Key);
C = FOREACH B GENERATE
    FLATTEN(group) AS (Hour, Key),
    SUM(A.Value) AS Value;
Could you explain how to do this? Much appreciated!