For example, I have an input file like this:

xxx,14
yyy,20
zzz,11

I want to sum the second field and output the result. I already know how to compute the sum with Hadoop Pig, but I want the output to look like this:

Canada,45

In other words, I want to set the key name to "Canada" myself and use the sum as the value. How can I set the key name myself?

1 Answer

Just apply a constant field:

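-- Load the comma-separated input as (name, number) pairs.
A = load 'data.txt' using PigStorage(',') as (txt:chararray, num:int);
-- Put every row into a single group so SUM runs over the whole file.
B = group A ALL;
-- Emit the constant key next to the total; SUM over an int column yields a long.
C = foreach B generate 'Canada' as country:chararray, SUM(A.num) as total:long;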
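
To get the comma-separated output shown in the question, the relation C from the script above can be written out with the same delimiter. This is a minimal sketch: the output directory name 'output' is an assumption chosen for illustration, and it must not already exist when the script runs.

```pig
-- Print the result to the console to check it; for the sample input this shows (Canada,45).
dump C;

-- Write the result as comma-separated text.
-- 'output' is a hypothetical directory name; the part file inside it contains: Canada,45
store C into 'output' using PigStorage(',');
```

Without the `using PigStorage(',')` clause, Pig writes tab-separated fields, so the delimiter has to be given explicitly to match the desired format.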
Answered 2013-08-08T20:55:36.143