
I am getting errors from the AVG function. Can anyone please help with the following script? (Do I need to use a tuple or a bag while loading?) Thanks.

mydata = LOAD 'bigdata.txt' USING PigStorage(',')  AS (stn , wban, yearmoda,   temp, a ,  dewp :double, b ,  slp :double,  c,  stp :double, d, visib :double, e,  wdsp :double,  f, mxspd :double,  gust :double,   max :double,  min :double, prcp :double, sndp :double, frshtt);

clean1 = FOREACH mydata GENERATE stn , wban, yearmoda,   temp, a ,  dewp, b ,  slp,  c,  stp, d, visib, e,  wdsp,  f, mxspd,  gust,   max ,  min, prcp  ,sndp , frshtt;

--clean2 = FILTER clean1 BY (temp == 9999.9);

tmpdata = FOREACH clean1 GENERATE stn, SUBSTRING(yearmoda, 0, 5) as year, temp;
C = GROUP tmpdata BY (year, temp);

avgtemp = FOREACH C GENERATE group, AVG(temp);

1 Answer


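You did not specify a type for temp when you LOAD the data. As a result, when Pig tries to call the AVG function and checks which version to use (for example, it must behave differently if the field is an int rather than a double), it cannot tell how to proceed. Give temp a type in your LOAD statement (such as temp:int) and it should work.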

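In your case, you have also not specified the field correctly. You need to pass AVG a bag to evaluate. You can do this by projecting the temp field out of the tmpdata bag in C. The schema of C is {group: (year, temp), tmpdata: {(stn, year: chararray, temp)}}, so you need to compute avgtemp like this: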

avgtemp = FOREACH C GENERATE group, AVG(tmpdata.temp);
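
Putting both fixes together, the script might look like the sketch below. It is untested against the actual data, and several choices are assumptions rather than part of the original script: temp is declared as double (to match the other numeric columns), yearmoda is declared as chararray, the SUBSTRING end index is changed from 5 to 4 on the assumption that yearmoda looks like YYYYMMDD, and the grouping is by year only (grouping by (year, temp) would average identical values within each group).

```pig
-- Assumption: temp is a double and yearmoda is a chararray; all other columns are as in the question.
mydata = LOAD 'bigdata.txt' USING PigStorage(',')
         AS (stn, wban, yearmoda:chararray, temp:double, a, dewp:double, b, slp:double,
             c, stp:double, d, visib:double, e, wdsp:double, f, mxspd:double,
             gust:double, max:double, min:double, prcp:double, sndp:double, frshtt);

-- SUBSTRING's end index is exclusive, so (0, 4) keeps the first four characters.
tmpdata = FOREACH mydata GENERATE stn, SUBSTRING(yearmoda, 0, 4) AS year, temp;

-- Assumption: one average per year is wanted, so group by year alone.
C = GROUP tmpdata BY year;

-- Inspect the schema to confirm that tmpdata is a bag inside each group.
DESCRIBE C;

-- AVG takes a bag; tmpdata.temp projects the temp column out of the grouped bag.
avgtemp = FOREACH C GENERATE group AS year, AVG(tmpdata.temp) AS avg_temp;
DUMP avgtemp;
```

If the original grouping by (year, temp) is really what you want, keep your GROUP statement and change only the AVG argument as shown above.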
Answered 2013-06-27T19:16:01.633