I'm trying to get the most-visited IPs from some web logs.

Sample input:

323.81.303.680 - - [25/Oct/2011:01:41:00 -0500] "GET /download/download6.zip HTTP/1.1" 200 0 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.19) Gecko/2010031422 Firefox/3.0.19"
668.667.44.3 - - [25/Oct/2011:07:38:30 -0500] "GET /download/download3.zip HTTP/1.1" 200 0 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.0.12) Gecko/20070719 CentOS/1.5.0.12-3.el5.centos Firefox/1.5.0.12"
13.386.648.380 - - [25/Oct/2011:17:06:00 -0500] "GET /download/download6.zip HTTP/1.1" 200 0 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB6.3; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30; InfoPath.2)"
06.670.03.40 - - [26/Oct/2011:13:24:00 -0500] "GET /product/demos/product2 HTTP/1.1" 200 0 "-" "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3"

Script:

D = LOAD 'weblogs_rebuild.txt' USING PigStorage(' ') as 
    (client_ip: chararray,
     indents1: chararray,...
    );
F = Group D by client_ip;
C = foreach F generate COUNT(D) AS count, group;
A = ORDER C by count DESC;

Everything in my script seems fine; dumping C gives me output like this:

(2,688.644.363.338)
(27,688.645.642.675)
(11,688.646.612.331)

And calling describe gives me this:

grunt> describe A
A: {count: long,group: chararray}
grunt> describe C
C: {count: long,group: chararray}

But when I dump A, I get an error:

2013-07-30 15:53:40,434 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias A

Here is the relevant part of the log:

Pig Stack Trace

ERROR 1066: Unable to open iterator for alias A

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias A
    at org.apache.pig.PigServer.openIterator(PigServer.java:836)
    at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:696)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:320)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:194)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
    at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
    at org.apache.pig.Main.run(Main.java:538)
    at org.apache.pig.Main.main(Main.java:157)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Caused by: java.io.IOException: Job terminated with anomalous status FAILED
    at org.apache.pig.PigServer.openIterator(PigServer.java:828)
    ... 12 more

My Pig version is 0.11.1.
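
For reference, the result I expect in A is just a per-IP request count sorted from highest to lowest. Here is a minimal Python sketch of that computation, assuming the client IP is always the first space-separated field. The `top_ips` helper and the three shortened sample lines are made up for illustration and are not part of my Pig script:

```python
from collections import Counter

# Hypothetical, shortened log lines in the same layout as the sample input
# (the client IP is the first space-separated field).
sample_lines = [
    '323.81.303.680 - - [25/Oct/2011:01:41:00 -0500] "GET /download/download6.zip HTTP/1.1" 200 0',
    '668.667.44.3 - - [25/Oct/2011:07:38:30 -0500] "GET /download/download3.zip HTTP/1.1" 200 0',
    '323.81.303.680 - - [25/Oct/2011:09:12:10 -0500] "GET /download/download3.zip HTTP/1.1" 200 0',
]


def top_ips(lines):
    """Return (count, ip) pairs, highest count first (mirrors GROUP, COUNT, ORDER ... DESC)."""
    # Take the first field of each non-empty line and count occurrences per IP.
    counts = Counter(line.split(" ", 1)[0] for line in lines if line.strip())
    # Sort by count descending; ties are broken by IP so the order is deterministic.
    return sorted(((n, ip) for ip, n in counts.items()), key=lambda pair: (-pair[0], pair[1]))


result = top_ips(sample_lines)
for count, ip in result:
    print((count, ip))
```

The tuples are printed as (count, ip) to match the column order of C and A in the script.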
