I wrote a Pig script that works fine. I can see the results of the Pig script in the HDFS output directory. But near the end of my console output, I see the following:
Job Stats (time in seconds):
JobId Maps Reduces MaxMapTime MinMapTIme AvgMapTime MedianMapTime MaxReduceTime MinReduceTime AvgReduceTime MedianReducetime Alias Feature Outputs
job_local1695568121_0002 1 1 0 0 0 0 0 0 0 0 words_sorted SAMPLER
job_local2103470491_0003 1 1 0 0 0 0 0 0 0 0 words_sorted ORDER_BY /output/result_pig,
job_local696057848_0001 1 1 0 0 0 0 0 0 0 0 book,words,words_agg,words_grouped GROUP_BY,COMBINER
Successfully read 0 records from: "/data/pg5000.txt"
Successfully stored 0 records in: "/output/result_pig"
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_local696057848_0001 -> job_local1695568121_0002,
job_local1695568121_0002 -> job_local2103470491_0003,
2014-07-01 14:10:35,241 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
Both the input(s) and the output(s) say "Successfully read/stored 0 records", even though the output directory does contain results.
I am using Hadoop 2.2 and Pig 0.12. Here is the script:
book = load '/data/pg5000.txt' using PigStorage() as (lines:chararray);
words = foreach book generate FLATTEN(TOKENIZE(lines)) as word;
words_grouped = group words by word;
words_agg = foreach words_grouped generate group as word, COUNT(words);
words_sorted = ORDER words_agg BY $1 DESC;
STORE words_sorted into '/output/result_pig' using PigStorage(':','-schema');
hadoop fs -cat /data/pg5000.txt | head -10
The Project Gutenberg EBook of The Notebooks of Leonardo Da Vinci, Complete
by Leonardo Da Vinci
(#3 in our series by Leonardo Da Vinci)
Copyright laws are changing all over the world. Be sure to check the
copyright laws for your country before downloading or redistributing
this or any other Project Gutenberg eBook.
This header should be the first thing seen when viewing this Project
Gutenberg file. Please do not remove it. Do not change or edit the
cat: Unable to write to output stream.
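To cross-check the record count independently of the job statistics, the pipeline in the script above can be approximated in plain Python. This is only a rough sketch, not what Pig runs internally: the function names `word_count` and `format_rows` are hypothetical, the delimiter set is an assumption based on the documented defaults of `TOKENIZE` (space, double quote, comma, parentheses, asterisk), and the tab handling assumes that `PigStorage()` with a one-field schema keeps only the first tab-separated field.

```python
import re
from collections import Counter

# Assumption: mirrors the default delimiters of Pig's TOKENIZE
# (space, double quote, comma, parentheses, asterisk).
_DELIMS = re.compile(r'[ ",()*]+')


def word_count(lines):
    """Approximate load -> TOKENIZE -> group -> COUNT -> ORDER BY $1 DESC."""
    counts = Counter()
    for line in lines:
        # Assumption: PigStorage() splits on tabs, and the one-field
        # schema (lines:chararray) keeps only the first field.
        first_field = line.rstrip("\n").split("\t", 1)[0]
        # Drop the empty strings produced by leading/trailing delimiters.
        tokens = [t for t in _DELIMS.split(first_field) if t]
        counts.update(tokens)
    # Sort by count, highest first; ties keep first-seen order.
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)


def format_rows(rows, sep=":"):
    """Render rows the way PigStorage(':') would write them: word:count."""
    return [f"{word}{sep}{count}" for word, count in rows]


if __name__ == "__main__":
    # Two lines taken from the head of /data/pg5000.txt shown above.
    sample = [
        "by Leonardo Da Vinci",
        "(#3 in our series by Leonardo Da Vinci)",
    ]
    rows = word_count(sample)
    for row in format_rows(rows):
        print(row)
    print("records:", len(rows))
```

`word_count` accepts any iterable of lines, so it could also be fed from `sys.stdin` or an open file. If the number of rows it returns roughly matches the number of lines in the part files under /output/result_pig, the job really did write records, and the zeros in the statistics are a reporting problem rather than a data problem.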