I am very new to Hadoop. My Hadoop version is 3.1.1 and my Pig version is 0.17.0.
Everything works as expected when I run this script in local mode:
pig -x local
grunt> student = LOAD '/home/ubuntu/sharif_data/student.txt' USING PigStorage(',') as ( id:int, firstname:chararray, lastname:chararray, phone:chararray, city:chararray );
grunt> DUMP student;
But for the same input file and Pig script, mapreduce mode does not work:
pig -x mapreduce
grunt> student = LOAD '/pig_data/student.txt' USING PigStorage(',') AS ( id:int, firstname:chararray, lastname:chararray, phone:chararray, city:chararray );
grunt> STORE student INTO '/pig_data/student_out' USING PigStorage(',');
Or, with an explicit HDFS URI:
grunt> student = LOAD 'hdfs://NND1:9000/pig_data/student.txt' USING PigStorage(',') AS ( id:int, firstname:chararray, lastname:chararray, phone:chararray, city:chararray );
grunt> STORE student INTO 'hdfs://NND1:9000/pig_data/student_out' USING PigStorage(',');
Both variants fail in mapreduce mode. Note: student.txt was uploaded to HDFS successfully:
hdfs dfs -ls /pig_data
Found 2 items
-rw-r--r-- 3 ubuntu supergroup 861585 2019-07-12 00:55 /pig_data/en.sahih.txt
-rw-r--r-- 3 ubuntu supergroup 234 2019-07-12 12:25 /pig_data/student.txt
Even from the Grunt shell, this command correctly prints the file's contents:
grunt> fs -cat /pig_data/student.txt
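For reference, these are the client-side checks I can run to confirm which NameNode the configuration points at (assuming HADOOP_CONF_DIR is set to the cluster's configuration directory, as Pig reads the cluster settings from there in mapreduce mode):

```shell
# Print the default filesystem the Hadoop client config resolves to;
# I expect this to match hdfs://NND1:9000
hdfs getconf -confKey fs.defaultFS

# Pig locates the cluster configuration via these variables
echo "$HADOOP_HOME"
echo "$HADOOP_CONF_DIR"
```

If fs.defaultFS resolved to something other than hdfs://NND1:9000, that would explain a path existing in HDFS but Pig failing to read it.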
- Why does it say it failed to read the data when the file exists at that path?
- What possible reasons could I be missing?
Any help is appreciated.