I am running Pig embedded in Java in pseudo-distributed mode, using the Java code and Pig script below.
Versions:
Hadoop version - 2.0.0-cdh4.1.2
Pig version - 0.10.0-cdh4.1.2
Java code:
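import java.util.Properties;

import org.apache.hadoop.conf.Configuration;
import org.apache.pig.ExecType;
import org.apache.pig.PigServer;

public class CallPig {
    public static void main(String[] args) {
        try {
            // Point Pig at the local pseudo-distributed cluster.
            Properties props = new Properties();
            props.setProperty("fs.default.name", "hdfs://localhost:8020");
            props.setProperty("mapred.job.tracker", "hdfs://localhost:8021");

            // Print the default Hadoop configuration picked up from the classpath.
            Configuration conf = new Configuration();
            System.out.println(conf);
            System.out.println(conf.get("fs.default.name"));

            // Run the Pig script in MapReduce mode.
            PigServer pig = new PigServer(ExecType.MAPREDUCE, props);
            // pig.debugOn();
            pig.registerScript("scripts/loadScript.pig");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}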
Pig script:
register '/usr/lib/pig/piggybank.jar';
a = load 'sample.txt' using PigStorage(';') as (fname:chararray, lname:chararray);
b = foreach a generate $0;
store b into 'file';
But I get the following error: pigstats.PigStatsUtil: 0 map reduce job(s) failed!
Here is the complete log output:
13/04/23 14:36:12 WARN conf.Configuration: fs.default.name is deprecated. Instead, use fs.defaultFS
13/04/23 14:36:12 INFO executionengine.HExecutionEngine: Connecting to hadoop file system at: hdfs://localhost:8020
13/04/23 14:36:13 INFO executionengine.HExecutionEngine: Connecting to map-reduce job tracker at: hdfs://localhost:8021
13/04/23 14:36:13 WARN conf.Configuration: fs.default.name is deprecated. Instead, use fs.defaultFS
13/04/23 14:36:13 WARN conf.Configuration: dfs.df.interval is deprecated. Instead, use fs.df.interval
13/04/23 14:36:13 WARN conf.Configuration: dfs.max.objects is deprecated. Instead, use dfs.namenode.max.objects
13/04/23 14:36:13 WARN conf.Configuration: dfs.name.dir.restore is deprecated. Instead, use dfs.namenode.name.dir.restore
13/04/23 14:36:13 WARN conf.Configuration: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
13/04/23 14:36:13 WARN conf.Configuration: dfs.https.client.keystore.resource is deprecated. Instead, use dfs.client.https.keystore.resource
13/04/23 14:36:13 WARN conf.Configuration: dfs.backup.address is deprecated. Instead, use dfs.namenode.backup.address
13/04/23 14:36:13 WARN conf.Configuration: dfs.backup.http.address is deprecated. Instead, use dfs.namenode.backup.http-address
13/04/23 14:36:13 WARN conf.Configuration: dfs.data.dir is deprecated. Instead, use dfs.datanode.data.dir
13/04/23 14:36:13 WARN conf.Configuration: dfs.name.dir is deprecated. Instead, use dfs.namenode.name.dir
13/04/23 14:36:13 WARN conf.Configuration: dfs.permissions is deprecated. Instead, use dfs.permissions.enabled
13/04/23 14:36:13 WARN conf.Configuration: dfs.safemode.extension is deprecated. Instead, use dfs.namenode.safemode.extension
13/04/23 14:36:13 WARN conf.Configuration: dfs.datanode.max.xcievers is deprecated. Instead, use dfs.datanode.max.transfer.threads
13/04/23 14:36:13 WARN conf.Configuration: fs.default.name is deprecated. Instead, use fs.defaultFS
13/04/23 14:36:13 WARN conf.Configuration: fs.checkpoint.dir is deprecated. Instead, use dfs.namenode.checkpoint.dir
13/04/23 14:36:13 WARN conf.Configuration: dfs.https.need.client.auth is deprecated. Instead, use dfs.client.https.need-auth
13/04/23 14:36:13 WARN conf.Configuration: dfs.block.size is deprecated. Instead, use dfs.blocksize
13/04/23 14:36:13 WARN conf.Configuration: dfs.access.time.precision is deprecated. Instead, use dfs.namenode.accesstime.precision
13/04/23 14:36:13 WARN conf.Configuration: dfs.https.address is deprecated. Instead, use dfs.namenode.https-address
13/04/23 14:36:13 WARN conf.Configuration: dfs.replication.interval is deprecated. Instead, use dfs.namenode.replication.interval
13/04/23 14:36:13 WARN conf.Configuration: fs.checkpoint.edits.dir is deprecated. Instead, use dfs.namenode.checkpoint.edits.dir
13/04/23 14:36:13 WARN conf.Configuration: dfs.replication.min is deprecated. Instead, use dfs.namenode.replication.min
13/04/23 14:36:13 WARN conf.Configuration: dfs.write.packet.size is deprecated. Instead, use dfs.client-write-packet-size
13/04/23 14:36:13 WARN conf.Configuration: dfs.name.edits.dir is deprecated. Instead, use dfs.namenode.edits.dir
13/04/23 14:36:13 WARN conf.Configuration: dfs.replication.considerLoad is deprecated. Instead, use dfs.namenode.replication.considerLoad
13/04/23 14:36:13 WARN conf.Configuration: dfs.balance.bandwidthPerSec is deprecated. Instead, use dfs.datanode.balance.bandwidthPerSec
13/04/23 14:36:13 WARN conf.Configuration: dfs.permissions.supergroup is deprecated. Instead, use dfs.permissions.superusergroup
13/04/23 14:36:13 WARN conf.Configuration: dfs.safemode.threshold.pct is deprecated. Instead, use dfs.namenode.safemode.threshold-pct
13/04/23 14:36:13 WARN conf.Configuration: topology.script.number.args is deprecated. Instead, use net.topology.script.number.args
13/04/23 14:36:13 WARN conf.Configuration: dfs.secondary.http.address is deprecated. Instead, use dfs.namenode.secondary.http-address
13/04/23 14:36:13 WARN conf.Configuration: dfs.http.address is deprecated. Instead, use dfs.namenode.http-address
13/04/23 14:36:13 WARN conf.Configuration: fs.checkpoint.period is deprecated. Instead, use dfs.namenode.checkpoint.period
13/04/23 14:36:13 WARN conf.Configuration: topology.node.switch.mapping.impl is deprecated. Instead, use net.topology.node.switch.mapping.impl
13/04/23 14:36:13 WARN conf.Configuration: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
13/04/23 14:36:14 INFO pigstats.ScriptState: Pig features used in the script: UNKNOWN
13/04/23 14:36:14 INFO rules.ColumnPruneVisitor: Columns pruned for a: $1
13/04/23 14:36:14 INFO mapReduceLayer.MRCompiler: File concatenation threshold: 100 optimistic? false
13/04/23 14:36:14 INFO mapReduceLayer.MultiQueryOptimizer: MR plan size before optimization: 1
13/04/23 14:36:14 INFO mapReduceLayer.MultiQueryOptimizer: MR plan size after optimization: 1
13/04/23 14:36:14 INFO pigstats.ScriptState: Pig script settings are added to the job
13/04/23 14:36:14 INFO mapReduceLayer.JobControlCompiler: mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
13/04/23 14:36:14 INFO mapReduceLayer.JobControlCompiler: creating jar file Job9023414129086454587.jar
13/04/23 14:36:16 INFO mapReduceLayer.JobControlCompiler: jar file Job9023414129086454587.jar created
13/04/23 14:36:16 INFO mapReduceLayer.JobControlCompiler: Setting up single store job
13/04/23 14:36:16 INFO mapReduceLayer.MapReduceLauncher: 1 map-reduce job(s) waiting for submission.
13/04/23 14:36:16 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/04/23 14:36:17 WARN conf.Configuration: fs.default.name is deprecated. Instead, use fs.defaultFS
13/04/23 14:36:17 WARN conf.Configuration: dfs.safemode.extension is deprecated. Instead, use dfs.namenode.safemode.extension
13/04/23 14:36:17 WARN conf.Configuration: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
13/04/23 14:36:17 INFO input.FileInputFormat: Total input paths to process : 1
13/04/23 14:36:17 INFO util.MapRedUtil: Total input paths to process : 1
13/04/23 14:36:17 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
13/04/23 14:36:17 INFO util.MapRedUtil: Total input paths (combined) to process : 1
13/04/23 14:36:17 INFO mapReduceLayer.MapReduceLauncher: 0% complete
13/04/23 14:36:17 INFO mapred.JobClient: Cleaning up the staging area hdfs://localhost:8020/var/lib/hadoop-hdfs/cache/mapred/mapred/staging/hduser/.staging/job_201304231219_0009
13/04/23 14:36:17 INFO mapReduceLayer.MapReduceLauncher: 100% complete
13/04/23 14:36:17 ERROR pigstats.PigStatsUtil: 0 map reduce job(s) failed!
13/04/23 14:36:17 INFO pigstats.SimplePigStats: Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
2.0.0-cdh4.1.2 0.10.0-cdh4.1.2 hduser 2013-04-23 14:36:14 2013-04-23 14:36:17 UNKNOWN
Failed!
Failed Jobs:
JobId Alias Feature Message Outputs
Input(s):
Output(s):
Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
null
I don't know where I went wrong.
Thanks in advance.