
I am running Java-embedded Pig in pseudo-distributed mode, using the Java code and Pig script below.

Versions:

Hadoop version - 2.0.0-cdh4.1.2

Pig version - 0.10.0-cdh4.1.2

Java code:

import java.util.Properties;

import org.apache.hadoop.conf.Configuration;
import org.apache.pig.ExecType;
import org.apache.pig.PigServer;

public class CallPig {

    public static void main(String[] args) {
        try {
            // Point Pig at the pseudo-distributed NameNode and JobTracker.
            Properties props = new Properties();
            props.setProperty("fs.default.name", "hdfs://localhost:8020");
            props.setProperty("mapred.job.tracker", "hdfs://localhost:8021");

            // Print the default Hadoop configuration for debugging.
            Configuration conf = new Configuration();
            System.out.println(conf);
            System.out.println(conf.get("fs.default.name"));

            // registerScript executes the script, including its STORE statement.
            PigServer pig = new PigServer(ExecType.MAPREDUCE, props);
            // pig.debugOn();
            pig.registerScript("scripts/loadScript.pig");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
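
As far as I can tell from the Hadoop documentation, fs.default.name takes an hdfs:// URI while mapred.job.tracker takes a plain host:port pair, so a variant worth trying might be:

            // Conventional MRv1 connection settings (plain host:port for the JobTracker).
            Properties props = new Properties();
            props.setProperty("fs.default.name", "hdfs://localhost:8020");
            props.setProperty("mapred.job.tracker", "localhost:8021");
            PigServer pig = new PigServer(ExecType.MAPREDUCE, props);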

Pig script:

register '/usr/lib/pig/piggybank.jar';

a = load 'sample.txt' using PigStorage(';') as (fname:chararray, lname:chararray);

b = foreach a generate $0;

store b into 'file';
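
To check whether the script file itself is the problem, the same statements can also be issued directly through the PigServer API (just a sketch, using the same paths as the script above):

            PigServer pig = new PigServer(ExecType.MAPREDUCE, props);
            pig.registerJar("/usr/lib/pig/piggybank.jar");
            pig.registerQuery("a = load 'sample.txt' using PigStorage(';') as (fname:chararray, lname:chararray);");
            pig.registerQuery("b = foreach a generate $0;");
            // store() submits the MapReduce job, like the STORE statement in the script.
            pig.store("b", "file");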

But it fails with the error pigstats.PigStatsUtil: 0 map reduce job(s) failed!

Here is the complete log output:

13/04/23 14:36:12 WARN conf.Configuration: fs.default.name is deprecated. Instead, use fs.defaultFS

13/04/23 14:36:12 INFO executionengine.HExecutionEngine: Connecting to hadoop file system at: hdfs://localhost:8020

13/04/23 14:36:13 INFO executionengine.HExecutionEngine: Connecting to map-reduce job tracker at: hdfs://localhost:8021

13/04/23 14:36:13 WARN conf.Configuration: fs.default.name is deprecated. Instead, use fs.defaultFS

13/04/23 14:36:13 WARN conf.Configuration: dfs.df.interval is deprecated. Instead, use fs.df.interval

13/04/23 14:36:13 WARN conf.Configuration: dfs.max.objects is deprecated. Instead, use dfs.namenode.max.objects

13/04/23 14:36:13 WARN conf.Configuration: dfs.name.dir.restore is deprecated. Instead, use dfs.namenode.name.dir.restore

13/04/23 14:36:13 WARN conf.Configuration: hadoop.native.lib is deprecated. Instead, use io.native.lib.available

13/04/23 14:36:13 WARN conf.Configuration: dfs.https.client.keystore.resource is deprecated. Instead, use dfs.client.https.keystore.resource

13/04/23 14:36:13 WARN conf.Configuration: dfs.backup.address is deprecated. Instead, use dfs.namenode.backup.address

13/04/23 14:36:13 WARN conf.Configuration: dfs.backup.http.address is deprecated. Instead, use dfs.namenode.backup.http-address

13/04/23 14:36:13 WARN conf.Configuration: dfs.data.dir is deprecated. Instead, use dfs.datanode.data.dir

13/04/23 14:36:13 WARN conf.Configuration: dfs.name.dir is deprecated. Instead, use dfs.namenode.name.dir

13/04/23 14:36:13 WARN conf.Configuration: dfs.permissions is deprecated. Instead, use dfs.permissions.enabled

13/04/23 14:36:13 WARN conf.Configuration: dfs.safemode.extension is deprecated. Instead, use dfs.namenode.safemode.extension

13/04/23 14:36:13 WARN conf.Configuration: dfs.datanode.max.xcievers is deprecated. Instead, use dfs.datanode.max.transfer.threads

13/04/23 14:36:13 WARN conf.Configuration: fs.default.name is deprecated. Instead, use fs.defaultFS

13/04/23 14:36:13 WARN conf.Configuration: fs.checkpoint.dir is deprecated. Instead, use dfs.namenode.checkpoint.dir

13/04/23 14:36:13 WARN conf.Configuration: dfs.https.need.client.auth is deprecated. Instead, use dfs.client.https.need-auth

13/04/23 14:36:13 WARN conf.Configuration: dfs.block.size is deprecated. Instead, use dfs.blocksize

13/04/23 14:36:13 WARN conf.Configuration: dfs.access.time.precision is deprecated. Instead, use dfs.namenode.accesstime.precision

13/04/23 14:36:13 WARN conf.Configuration: dfs.https.address is deprecated. Instead, use dfs.namenode.https-address

13/04/23 14:36:13 WARN conf.Configuration: dfs.replication.interval is deprecated. Instead, use dfs.namenode.replication.interval

13/04/23 14:36:13 WARN conf.Configuration: fs.checkpoint.edits.dir is deprecated. Instead, use dfs.namenode.checkpoint.edits.dir

13/04/23 14:36:13 WARN conf.Configuration: dfs.replication.min is deprecated. Instead, use dfs.namenode.replication.min

13/04/23 14:36:13 WARN conf.Configuration: dfs.write.packet.size is deprecated. Instead, use dfs.client-write-packet-size

13/04/23 14:36:13 WARN conf.Configuration: dfs.name.edits.dir is deprecated. Instead, use dfs.namenode.edits.dir

13/04/23 14:36:13 WARN conf.Configuration: dfs.replication.considerLoad is deprecated. Instead, use dfs.namenode.replication.considerLoad

13/04/23 14:36:13 WARN conf.Configuration: dfs.balance.bandwidthPerSec is deprecated. Instead, use dfs.datanode.balance.bandwidthPerSec

13/04/23 14:36:13 WARN conf.Configuration: dfs.permissions.supergroup is deprecated. Instead, use dfs.permissions.superusergroup

13/04/23 14:36:13 WARN conf.Configuration: dfs.safemode.threshold.pct is deprecated. Instead, use dfs.namenode.safemode.threshold-pct

13/04/23 14:36:13 WARN conf.Configuration: topology.script.number.args is deprecated. Instead, use net.topology.script.number.args

13/04/23 14:36:13 WARN conf.Configuration: dfs.secondary.http.address is deprecated. Instead, use dfs.namenode.secondary.http-address

13/04/23 14:36:13 WARN conf.Configuration: dfs.http.address is deprecated. Instead, use dfs.namenode.http-address

13/04/23 14:36:13 WARN conf.Configuration: fs.checkpoint.period is deprecated. Instead, use dfs.namenode.checkpoint.period

13/04/23 14:36:13 WARN conf.Configuration: topology.node.switch.mapping.impl is deprecated. Instead, use net.topology.node.switch.mapping.impl

13/04/23 14:36:13 WARN conf.Configuration: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum

13/04/23 14:36:14 INFO pigstats.ScriptState: Pig features used in the script: UNKNOWN

13/04/23 14:36:14 INFO rules.ColumnPruneVisitor: Columns pruned for a: $1

13/04/23 14:36:14 INFO mapReduceLayer.MRCompiler: File concatenation threshold: 100 optimistic? false

13/04/23 14:36:14 INFO mapReduceLayer.MultiQueryOptimizer: MR plan size before optimization: 1

13/04/23 14:36:14 INFO mapReduceLayer.MultiQueryOptimizer: MR plan size after optimization: 1

13/04/23 14:36:14 INFO pigstats.ScriptState: Pig script settings are added to the job

13/04/23 14:36:14 INFO mapReduceLayer.JobControlCompiler: mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3

13/04/23 14:36:14 INFO mapReduceLayer.JobControlCompiler: creating jar file Job9023414129086454587.jar

13/04/23 14:36:16 INFO mapReduceLayer.JobControlCompiler: jar file Job9023414129086454587.jar created

13/04/23 14:36:16 INFO mapReduceLayer.JobControlCompiler: Setting up single store job

13/04/23 14:36:16 INFO mapReduceLayer.MapReduceLauncher: 1 map-reduce job(s) waiting for submission.

13/04/23 14:36:16 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.

13/04/23 14:36:17 WARN conf.Configuration: fs.default.name is deprecated. Instead, use fs.defaultFS

13/04/23 14:36:17 WARN conf.Configuration: dfs.safemode.extension is deprecated. Instead, use dfs.namenode.safemode.extension

13/04/23 14:36:17 WARN conf.Configuration: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum

13/04/23 14:36:17 INFO input.FileInputFormat: Total input paths to process : 1

13/04/23 14:36:17 INFO util.MapRedUtil: Total input paths to process : 1

13/04/23 14:36:17 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

13/04/23 14:36:17 INFO util.MapRedUtil: Total input paths (combined) to process : 1

13/04/23 14:36:17 INFO mapReduceLayer.MapReduceLauncher: 0% complete

13/04/23 14:36:17 INFO mapred.JobClient: Cleaning up the staging area hdfs://localhost:8020/var/lib/hadoop-hdfs/cache/mapred/mapred/staging/hduser/.staging/job_201304231219_0009

13/04/23 14:36:17 INFO mapReduceLayer.MapReduceLauncher: 100% complete

13/04/23 14:36:17 ERROR pigstats.PigStatsUtil: 0 map reduce job(s) failed!

13/04/23 14:36:17 INFO pigstats.SimplePigStats: Script Statistics: 

HadoopVersion   PigVersion  UserId  StartedAt   FinishedAt  Features
2.0.0-cdh4.1.2  0.10.0-cdh4.1.2 hduser  2013-04-23 14:36:14 2013-04-23 14:36:17 UNKNOWN

Failed!

Failed Jobs:
JobId   Alias   Feature Message Outputs

Input(s):

Output(s):

Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
null

I can't figure out where I am going wrong.

Thanks in advance.


1 Answer


Solved the "Unable to load native-hadoop library" problem by adding /hadoop-0.20-mapreduce/lib/native/Linux-amd64-64 as a source rather than as a jar.
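
In case it helps anyone else: the directory has to end up on the JVM's native library search path. A quick way to check what the running JVM actually sees (just a sketch; the path is the one from this setup and will differ elsewhere):

            // The Hadoop native libs directory (here /hadoop-0.20-mapreduce/lib/native/Linux-amd64-64)
            // must appear in this output, otherwise NativeCodeLoader falls back to the
            // built-in Java classes and logs the warning above.
            System.out.println(System.getProperty("java.library.path"));

When launching outside an IDE, the usual equivalent is passing -Djava.library.path=<native lib dir> to the JVM.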

Answered 2013-04-30T07:35:47.763