
I am working through this tutorial and got to the very last part (with some small changes). Now I am stuck with an error message I can't make sense of.

damian@damian-ThinkPad-T61:~/hadoop-1.1.2$ bin/hadoop pipes -D hadoop.pipes.java.recordreader=true -D hadoop.pipes.java.recordwriter=true -input dft1 -output dft1-out -program bin/word_count

13/06/09 20:17:01 INFO util.NativeCodeLoader: Loaded the native-hadoop library
13/06/09 20:17:01 WARN mapred.JobClient: No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
13/06/09 20:17:01 WARN snappy.LoadSnappy: Snappy native library not loaded
13/06/09 20:17:01 INFO mapred.FileInputFormat: Total input paths to process : 1
13/06/09 20:17:02 INFO filecache.TrackerDistributedCacheManager: Creating word_count in /tmp/hadoop-damian/mapred/local/archive/7642618178782392982_1522484642_696507214/filebin-work-1867423021697266227 with rwxr-xr-x
13/06/09 20:17:02 INFO filecache.TrackerDistributedCacheManager: Cached bin/word_count as /tmp/hadoop-damian/mapred/local/archive/7642618178782392982_1522484642_696507214/filebin/word_count
13/06/09 20:17:02 INFO filecache.TrackerDistributedCacheManager: Cached bin/word_count as /tmp/hadoop-damian/mapred/local/archive/7642618178782392982_1522484642_696507214/filebin/word_count
13/06/09 20:17:02 INFO mapred.JobClient: Running job: job_local_0001
13/06/09 20:17:02 INFO util.ProcessTree: setsid exited with exit code 0
13/06/09 20:17:02 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@4200d3
13/06/09 20:17:02 INFO mapred.MapTask: numReduceTasks: 1
13/06/09 20:17:02 INFO mapred.MapTask: io.sort.mb = 100
13/06/09 20:17:02 INFO mapred.MapTask: data buffer = 79691776/99614720
13/06/09 20:17:02 INFO mapred.MapTask: record buffer = 262144/327680
13/06/09 20:17:02 WARN mapred.LocalJobRunner: job_local_0001
java.lang.NullPointerException
    at org.apache.hadoop.mapred.pipes.Application.<init>(Application.java:103)
    at org.apache.hadoop.mapred.pipes.PipesMapRunner.run(PipesMapRunner.java:68)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:214)
13/06/09 20:17:03 INFO mapred.JobClient:  map 0% reduce 0%
13/06/09 20:17:03 INFO mapred.JobClient: Job complete: job_local_0001
13/06/09 20:17:03 INFO mapred.JobClient: Counters: 0
13/06/09 20:17:03 INFO mapred.JobClient: Job Failed: NA
Exception in thread "main" java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1327)
    at org.apache.hadoop.mapred.pipes.Submitter.runJob(Submitter.java:248)
    at org.apache.hadoop.mapred.pipes.Submitter.run(Submitter.java:479)
    at org.apache.hadoop.mapred.pipes.Submitter.main(Submitter.java:494)

Does anyone see where the error hides? What is a straightforward way to debug Hadoop Pipes programs?

Thanks!


2 Answers


It could be that your cluster is running in local mode. Do you have the following property in your mapred-site.xml file?

 <property>
   <name>mapreduce.framework.name</name>
   <value>yarn</value>
   <description>
    Let the MapReduce jobs run with the yarn framework.
   </description>
 </property>

If you don't have this property, your cluster runs in local mode by default. I ran into exactly the same problem in local mode. After adding this property the cluster runs in distributed mode, and the problem goes away.
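Note that mapreduce.framework.name is a Hadoop 2.x/YARN property; for a Hadoop 1.x install such as the hadoop-1.1.2 in the question, the analogous setting is the classic mapred.job.tracker property, which defaults to local. A minimal sketch of the distributed setting, assuming a JobTracker listening at localhost:9001 (host and port are placeholders for your own setup):

 <!-- Hypothetical mapred-site.xml entry for Hadoop 1.x; localhost:9001 is a placeholder. -->
 <property>
   <name>mapred.job.tracker</name>
   <value>localhost:9001</value>
   <description>
    Point MapReduce at a real JobTracker instead of the default local runner.
   </description>
 </property>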

Hope this helps,

Shumin

Answered 2014-03-03T23:43:57.963

The exception at:

at org.apache.hadoop.mapred.pipes.Application.<init>(Application.java:103)

is caused by the following lines in the source:

//Add token to the environment if security is enabled
Token<JobTokenIdentifier> jobToken = TokenCache.getJobToken(conf
    .getCredentials());
// This password is used as shared secret key between this application and
// child pipes process
byte[]  password = jobToken.getPassword();

The actual NPE is thrown on the last line because jobToken is null.

Since you're running in local mode (local job tracker and local file system), I'm not sure whether security should even be "enabled". Have you configured either of the following properties in your core-site.xml or hdfs-site.xml configuration files (and if so, what are their values)? A sketch of typical entries follows the list:

  • hadoop.security.authentication
  • hadoop.security.authorization
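
For reference, a minimal sketch of how those properties usually appear in core-site.xml when security is switched off; the values shown are the common insecure defaults, not values taken from your setup:

 <!-- Illustrative core-site.xml entries; 'simple' and 'false' are the defaults with security off. -->
 <property>
   <name>hadoop.security.authentication</name>
   <value>simple</value>
 </property>
 <property>
   <name>hadoop.security.authorization</name>
   <value>false</value>
 </property>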
Answered 2013-06-09T21:36:55.373