
I have installed the Cloudera CDH4 release and I am trying to run a MapReduce job on it. I am getting the following error -->

2012-07-09 15:41:16 ZooKeeperSaslClient [INFO] Client will not SASL-authenticate because the default JAAS configuration section 'Client' could not be found. If you are not using SASL, you may ignore this. On the other hand, if you expected SASL to work, please fix your JAAS configuration.
2012-07-09 15:41:16 ClientCnxn [INFO] Socket connection established to Cloudera/192.168.0.102:2181, initiating session
2012-07-09 15:41:16 RecoverableZooKeeper [WARN] Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master
2012-07-09 15:41:16 RetryCounter [INFO] The 1 times to retry  after sleeping 2000 ms
2012-07-09 15:41:16 ClientCnxn [INFO] Session establishment complete on server Cloudera/192.168.0.102:2181, sessionid = 0x1386b0b44da000b, negotiated timeout = 60000
2012-07-09 15:41:18 TableOutputFormat [INFO] Created table instance for exact_custodian
2012-07-09 15:41:18 NativeCodeLoader [WARN] Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2012-07-09 15:41:18 JobSubmitter [WARN] Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
2012-07-09 15:41:18 JobSubmitter [INFO] Cleaning up the staging area file:/tmp/hadoop-hdfs/mapred/staging/hdfs48876562/.staging/job_local_0001
2012-07-09 15:41:18 UserGroupInformation [ERROR] PriviledgedActionException as:hdfs (auth:SIMPLE) cause:java.io.FileNotFoundException: File does not exist: /home/cloudera/yogesh/lib/hbase.jar
Exception in thread "main" java.io.FileNotFoundException: File does not exist: /home/cloudera/yogesh/lib/hbase.jar
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:736)
    at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:208)
    at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:71)
    at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:246)
    at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:284)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:355)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1226)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1223)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1223)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1244)
    at 

I am able to run the example programs shipped in hadoop-mapreduce-examples-2.0.0-cdh4.0.0.jar, but I get this error when my own job is submitted to the jobtracker. It looks as if the job is trying to access the local file system again: even though I have put all the required libraries for the job into the distributed cache, it still tries to access the local directory. Could this problem be related to user permissions?

I) Cloudera:~ # hadoop fs -ls hdfs://<MyClusterIP>:8020/ shows -

Found 8 items
drwxr-xr-x   - hbase hbase               0 2012-07-04 17:58 hdfs://<MyClusterIP>:8020/hbase
drwxr-xr-x   - hdfs  supergroup          0 2012-07-05 16:21 hdfs://<MyClusterIP>:8020/input
drwxr-xr-x   - hdfs  supergroup          0 2012-07-05 16:21 hdfs://<MyClusterIP>:8020/output
drwxr-xr-x   - hdfs  supergroup          0 2012-07-06 16:03 hdfs://<MyClusterIP>:8020/tools-lib
drwxr-xr-x   - hdfs  supergroup          0 2012-06-26 14:02 hdfs://<MyClusterIP>:8020/test
drwxrwxrwt   - hdfs  supergroup          0 2012-06-12 16:13 hdfs://<MyClusterIP>:8020/tmp
drwxr-xr-x   - hdfs  supergroup          0 2012-07-06 15:58 hdfs://<MyClusterIP>:8020/user

II) The following commands return nothing -

hdfs@Cloudera:/etc/hadoop/conf> find . -name '**' | xargs grep "default.name"
hdfs@Cloudera:/etc/hbase/conf> find . -name '**' | xargs grep "default.name"

Instead, I believe we are using the new APIs ->
fs.defaultFS --> hdfs://Cloudera:8020, which I have set correctly.

Whereas for "fs.default.name" I do get entries on my Hadoop 0.20.2 cluster (a non-Cloudera cluster):

cass-hadoop@Pratapgad:~/hadoop/conf> find . -name '**' | xargs grep "default.name"
./core-default.xml:  <name>fs.default.name</name>
./core-site.xml:  <name>fs.default.name</name>

I think the CDH4 default configuration should add this entry in the respective directory (if this is indeed the bug).
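One quick way to see which value the client actually resolves (a minimal sketch, assuming the CDH4 client configuration directory is on the Java classpath; the class name is only illustrative) is to read both the deprecated key and its replacement from a fresh Configuration:

import org.apache.hadoop.conf.Configuration;

public class ConfCheck {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // fs.default.name is the deprecated 0.20.x key; in Hadoop 2 / CDH4 it maps to fs.defaultFS.
        // Both print file:/// if only the built-in defaults are found on the classpath.
        System.out.println("fs.default.name = " + conf.get("fs.default.name"));
        System.out.println("fs.defaultFS    = " + conf.get("fs.defaultFS"));
    }
}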

The command I use to run my program -

hdfs@Cloudera:/home/cloudera/yogesh/lib> java -classpath hbase-tools.jar:hbase.jar:slf4j-log4j12-1.6.1.jar:slf4j-api-1.6.1.jar:protobuf-java-2.4.0a.jar:hadoop-common-2.0.0-cdh4.0.0.jar:hadoop-hdfs-2.0.0-cdh4.0.0.jar:hadoop-mapreduce-client-common-2.0.0-cdh4.0.0.jar:hadoop-mapreduce-client-core-2.0.0-cdh4.0.0.jar:log4j-1.2.16.jar:commons-logging-1.0.4.jar:commons-lang-2.5.jar:commons-lang3-3.1.jar:commons-cli-1.2.jar:commons-configuration-1.6.jar:guava-11.0.2.jar:google-collect-1.0-rc2.jar:google-collect-1.0-rc1.jar:hadoop-auth-2.0.0-cdh4.0.0.jar:hadoop-auth.jar:jackson.jar:avro-1.5.4.jar:hadoop-yarn-common-2.0.0-cdh4.0.0.jar:hadoop-yarn-api-2.0.0-cdh4.0.0.jar:hadoop-yarn-server-common-2.0.0-cdh4.0.0.jar:commons-httpclient-3.0.1.jar:commons-io-1.4.jar:zookeeper-3.3.2.jar:jdom.jar:joda-time-1.5.2.jar com.hbase.xyz.MyClassName
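For what it's worth, the stack trace shows DistributedFileSystem.getFileStatus being asked for /home/cloudera/yogesh/lib/hbase.jar, i.e. a local jar path handed to the job submitter is being resolved against HDFS. A minimal driver sketch along those lines (the class name, job name, and the jar under /tools-lib are only illustrative, not taken from the post) would point the client at the cluster explicitly and reference dependency jars by their HDFS location:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;

public class JobDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Point the client explicitly at the cluster instead of the local defaults.
        conf.set("fs.defaultFS", "hdfs://Cloudera:8020");
        conf.set("mapreduce.framework.name", "yarn");

        Job job = Job.getInstance(conf, "hbase-export");
        job.setJarByClass(JobDriver.class);
        // Reference dependency jars by their HDFS location so the JobSubmitter
        // can find them when it builds the distributed cache.
        job.addFileToClassPath(new Path("/tools-lib/hbase.jar"));
        // Mapper/reducer, input and output formats omitted in this sketch.

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}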


2 Answers


I faced the same issue on 2.0.0-cdh4.1.3 while running an MR job. It was resolved after adding the following property to mapred-site.xml:

<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>

For running Hive jobs:

export HIVE_USER=yarn
Answered 2013-02-15T07:29:37.353

Debugging procedure: try running a simple Hadoop shell command.

hadoop fs -ls /

If this shows HDFS files, your configuration is correct. If not, some configuration is missing, and Hadoop shell commands like -ls will refer to the local file system instead of the Hadoop file system. This can happen if Hadoop was started with Cloudera Manager (CM), which does not store the configuration explicitly in the conf directory.

Check whether the Hadoop file system is shown with the following command (change the port as needed):

hadoop fs -ls hdfs://host:8020/

If submitting the path / shows the local file system, then you should put the configuration files hdfs-site.xml and mapred-site.xml into the configuration directory. hdfs-site.xml should also have an fs.default.name entry pointing to hdfs://host:port/. In my case that directory is /etc/hadoop/conf.
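The same check can be done from Java (a minimal sketch; the class name is made up for illustration): the URI reported by the default FileSystem tells you whether the client is falling back to the local file system or picking up the cluster configuration from the classpath.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class FsCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        // Prints file:/// when only the built-in local defaults are found,
        // hdfs://<host>:8020 when the cluster configuration is on the classpath.
        System.out.println(fs.getUri());
    }
}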

See: http://hadoop.apache.org/common/docs/r0.20.2/core-default.html

See if this solves your problem.

Answered 2012-07-06T16:36:27.140