0

我正在尝试使用 distcp 命令将数据从一个 cdh(CDH4.7.1) 集群移动到另一个 cdh(cdh5.4.1) 集群,如下所示:

 hadoop  distcp -D mapred.task.timeout=60000000  -update     hdfs://namenodeIp of source(CDH4):8020/user/admin/distcptest1 webhdfs://namenodeIp of target(CDH5):50070/user/admin/testdir

使用此命令,目录和子目录从源集群 cdh4 复制到目标集群 cdh5,但源集群中的文件没有被复制到目标集群,失败并出现以下错误:

无法将 tmp 文件 (=webhdfs://10.10.200.221:50070/user/admin/testdir/_distcp_tmp_g79i9w/distcptest1/account.xlsx) 重命名为目标文件 (=webhdfs://10.10.200.221:50070/user/admin/ testdir/distcptest1/account.xlsx)

在该作业的日志中找到的堆栈跟踪如下:

2016-02-19 03:16:57,006 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
2016-02-19 03:16:58,686 WARN org.apache.hadoop.conf.Configuration: session.id is deprecated. Instead, use dfs.metrics.session-id
2016-02-19 03:16:58,693 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
2016-02-19 03:16:59,736 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
2016-02-19 03:16:59,752 INFO org.apache.hadoop.mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@715f1f9c
2016-02-19 03:17:00,248 INFO org.apache.hadoop.mapred.MapTask: Processing split: hdfs://n1.quadratics.com:8020/user/admin/.stagingdistcp_g79i9w/_distcp_src_files:0+2443
2016-02-19 03:17:00,345 WARN mapreduce.Counters: Counter name MAP_INPUT_BYTES is deprecated. Use FileInputFormatCounters as group name and  BYTES_READ as counter name instead
2016-02-19 03:17:00,353 INFO org.apache.hadoop.mapred.MapTask: numReduceTasks: 0
2016-02-19 03:17:01,098 INFO org.apache.hadoop.tools.DistCp: FAIL distcptest1/account.xlsx : java.io.IOException: Fail to rename tmp file (=webhdfs://10.10.200.221:50070/user/admin/testdir/_distcp_tmp_g79i9w/distcptest1/account.xlsx) to destination file (=webhdfs://10.10.200.221:50070/user/admin/testdir/distcptest1/account.xlsx)
at org.apache.hadoop.tools.DistCp$CopyFilesMapper.rename(DistCp.java:494)
at org.apache.hadoop.tools.DistCp$CopyFilesMapper.copy(DistCp.java:463)
at org.apache.hadoop.tools.DistCp$CopyFilesMapper.map(DistCp.java:549)
at org.apache.hadoop.tools.DistCp$CopyFilesMapper.map(DistCp.java:316)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: java.io.IOException
at org.apache.hadoop.tools.DistCp$CopyFilesMapper.rename(DistCp.java:490)
... 11 more

2016-02-19 03:17:10,457 INFO org.apache.hadoop.tools.DistCp: FAIL distcptest1/_distcp_logs_ww86cq/_logs/history/job_201602160057_0105_1455872921915_hdfs_distcp : java.io.IOException: Fail to rename tmp file (=webhdfs://10.10.200.221:50070/user/admin/testdir/_distcp_tmp_g79i9w/distcptest1/_distcp_logs_ww86cq/_logs/history/job_201602160057_0105_1455872921915_hdfs_distcp) to destination file (=webhdfs://10.10.200.221:50070/user/admin/testdir/distcptest1/_distcp_logs_ww86cq/_logs/history/job_201602160057_0105_1455872921915_hdfs_distcp)
at org.apache.hadoop.tools.DistCp$CopyFilesMapper.rename(DistCp.java:494)
at org.apache.hadoop.tools.DistCp$CopyFilesMapper.copy(DistCp.java:463)
at org.apache.hadoop.tools.DistCp$CopyFilesMapper.map(DistCp.java:549)
at org.apache.hadoop.tools.DistCp$CopyFilesMapper.map(DistCp.java:316)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: java.io.IOException
at org.apache.hadoop.tools.DistCp$CopyFilesMapper.rename(DistCp.java:490)
... 11 more

即使在使用此命令后也出现上述错误:

 hadoop  distcp -D mapred.task.timeout=60000000  -update     webhdfs://namenodeIp of source(CDH4):50070/user/admin/distcptest1 webhdfs://namenodeIp of target(CDH5):50070/user/admin/testdir

两个集群中都启用了 WebHDFS。

关于 distcp 命令的执行,我是从我的源集群 cdh4 执行的,用户为“admin”,并且可以根据下面给出的 cloudera 链接执行此操作:

http://www.cloudera.com/documentation/enterprise/5-4-x/topics/cdh_admin_distcp_data_cluster_migrate.html

当我从源集群监视目标集群文件时,没有将目标集群文件写入目标集群中由 distcp 创建的临时文件夹。这就是目标集群中重命名失败的原因,因为目标路径不包含该文件。有人可以告诉为什么文件写作失败?

我在stackoverflow上搜索了相关帖子并尝试了这些解决方案,但没有一个解决不了这个问题。任何解决这个问题的想法都会有很大帮助。

4

1 回答 1

1

HDFS 是无法运行纱线作业的用户,它很可能是您的 YARN 配置中的被禁止用户。

如果这是一个安全集群,您还需要两个 kerberos 域之间的信任。

于 2016-02-19T12:31:21.297 回答