
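I am trying to run s3distcp from my local Hadoop pseudo-cluster. Executing s3distcp.jar produces the stack trace below. The reducer task appears to have failed, but I cannot determine what caused the failure: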

18/02/21 12:14:01 WARN mapred.LocalJobRunner: job_local639263089_0001
java.lang.Exception: java.lang.RuntimeException: Reducer task failed to copy 1 files: file:/home/chirag/workspaces/lzo/data-1518765365022.lzo etc
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:489)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:556)
Caused by: java.lang.RuntimeException: Reducer task failed to copy 1 files: file:/home/chirag/workspaces/lzo/data-1518765365022.lzo etc
    at com.amazon.external.elasticmapreduce.s3distcp.CopyFilesReducer.close(CopyFilesReducer.java:70)
    at org.apache.hadoop.io.IOUtils.cleanup(IOUtils.java:250)
    at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:459)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:346)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
18/02/21 12:14:02 INFO mapreduce.Job: Job job_local639263089_0001 running in uber mode : false
18/02/21 12:14:02 INFO mapreduce.Job:  map 100% reduce 0%
18/02/21 12:14:02 INFO mapreduce.Job: Job job_local639263089_0001 failed with state FAILED due to: NA
18/02/21 12:14:02 INFO mapreduce.Job: Counters: 35

1 Answer


I ran into the same error. In my case, I found the logs for the MR job launched by s3-dist-cp in HDFS under /var/log/hadoop-yarn/apps/hadoop/logs.

hadoop fs -ls /var/log/hadoop-yarn/apps/hadoop/logs

I copied them to the local filesystem:

hadoop fs -get /var/log/hadoop-yarn/apps/hadoop/logs/application_nnnnnnnnnnnnn_nnnn/ip-nnn-nn-nn-nnn.ec2.internal_nnnn

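I then inspected them in a text editor to find more diagnostic detail about what happened in the reducer phase. In my case, the underlying error was a message returned by the S3 service. You may find a different error.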
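
If the logs are large, searching them can be quicker than reading them in an editor. The following is only a rough sketch: the function name find_log_errors and the search patterns are my own assumptions for illustration, not part of Hadoop or s3-dist-cp. It uses grep -a because aggregated log files may contain non-text bytes.

```shell
# Hypothetical helper: print lines in downloaded logs that look like errors.
# The patterns below are an assumption; adjust them to what your logs contain.
find_log_errors() {
    log_dir="$1"
    # -r: recurse into the directory
    # -n: prefix each match with its line number
    # -a: treat files as text even if they contain binary bytes
    # -E: use extended regular expressions
    grep -rnaE 'ERROR|Exception|Caused by' "$log_dir"
}
```

Call it with the local path that hadoop fs -get wrote to, for example find_log_errors ./ip-nnn-nn-nn-nnn.ec2.internal_nnnn, then read the lines around each match for the full stack trace.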

answered 2018-08-06T01:49:15.667