
I submitted a Sqoop job through a GCP Dataproc cluster with the `--as-avrodatafile` configuration parameter set, but it fails with the following error:

19/08/12 22:34:34 INFO impl.YarnClientImpl: Submitted application application_1565634426340_0021
19/08/12 22:34:34 INFO mapreduce.Job: The url to track the job: http://sqoop-gcp-ingest-mzp-m:8088/proxy/application_1565634426340_0021/
19/08/12 22:34:34 INFO mapreduce.Job: Running job: job_1565634426340_0021
19/08/12 22:34:40 INFO mapreduce.Job: Job job_1565634426340_0021 running in uber mode : false
19/08/12 22:34:40 INFO mapreduce.Job:  map 0% reduce 0%
19/08/12 22:34:45 INFO mapreduce.Job: Task Id : attempt_1565634426340_0021_m_000000_0, Status : FAILED
Error: org.apache.avro.reflect.ReflectData.addLogicalTypeConversion(Lorg/apache/avro/Conversion;)V
19/08/12 22:34:50 INFO mapreduce.Job: Task Id : attempt_1565634426340_0021_m_000000_1, Status : FAILED
Error: org.apache.avro.reflect.ReflectData.addLogicalTypeConversion(Lorg/apache/avro/Conversion;)V
19/08/12 22:34:55 INFO mapreduce.Job: Task Id : attempt_1565634426340_0021_m_000000_2, Status : FAILED
Error: org.apache.avro.reflect.ReflectData.addLogicalTypeConversion(Lorg/apache/avro/Conversion;)V
19/08/12 22:35:00 INFO mapreduce.Job:  map 100% reduce 0%
19/08/12 22:35:01 INFO mapreduce.Job: Job job_1565634426340_0021 failed with state FAILED due to: Task failed task_1565634426340_0021_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0

19/08/12 22:35:01 INFO mapreduce.Job: Counters: 11
    Job Counters 
        Failed map tasks=4
        Launched map tasks=4
        Other local map tasks=4
        Total time spent by all maps in occupied slots (ms)=41976
        Total time spent by all reduces in occupied slots (ms)=0
        Total time spent by all map tasks (ms)=13992
        Total vcore-milliseconds taken by all map tasks=13992
        Total megabyte-milliseconds taken by all map tasks=42983424
    Map-Reduce Framework
        CPU time spent (ms)=0
        Physical memory (bytes) snapshot=0
        Virtual memory (bytes) snapshot=0
19/08/12 22:35:01 WARN mapreduce.Counters: Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
19/08/12 22:35:01 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 30.5317 seconds (0 bytes/sec)
19/08/12 22:35:01 INFO mapreduce.ImportJobBase: Retrieved 0 records.
19/08/12 22:35:01 DEBUG util.ClassLoaderStack: Restoring classloader: sun.misc.Launcher$AppClassLoader@61baa894
19/08/12 22:35:01 ERROR tool.ImportTool: Import failed: Import job failed!
19/08/12 22:35:01 DEBUG manager.OracleManager$ConnCache: Caching released connection for jdbc:oracle:thin:@10.25.42.52:1521/uataca.aaamidatlantic.com/GCPREADER
Job output is complete

Without the `--as-avrodatafile` parameter it works fine.


1 Answer


To fix this issue, set the `mapreduce.job.classloader` property to `true` when submitting the job:

gcloud dataproc jobs submit hadoop --cluster="${CLUSTER_NAME}" \
    --class="org.apache.sqoop.Sqoop" \
    --properties="mapreduce.job.classloader=true" \
    . . .
    -- \
    --as-avrodatafile \
    . . .
answered 2019-08-13T15:59:31.257
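The error in the question is effectively a `NoSuchMethodError` on Avro's `ReflectData.addLogicalTypeConversion`, which typically means an older Avro jar on the cluster classpath is shadowing the newer Avro that Sqoop was built against; `mapreduce.job.classloader=true` gives each task an isolated classloader so the jars submitted with the job take precedence. For illustration, a fuller invocation might look like the sketch below. The cluster name, bucket paths, jar versions, and Oracle connection values are all hypothetical placeholders, not values from the question:

```shell
# Hedged sketch: submit Sqoop as a Hadoop job on Dataproc with an
# isolated job classloader, so the job's Avro jar wins over the
# cluster's older Avro. All paths and credentials are placeholders.
gcloud dataproc jobs submit hadoop \
    --cluster="${CLUSTER_NAME}" \
    --class="org.apache.sqoop.Sqoop" \
    --jars="gs://my-bucket/jars/sqoop-1.4.7.jar,gs://my-bucket/jars/avro-1.8.2.jar,gs://my-bucket/jars/ojdbc8.jar" \
    --properties="mapreduce.job.classloader=true" \
    -- \
    import \
    --connect="jdbc:oracle:thin:@//db-host:1521/SERVICE_NAME" \
    --username="GCPREADER" \
    --password-file="gs://my-bucket/secrets/sqoop.password" \
    --table="MY_SCHEMA.MY_TABLE" \
    --target-dir="gs://my-bucket/ingest/my_table" \
    --as-avrodatafile
```

Note that everything after the bare `--` separator is passed to Sqoop's `main`, so the tool name (`import`) and Sqoop's own arguments go there, while Dataproc/Hadoop options such as `--properties` go before it.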