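I'm new to Cloud Data Fusion and am trying to map a table from a SQL Server database to a MySQL database. I have already run into several problems that I managed to solve, namely:

  • Fixed the service account's permissions so that it can access all the resources it needs;
  • Added the IP to the connections allowed by my SQL Server;
  • Set system.profile.properties.dataproc:dataproc.conscrypt.provider.enable = false to prevent the SSL error reported in another question.

After that last fix, I am now dealing with a NullPointerException in the MapReduce job at io.cdap.cdap.internal.app.runtime.ProgramControllerServiceAdapter#97-MapReduceRunner-phase-1.

The stack trace provided by Data Fusion is as follows: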

java.lang.NullPointerException: null
at io.cdap.plugin.db.batch.source.AbstractDBSource.loadSchemaFromDB(AbstractDBSource.java:138) ~[na:na]
at io.cdap.plugin.db.batch.source.AbstractDBSource.loadSchemaFromDB(AbstractDBSource.java:155) ~[na:na]
at io.cdap.plugin.db.batch.source.AbstractDBSource.prepareRun(AbstractDBSource.java:241) ~[na:na]
at io.cdap.plugin.db.batch.source.AbstractDBSource.prepareRun(AbstractDBSource.java:68) ~[na:na]
at io.cdap.cdap.etl.common.plugin.WrappedBatchSource.lambda$prepareRun$0(WrappedBatchSource.java:51) ~[na:na]
at io.cdap.cdap.etl.common.plugin.Caller$1.call(Caller.java:30) ~[na:na]
at io.cdap.cdap.etl.common.plugin.StageLoggingCaller.call(StageLoggingCaller.java:40) ~[na:na]
at io.cdap.cdap.etl.common.plugin.WrappedBatchSource.prepareRun(WrappedBatchSource.java:50) ~[na:na]
at io.cdap.cdap.etl.common.plugin.WrappedBatchSource.prepareRun(WrappedBatchSource.java:36) ~[na:na]
at io.cdap.cdap.etl.common.submit.SubmitterPlugin.lambda$prepareRun$2(SubmitterPlugin.java:71) ~[na:na]
at io.cdap.cdap.internal.app.runtime.AbstractContext$2.run(AbstractContext.java:551) ~[na:na]
at io.cdap.cdap.data2.transaction.Transactions$CacheBasedTransactional.finishExecute(Transactions.java:224) ~[na:na]
at io.cdap.cdap.data2.transaction.Transactions$CacheBasedTransactional.execute(Transactions.java:211) ~[na:na]
at io.cdap.cdap.internal.app.runtime.AbstractContext.execute(AbstractContext.java:546) ~[na:na]
at io.cdap.cdap.internal.app.runtime.AbstractContext.execute(AbstractContext.java:534) ~[na:na]
at io.cdap.cdap.etl.common.submit.SubmitterPlugin.prepareRun(SubmitterPlugin.java:69) ~[na:na]
at io.cdap.cdap.etl.batch.PipelinePhasePreparer.prepare(PipelinePhasePreparer.java:111) ~[na:na]
at io.cdap.cdap.etl.batch.mapreduce.MapReducePreparer.prepare(MapReducePreparer.java:97) ~[na:na]
at io.cdap.cdap.etl.batch.mapreduce.ETLMapReduce.initialize(ETLMapReduce.java:192) ~[na:na]
at io.cdap.cdap.api.mapreduce.AbstractMapReduce.initialize(AbstractMapReduce.java:109) ~[na:na]
at io.cdap.cdap.api.mapreduce.AbstractMapReduce.initialize(AbstractMapReduce.java:32) ~[na:na]
at io.cdap.cdap.internal.app.runtime.batch.MapReduceRuntimeService$1.initialize(MapReduceRuntimeService.java:182) ~[na:na]
at io.cdap.cdap.internal.app.runtime.batch.MapReduceRuntimeService$1.initialize(MapReduceRuntimeService.java:177) ~[na:na]
at io.cdap.cdap.internal.app.runtime.AbstractContext.lambda$initializeProgram$1(AbstractContext.java:640) ~[na:na]
at io.cdap.cdap.internal.app.runtime.AbstractContext.execute(AbstractContext.java:600) ~[na:na]
at io.cdap.cdap.internal.app.runtime.AbstractContext.initializeProgram(AbstractContext.java:637) ~[na:na]
at io.cdap.cdap.internal.app.runtime.batch.MapReduceRuntimeService.beforeSubmit(MapReduceRuntimeService.java:547) ~[na:na]
at io.cdap.cdap.internal.app.runtime.batch.MapReduceRuntimeService.startUp(MapReduceRuntimeService.java:226) ~[na:na]
at com.google.common.util.concurrent.AbstractExecutionThreadService$1$1.run(AbstractExecutionThreadService.java:47) ~[com.google.guava.guava-13.0.1.jar:na]
at io.cdap.cdap.internal.app.runtime.batch.MapReduceRuntimeService$2$1.run(MapReduceRuntimeService.java:450) [na:na]
at java.lang.Thread.run(Thread.java:748) [na:1.8.0_212]

Any help would be greatly appreciated.

Thanks.

PS: After solving that problem, I am now running into this one:

java.lang.ClassCastException: io.cdap.plugin.db.DBRecord cannot be cast to io.cdap.plugin.db.DBRecord
at io.cdap.plugin.db.batch.source.AbstractDBSource.transform(AbstractDBSource.java:267) ~[database-commons-1.2.0.jar:na]
at io.cdap.cdap.etl.common.plugin.WrappedBatchSource.lambda$transform$2(WrappedBatchSource.java:69) ~[cdap-etl-core-6.0.1.jar:na]
at io.cdap.cdap.etl.common.plugin.Caller$1.call(Caller.java:30) ~[cdap-etl-core-6.0.1.jar:na]
at io.cdap.cdap.etl.common.plugin.StageLoggingCaller.call(StageLoggingCaller.java:40) ~[cdap-etl-core-6.0.1.jar:na]
at io.cdap.cdap.etl.common.plugin.WrappedBatchSource.transform(WrappedBatchSource.java:68) ~[cdap-etl-core-6.0.1.jar:na]
at io.cdap.cdap.etl.common.plugin.WrappedBatchSource.transform(WrappedBatchSource.java:36) ~[cdap-etl-core-6.0.1.jar:na]
at io.cdap.cdap.etl.common.TrackedTransform.transform(TrackedTransform.java:74) ~[cdap-etl-core-6.0.1.jar:na]
at io.cdap.cdap.etl.batch.UnwrapPipeStage.consumeInput(UnwrapPipeStage.java:44) ~[cdap-etl-batch-6.0.1.jar:na]
at io.cdap.cdap.etl.batch.UnwrapPipeStage.consumeInput(UnwrapPipeStage.java:32) ~[cdap-etl-batch-6.0.1.jar:na]
at io.cdap.cdap.etl.batch.PipeStage.consume(PipeStage.java:44) ~[cdap-etl-batch-6.0.1.jar:na]
at io.cdap.cdap.etl.batch.PipeTransformExecutor.runOneIteration(PipeTransformExecutor.java:43) ~[cdap-etl-batch-6.0.1.jar:na]
at io.cdap.cdap.etl.batch.mapreduce.TransformRunner.transform(TransformRunner.java:142) ~[cdap-etl-batch-6.0.1.jar:na]
at io.cdap.cdap.etl.batch.mapreduce.ETLMapReduce$ETLMapper.map(ETLMapReduce.java:230) ~[cdap-etl-batch-6.0.1.jar:na]
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146) [hadoop-mapreduce-client-core-2.8.5.jar:na]
at io.cdap.cdap.internal.app.runtime.batch.MapperWrapper.run(MapperWrapper.java:135) [na:na]
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787) [hadoop-mapreduce-client-core-2.8.5.jar:na]
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) [hadoop-mapreduce-client-core-2.8.5.jar:na]
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175) [hadoop-mapreduce-client-app-2.8.5.jar:na]
at java.security.AccessController.doPrivileged(Native Method) [na:1.8.0_212]
at javax.security.auth.Subject.doAs(Subject.java:422) [na:1.8.0_212]
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844) [hadoop-common-2.8.5.jar:na]
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:169) [hadoop-mapreduce-client-app-2.8.5.jar:na]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_212]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_212]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_212]
at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_212]
at io.cdap.cdap.internal.app.runtime.batch.distributed.MapReduceContainerLauncher.launch(MapReduceContainerLauncher.java:114) [io.cdap.cdap.cdap-app-fabric-6.0.1.jar:na]
at org.apache.hadoop.mapred.YarnChild.main(Unknown Source) [hadoop-mapreduce-client-app-2.8.5.jar:na]

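PS 2: After solving the problem above, I can now migrate tables. However, I sometimes get the following stack trace as a warning, which then forces the job to end. Before actually failing, the job repeats itself (I don't know whether that is the default behavior). It also seems that either it cannot write that many rows to the target database or the connection is lost. This prevents me from migrating certain tables. Any idea why?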

com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: No operations allowed after connection closed.
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:404)
at com.mysql.jdbc.Util.getInstance(Util.java:387)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:917)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:896)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:885)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:860)
at com.mysql.jdbc.ConnectionImpl.throwConnectionClosedException(ConnectionImpl.java:1246)
at com.mysql.jdbc.ConnectionImpl.checkClosed(ConnectionImpl.java:1241)
at com.mysql.jdbc.ConnectionImpl.rollback(ConnectionImpl.java:4564)
at io.cdap.plugin.db.batch.sink.ETLDBOutputFormat$1.close(ETLDBOutputFormat.java:90)
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:670)
at org.apache.hadoop.mapred.MapTask.closeQuietly(MapTask.java:2021)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:797)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:169)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at io.cdap.cdap.internal.app.runtime.batch.distributed.MapReduceContainerLauncher.launch(MapReduceContainerLauncher.java:114)
at org.apache.hadoop.mapred.YarnChild.main(Unknown Source)

Thanks!

2 Answers

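Are you using version 1.2.0 of the SQL Server database plugin from the Hub?

Did you specify an import query in the SQL Server properties? If not, try specifying one:

SELECT * FROM <table name> WHERE $CONDITIONS

Note: include WHERE $CONDITIONS only if the number of splits to generate is greater than 1.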
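
For context, here is a rough Java sketch of what the placeholder is for: a split-aware source replaces it with a different range predicate for each split, so the splits read disjoint slices of the table. This is not the plugin's code. The class name, the `orders` table, the `id` split column, and the bounds are all made up for illustration, and the placeholder is assumed to be spelled `$CONDITIONS` as in the CDAP database plugin documentation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Illustration only: expands a placeholder into one range predicate per split. */
final class SplitQueryDemo {

    // Assumed placeholder spelling; not taken from the plugin source.
    private static final String PLACEHOLDER = "$CONDITIONS";

    /** Builds one query per split, covering the inclusive id range [min, max]. */
    static List<String> queriesForSplits(String importQuery, String splitColumn,
                                         long min, long max, int numSplits) {
        if (numSplits < 1 || min > max) {
            throw new IllegalArgumentException("need numSplits >= 1 and min <= max");
        }
        if (!importQuery.contains(PLACEHOLDER)) {
            // Without the placeholder there is nowhere to put a range predicate,
            // so only a single split can be served.
            if (numSplits > 1) {
                throw new IllegalArgumentException("query must contain " + PLACEHOLDER);
            }
            return Collections.singletonList(importQuery);
        }
        List<String> queries = new ArrayList<>();
        long total = max - min + 1;
        long start = min;
        for (int i = 0; i < numSplits; i++) {
            // Spread the remainder over the first splits so sizes differ by at most one.
            long size = total / numSplits + (i < total % numSplits ? 1 : 0);
            long end = start + size; // exclusive upper bound
            String predicate = "(" + splitColumn + " >= " + start
                    + " AND " + splitColumn + " < " + end + ")";
            // String.replace is a literal replacement, so '$' needs no escaping.
            queries.add(importQuery.replace(PLACEHOLDER, predicate));
            start = end;
        }
        return queries;
    }

    public static void main(String[] args) {
        for (String q : queriesForSplits(
                "SELECT * FROM orders WHERE $CONDITIONS", "id", 1, 100, 2)) {
            System.out.println(q);
        }
    }
}
```

With a single split there is nothing to partition, which is why the WHERE clause can be left out in that case.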

answered 2019-07-11T22:22:37.893

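For the second error, you are hitting a class loading bug: https://issues.cask.co/browse/CDAP-15636

To work around it, try using the generic Database source and sink instead of the product-specific ones. The configuration should be largely the same.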
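
The odd-looking message (a class that cannot be cast to itself) is typical of the same class being defined by two different class loaders: the JVM then treats the two copies as unrelated types. A minimal Java sketch that reproduces the effect outside CDAP follows. `DuplicateClassDemo`, `Record`, and `IsolatingLoader` are hypothetical names, and the sketch assumes the compiled `.class` file can be read as a resource from the class path.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class DuplicateClassDemo {

    /** Hypothetical stand-in for a plugin class such as DBRecord. */
    public static class Record { }

    /** Defines one named class itself instead of delegating to its parent. */
    static final class IsolatingLoader extends ClassLoader {
        private final String isolatedName;

        IsolatingLoader(ClassLoader parent, String isolatedName) {
            super(parent);
            this.isolatedName = isolatedName;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve)
                throws ClassNotFoundException {
            if (!name.equals(isolatedName)) {
                return super.loadClass(name, resolve); // normal parent-first delegation
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> loaded = findLoadedClass(name);
                if (loaded == null) {
                    byte[] bytes = readClassBytes(name);
                    loaded = defineClass(name, bytes, 0, bytes.length);
                }
                if (resolve) {
                    resolveClass(loaded);
                }
                return loaded;
            }
        }

        private byte[] readClassBytes(String name) throws ClassNotFoundException {
            String resource = name.replace('.', '/') + ".class";
            try (InputStream in = getParent().getResourceAsStream(resource)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[4096];
                int n;
                while ((n = in.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
                return out.toByteArray();
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    /** Returns true if an instance from the second loader fails the cast to Record. */
    static boolean castFailsAcrossLoaders() {
        String name = Record.class.getName();
        ClassLoader parent = DuplicateClassDemo.class.getClassLoader();
        Object foreign;
        try {
            Class<?> other = new IsolatingLoader(parent, name).loadClass(name);
            foreign = other.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not load second copy of " + name, e);
        }
        try {
            Record r = (Record) foreign; // same class name, different defining loader
            return false;
        } catch (ClassCastException e) {
            System.out.println(e.getMessage());
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("cast failed: " + castFailsAcrossLoaders());
    }
}
```

This would be consistent with the workaround above, if the product-specific plugin and the shared database classes end up in separate class loaders, but that reading is an inference from the error message rather than something confirmed by the linked issue.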

answered 2019-07-12T21:29:04.270