0

TIDB通过Spark使用连接到时出现以下错误mysql-connector-java 5.1.6 connector

请注意,我使用并行连接选项创建了 jdbc 连接,我们在其中指定列名、下限、上限和分区数。

然后 Spark 通过将列名的下限和上限划分为相等的大小,将其分解为(分区数)查询。

java.sql.SQLException: other error: request outdated.
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1055)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:956)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3536)
at com.mysql.jdbc.MysqlIO.nextRowFast(MysqlIO.java:1551)
at com.mysql.jdbc.MysqlIO.nextRow(MysqlIO.java:1407)
at com.mysql.jdbc.MysqlIO.readSingleRowSet(MysqlIO.java:2861)
at com.mysql.jdbc.MysqlIO.getResultSet(MysqlIO.java:474)
at com.mysql.jdbc.MysqlIO.readResultsForQueryOrUpdate(MysqlIO.java:2554)
at com.mysql.jdbc.MysqlIO.readAllResults(MysqlIO.java:1755)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2165)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2648)
at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2086)
at com.mysql.jdbc.PreparedStatement.executeQuery(PreparedStatement.java:2237)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD.compute(JDBCRDD.scala:301)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD$$anonfun$7.apply(RDD.scala:337)
at org.apache.spark.rdd.RDD$$anonfun$7.apply(RDD.scala:335)
at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1092)
at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1083)
at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:1018)
at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1083)
at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:809)
at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:335)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:286)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
at org.apache.spark.scheduler.Task.run(Task.scala:109)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
4

1 回答 1

1

other error: request outdated.是由 抛出的错误TiKV,表示查询end-point-request-max-handle-duration在执行前超过了超时限制,并被协处理器取消以避免过时的查询。您可以在 TiKV 的 config 中进行配置,默认值为 60 秒。

由于 Spark 从 JDBC 中检索到此错误,这意味着请求太重而TiDB无法处理,以至于某些请求等待的时间过长。这主要是因为 Spark 对每个分区的请求进行拆分,导致 TiDB 的工作量很大。当您使用并行连接时,情况会变得更糟。

事实上,TiSpark被开发为使用 Spark 和 TiDB 进行查询的解决方案。它现在支持 Spark 2.1,几天后将支持 Spark 2.3。尝试一下!

于 2018-10-08T09:16:23.653 回答