I have a Hive table tableA with the following schema:
> desc tableA;
+--------------------------+-----------------------+-----------------------+--+
|         col_name         |       data_type       |        comment        |
+--------------------------+-----------------------+-----------------------+--+
| statementid              | string                |                       |
| batchid                  | string                |                       |
| requestparam             | map<string,string>    |                       |
+--------------------------+-----------------------+-----------------------+--+
I tried to load the table with the following code:
val tempdf= spark.read.format("jdbc")
.option("driver", "org.apache.hive.jdbc.HiveDriver")
.option("url", "jdbc:hive2://localhost:10000/tempdb")
.option("user","user1")
.option("password","password1")
.option("query","select statementid, batchid, requestparam from tempdb.tableA")
.load()
My second attempt:
val tempdf = spark.read.format("jdbc")
.option("driver", "org.apache.hive.jdbc.HiveDriver")
.option("url", "jdbc:hive2://localhost:10000/tempdb")
.option("user","user1")
.option("password","password1")
.option("dbtable","tempdb.tableA")
.load()
But the map<string,string> column causes a problem when loading the source Hive table into a Spark Dataset:
Exception in thread "main" java.sql.SQLException: Unsupported type JAVA_OBJECT
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.getCatalystType(JdbcUtils.scala:247)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.$anonfun$getSchema$1(JdbcUtils.scala:312)
    at scala.Option.getOrElse(Option.scala:189)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.getSchema(JdbcUtils.scala:312)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:63)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation$.getSchema(JDBCRelation.scala:226)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:35)
    at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:354)
    at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:326)
    at org.apache.spark.sql.DataFrameReader.$anonfun$load$3(DataFrameReader.scala:308)
    at scala.Option.getOrElse(Option.scala:189)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:308)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:226)
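One workaround I am experimenting with (a minimal sketch, assuming Spark 2.4+ for the query option and map_from_entries, and the same connection settings as above): flatten the map on the Hive side with LATERAL VIEW explode so that every column the JDBC driver reports is a plain string, then rebuild the map in Spark:

import org.apache.spark.sql.functions.{col, collect_list, map_from_entries, struct}

// Explode the map into (key, value) string rows on the Hive side,
// so the JDBC driver never has to describe a map column.
val flatQuery =
  """select statementid, batchid, kv.k, kv.v
    |from tempdb.tableA
    |lateral view explode(requestparam) kv as k, v""".stripMargin

val flatDf = spark.read.format("jdbc")
  .option("driver", "org.apache.hive.jdbc.HiveDriver")
  .option("url", "jdbc:hive2://localhost:10000/tempdb")
  .option("user", "user1")
  .option("password", "password1")
  .option("query", flatQuery)
  .load()

// Regroup the exploded key/value rows back into a map<string,string> column.
val tempdf = flatDf
  .groupBy("statementid", "batchid")
  .agg(map_from_entries(collect_list(struct(col("k"), col("v")))).as("requestparam"))

Note that a plain lateral view drops rows whose requestparam is empty or null; lateral view outer keeps them. Alternatively, if the job can reach the Hive metastore directly, building the session with enableHiveSupport() and reading spark.table("tempdb.tableA") avoids JDBC altogether and handles map<string,string> natively. Is there a way to make the JDBC path itself handle the map column?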