
I wrote a sample Java Spark SQL program locally in Eclipse to read data from a remote Databricks database table, as shown below. I have set HADOOP_HOME and included the Spark JDBC driver, but I still get the error below every time I run it.

static final String DB_URL = "jdbc:spark://<databricks-url>:443/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/0/0289-192234-bars27;AuthMech=3;";
static final String USER = "token";
static final String PASS = "<personal access token>";
static final String QUERY = "select * from concept";

SparkSession spark = SparkSession.builder()
                                 .master("local")
                                 .config("spark.driver.host", "localhost")
                                 .appName("Java Spark SQL basic example")
                                 .getOrCreate();
Dataset<Row> jdbcDF = spark.read()
                           .format("jdbc")
                           .option("url", DB_URL)
                           .option("query", QUERY)
                           .option("user", USER)
                           .option("password", PASS)
                           .load();
System.out.println("Total row count: " + jdbcDF.count());

When I run the code above, I get the following error:

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
21/09/09 17:40:30 INFO SparkContext: Running Spark version 3.1.2
21/09/09 17:40:31 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
21/09/09 17:40:31 INFO ResourceUtils: ==============================================================
21/09/09 17:40:31 INFO ResourceUtils: No custom resources configured for spark.driver.
21/09/09 17:40:31 INFO ResourceUtils: ==============================================================
21/09/09 17:40:31 INFO SparkContext: Submitted application: Java Spark SQL basic example
21/09/09 17:40:31 INFO ResourceProfile: Default ResourceProfile created, executor resources: Map(cores -> name: cores, amount: 1, script: , vendor: , memory -> name: memory, amount: 1024, script: , vendor: , offHeap -> name: offHeap, amount: 0, script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0)
21/09/09 17:40:31 INFO ResourceProfile: Limiting resource is cpu
21/09/09 17:40:31 INFO ResourceProfileManager: Added ResourceProfile id: 0
21/09/09 17:40:31 INFO SecurityManager: Changing view acls to: xyz
21/09/09 17:40:31 INFO SecurityManager: Changing modify acls to: xyz
21/09/09 17:40:31 INFO SecurityManager: Changing view acls groups to: 
21/09/09 17:40:31 INFO SecurityManager: Changing modify acls groups to: 
21/09/09 17:40:31 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(xyz); groups with view permissions: Set(); users  with modify permissions: Set(xyz); groups with modify permissions: Set()
21/09/09 17:40:35 INFO Utils: Successfully started service 'sparkDriver' on port 57079.
21/09/09 17:40:35 INFO SparkEnv: Registering MapOutputTracker
21/09/09 17:40:35 INFO SparkEnv: Registering BlockManagerMaster
21/09/09 17:40:35 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
21/09/09 17:40:35 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
21/09/09 17:40:35 INFO SparkEnv: Registering BlockManagerMasterHeartbeat
21/09/09 17:40:35 INFO DiskBlockManager: Created local directory at D:\Users\xyz\AppData\Local\Temp\blockmgr-jhwshs-553a-472f-a3a3-dhgjhadasfasf
21/09/09 17:40:35 INFO MemoryStore: MemoryStore started with capacity 912.3 MiB
21/09/09 17:40:35 INFO SparkEnv: Registering OutputCommitCoordinator
21/09/09 17:40:36 INFO Utils: Successfully started service 'SparkUI' on port 4040.
21/09/09 17:40:36 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://localhost:4040
21/09/09 17:40:37 INFO Executor: Starting executor ID driver on host DESKTOP.am.corp.company.com
21/09/09 17:40:37 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 57092.
21/09/09 17:40:37 INFO NettyBlockTransferService: Server created on localhost:57092
21/09/09 17:40:37 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
21/09/09 17:40:37 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, localhost, 57092, None)
21/09/09 17:40:37 INFO BlockManagerMasterEndpoint: Registering block manager localhost:57092 with 912.3 MiB RAM, BlockManagerId(driver, localhost, 57092, None)
21/09/09 17:40:37 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, localhost, 57092, None)
21/09/09 17:40:37 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, localhost, 57092, None)
21/09/09 17:40:39 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('file:/D:/ARM_BI/Projects/DBTest/spark-warehouse').
21/09/09 17:40:39 INFO SharedState: Warehouse path is 'file:/D:/ARM_BI/Projects/DBTest/spark-warehouse'.
21/09/09 17:40:49 WARN ProcfsMetricsGetter: Exception when trying to compute pagesize, as a result reporting of ProcessTree metrics is stopped
java.sql.SQLException: [Simba][SparkJDBCDriver](500051) ERROR processing query/statement. Error Code: 0, SQL state: org.apache.hive.service.cli.HiveSQLException: Error running query: org.apache.spark.sql.AnalysisException: Invalid usage of '*' in expression 'unresolvedextractvalue'
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:623)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$2(SparkExecuteStatementOperation.scala:468)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties(SparkOperation.scala:134)
    at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties$(SparkOperation.scala:77)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:47)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:468)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:463)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2.run(SparkExecuteStatementOperation.scala:477)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.spark.sql.AnalysisException: Invalid usage of '*' in expression 'unresolvedextractvalue'
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.failAnalysis(CheckAnalysis.scala:52)
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.failAnalysis$(CheckAnalysis.scala:51)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.failAnalysis(Analyzer.scala:204)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$$anonfun$expandStarExpression$1.applyOrElse(Analyzer.scala:2215)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$$anonfun$expandStarExpression$1.applyOrElse(Analyzer.scala:2171)
    at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformUp$2(TreeNode.scala:349)
    at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:81)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:349)
    at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformUp$1(TreeNode.scala:346)
    at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$mapChildren$1(TreeNode.scala:415)
    at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:251)
    at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:413)
    at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:366)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:346)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$.expandStarExpression(Analyzer.scala:2171)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$.$anonfun$buildExpandedProjectList$1(Analyzer.scala:2156)
    at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:245)
    at scala.collection.immutable.List.foreach(List.scala:392)
    at scala.collection.TraversableLike.flatMap(TraversableLike.scala:245)
    at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:242)
    at scala.collection.immutable.List.flatMap(List.scala:355)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$.org$apache$spark$sql$catalyst$analysis$Analyzer$ResolveReferences$$buildExpandedProjectList(Analyzer.scala:2151)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$$anonfun$apply$13.applyOrElse(Analyzer.scala:1898)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$$anonfun$apply$13.applyOrElse(Analyzer.scala:1893)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUp$5(AnalysisHelper.scala:94)
    at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:81)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUp$1(AnalysisHelper.scala:94)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:225)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUp(AnalysisHelper.scala:86)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUp$(AnalysisHelper.scala:84)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUp(LogicalPlan.scala:29)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$.apply(Analyzer.scala:1893)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$.apply(Analyzer.scala:1718)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$2(RuleExecutor.scala:219)
    at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
    at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)
    at scala.collection.immutable.List.foldLeft(List.scala:89)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1(RuleExecutor.scala:216)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1$adapted(RuleExecutor.scala:208)
    at scala.collection.immutable.List.foreach(List.scala:392)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:208)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.org$apache$spark$sql$catalyst$analysis$Analyzer$$executeSameContext(Analyzer.scala:248)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:242)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:204)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$executeAndTrack$1(RuleExecutor.scala:186)
    at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.executeAndTrack(RuleExecutor.scala:186)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$executeAndCheck$1(Analyzer.scala:225)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.markInAnalyzer(AnalysisHelper.scala:232)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:224)
    at org.apache.spark.sql.execution.QueryExecution.$anonfun$analyzed$1(QueryExecution.scala:96)
    at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:132)
    at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:176)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:843)
    at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:176)
    at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:97)
    at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:94)
    at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:86)
    at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:102)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:843)
    at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:100)
    at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:678)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:843)
    at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:673)
    at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:566)
    ... 16 more
, Query: SELECT `SPARK_GEN_SUBQ_0`.`concept_id`, `SPARK_GEN_SUBQ_0`.`concept_name`, `SPARK_GEN_SUBQ_0`.`domain_id`, `SPARK_GEN_SUBQ_0`.`vocabulary_id`, `SPARK_GEN_SUBQ_0`.`concept_class_id`, `SPARK_GEN_SUBQ_0`.`invalid_reason` FROM (SELECT * FROM `rwd_omop_vocabulary_v5`.`concept` `rwd_omop_vocabulary_v5_concept`) `SPARK_GEN_SUBQ_0` WHERE 1=0.
    at com.simba.spark.hivecommon.api.HS2Client.pollForOperationCompletion(Unknown Source)
    at com.simba.spark.hivecommon.api.HS2Client.executeStatementInternal(Unknown Source)
    at com.simba.spark.hivecommon.api.HS2Client.executeStatement(Unknown Source)
    at com.simba.spark.hivecommon.dataengine.HiveJDBCNativeQueryExecutor.executeQuery(Unknown Source)
    at com.simba.spark.hivecommon.dataengine.HiveJDBCDSIExtQueryExecutor.execute(Unknown Source)
    at com.simba.spark.jdbc.common.SPreparedStatement.executeWithParams(Unknown Source)
    at com.simba.spark.jdbc.common.SPreparedStatement.executeQuery(Unknown Source)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:61)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation$.getSchema(JDBCRelation.scala:226)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:35)
    at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:355)
    at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:325)
    at org.apache.spark.sql.DataFrameReader.$anonfun$load$3(DataFrameReader.scala:307)
    at scala.Option.getOrElse(Option.scala:189)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:307)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:225)
Caused by: com.simba.spark.support.exceptions.GeneralException: [Simba][SparkJDBCDriver](500051) ERROR processing query/statement. Error Code: 0, SQL state: org.apache.hive.service.cli.HiveSQLException: Error running query: org.apache.spark.sql.AnalysisException: Invalid usage of '*' in expression 'unresolvedextractvalue'
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:623)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$2(SparkExecuteStatementOperation.scala:468)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties(SparkOperation.scala:134)
    at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties$(SparkOperation.scala:77)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:47)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:468)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:463)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2.run(SparkExecuteStatementOperation.scala:477)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.spark.sql.AnalysisException: Invalid usage of '*' in expression 'unresolvedextractvalue'
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.failAnalysis(CheckAnalysis.scala:52)
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.failAnalysis$(CheckAnalysis.scala:51)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.failAnalysis(Analyzer.scala:204)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$$anonfun$expandStarExpression$1.applyOrElse(Analyzer.scala:2215)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$$anonfun$expandStarExpression$1.applyOrElse(Analyzer.scala:2171)
    at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformUp$2(TreeNode.scala:349)
    at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:81)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:349)
    at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformUp$1(TreeNode.scala:346)
    at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$mapChildren$1(TreeNode.scala:415)
    at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:251)
    at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:413)
    at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:366)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:346)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$.expandStarExpression(Analyzer.scala:2171)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$.$anonfun$buildExpandedProjectList$1(Analyzer.scala:2156)
    at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:245)
    at scala.collection.immutable.List.foreach(List.scala:392)
    at scala.collection.TraversableLike.flatMap(TraversableLike.scala:245)
    at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:242)
    at scala.collection.immutable.List.flatMap(List.scala:355)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$.org$apache$spark$sql$catalyst$analysis$Analyzer$ResolveReferences$$buildExpandedProjectList(Analyzer.scala:2151)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$$anonfun$apply$13.applyOrElse(Analyzer.scala:1898)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$$anonfun$apply$13.applyOrElse(Analyzer.scala:1893)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUp$5(AnalysisHelper.scala:94)
    at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:81)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUp$1(AnalysisHelper.scala:94)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:225)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUp(AnalysisHelper.scala:86)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUp$(AnalysisHelper.scala:84)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUp(LogicalPlan.scala:29)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$.apply(Analyzer.scala:1893)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences$.apply(Analyzer.scala:1718)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$2(RuleExecutor.scala:219)
    at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
    at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)
    at scala.collection.immutable.List.foldLeft(List.scala:89)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1(RuleExecutor.scala:216)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1$adapted(RuleExecutor.scala:208)
    at scala.collection.immutable.List.foreach(List.scala:392)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:208)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.org$apache$spark$sql$catalyst$analysis$Analyzer$$executeSameContext(Analyzer.scala:248)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:242)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:204)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$executeAndTrack$1(RuleExecutor.scala:186)
    at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.executeAndTrack(RuleExecutor.scala:186)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$executeAndCheck$1(Analyzer.scala:225)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.markInAnalyzer(AnalysisHelper.scala:232)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:224)
    at org.apache.spark.sql.execution.QueryExecution.$anonfun$analyzed$1(QueryExecution.scala:96)
    at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:132)
    at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:176)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:843)
    at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:176)
    at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:97)
    at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:94)
    at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:86)
    at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:102)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:843)
    at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:100)
    at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:678)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:843)
    at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:673)
    at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694)
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:566)
    ... 16 more
, Query: SELECT `SPARK_GEN_SUBQ_0`.`concept_id`, `SPARK_GEN_SUBQ_0`.`concept_name`, `SPARK_GEN_SUBQ_0`.`domain_id`, `SPARK_GEN_SUBQ_0`.`vocabulary_id`, `SPARK_GEN_SUBQ_0`.`concept_class_id`, `SPARK_GEN_SUBQ_0`.`invalid_reason` FROM (SELECT * FROM `rwd_omop_vocabulary_v5`.`concept` `rwd_omop_vocabulary_v5_concept`) `SPARK_GEN_SUBQ_0` WHERE 1=0.
    ... 16 more
21/09/09 17:41:30 INFO SparkContext: Invoking stop() from shutdown hook
21/09/09 17:41:30 INFO SparkUI: Stopped Spark web UI at http://localhost:4040
21/09/09 17:41:30 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
21/09/09 17:41:30 INFO MemoryStore: MemoryStore cleared
21/09/09 17:41:30 INFO BlockManager: BlockManager stopped
21/09/09 17:41:30 INFO BlockManagerMaster: BlockManagerMaster stopped
21/09/09 17:41:30 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
21/09/09 17:41:30 INFO SparkContext: Successfully stopped SparkContext
21/09/09 17:41:30 INFO ShutdownHookManager: Shutdown hook called
21/09/09 17:41:30 INFO ShutdownHookManager: Deleting directory D:\Users\xyz\AppData\Local\Temp\spark-jsdaskjaf-d5db-42bc-b8b4-ashdgashfas

Can someone let me know how to fix this issue?


1 Answer


The query you provide with the query option is used to build a subquery of a generated select ... from (your_query) statement. It looks like you can't use * there; you need to specify the columns explicitly, as in the sketch below.
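
For example, a minimal sketch of the same read with the columns listed explicitly. The column names below are taken from the generated query visible in the stack trace above; adjust them to your actual table.

// Columns taken from the generated query in the stack trace (assumption: these match your table).
static final String QUERY = "select concept_id, concept_name, domain_id, "
    + "vocabulary_id, concept_class_id, invalid_reason from concept";

Dataset<Row> jdbcDF = spark.read()
                           .format("jdbc")
                           .option("url", DB_URL)
                           .option("query", QUERY)   // explicit column list instead of *
                           .option("user", USER)
                           .option("password", PASS)
                           .load();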

However, if all you are doing is select * from table_name, it is easier to use the dbtable option instead (see the documentation):

Dataset<Row> jdbcDF = spark.read()
                           .format("jdbc")
                           .option("url", DB_URL)
                           .option("dbtable", "concept")
                           .option("user", USER)
                           .option("password", PASS)
                           .load();
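
Note that if the table lives in a non-default database (the stack trace shows the query resolving to rwd_omop_vocabulary_v5.concept), you may need to qualify the name, e.g. .option("dbtable", "rwd_omop_vocabulary_v5.concept").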
answered 2021-09-10T12:59:45.563