
I am trying to connect from my local Mac to a Presto DB installed on a remote server using PySpark; my code is below. I downloaded the Presto driver and placed it under /user/name//Hadoop/spark-2.3.1-bin-hadoop2.7/jars (I suspect this is where I went wrong, but I am not sure).

from pyspark.sql import SparkSession, HiveContext
from pyhive import presto, hive


def main():

    spark = SparkSession.builder\
        .appName("tests")\
        .enableHiveSupport()\
        .getOrCreate()

    df_presto = spark.read.format("jdbc") \
        .option("driver", "io.prestosql.jdbc.PrestoDriver") \
        .option("url", "jdbc:presto://host.com:443/hive") \
        .option("user", "user_name") \
        .option("password", "password") \
        .option("dbtable", "(select column from table_name limit 10) tmp") \
        .load()

Presto driver: presto-jdbc-340.jar
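One alternative to copying the jar into Spark's jars folder is to pass it to the session explicitly via the `spark.jars` config. The sketch below is a suggestion, not a confirmed fix: the jar path is a placeholder, and `presto_jdbc_options` is a hypothetical helper I introduce just to keep the JDBC options in one place.

```python
# Hypothetical helper: collect the JDBC options in one dict so they can be
# reused and inspected. Host, credentials, and query are placeholders.
def presto_jdbc_options(host, user, password, query):
    return {
        "driver": "io.prestosql.jdbc.PrestoDriver",
        "url": f"jdbc:presto://{host}:443/hive",
        "user": user,
        "password": password,
        # Spark treats a parenthesized query with an alias as a table
        "dbtable": f"({query}) tmp",
    }


# Sketch of wiring it into the session (requires pyspark; jar path is an
# assumption -- point it at wherever presto-jdbc-340.jar actually lives):
#
# from pyspark.sql import SparkSession
# spark = SparkSession.builder \
#     .appName("tests") \
#     .config("spark.jars", "/path/to/presto-jdbc-340.jar") \
#     .getOrCreate()
# df = spark.read.format("jdbc") \
#     .options(**presto_jdbc_options("host.com", "user_name", "password",
#                                    "select column from table_name limit 10")) \
#     .load()
```

Passing the jar through `spark.jars` avoids depending on the contents of the Spark installation directory, which can help when the local and remote classpaths differ.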

When I try to execute the code, I get an error like the following:

 Traceback (most recent call last):
  File "/Users/user_name/Hadoop/spark-2.3.1-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/sql/utils.py", line 63, in deco
  File "/Users/user_name/Hadoop/spark-2.3.1-bin-hadoop2.7/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o38.load.
: org.apache.spark.sql.AnalysisException: java.lang.RuntimeException: java.lang.IllegalArgumentException: java.net.UnknownHostException: ip-10-120-99-149.ec2.internal;
    at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:106)
    at org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:194)

Any idea how to fix this?
