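Hi all. I want to connect to a Hive database in HDInsight using Python. I have followed several blogs and some Stack Overflow posts, but with no luck. Below are my attempts using the PyHive and JayDeBeApi libraries.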

Using JayDeBeApi

I have added the hive-jdbc-1.2.1, httpclient-4.4, and httpcore-4.4.4 jars to the current working directory, and I have installed thrift with pip install thrift. The code snippet is:

import jaydebeapi

conn = jaydebeapi.connect("org.apache.hive.jdbc.HiveDriver",
       "jdbc:hive2://shaktiman.database.windows.net:443/;ssl=true;transportMode=http;httpPath=/hive2",
       ['admin', 'Abcdeertyoiu@1234'],
       "hive-jdbc-1.2.1.jar")

cursor = conn.cursor()
cursor.execute("select * from default.hivesampletable limit 50")
print(cursor.description)  # prints the result set's schema
results = cursor.fetchall()

But I get the following error:

Traceback (most recent call last):
  File "ClassLoader.java", line 357, in java.lang.ClassLoader.loadClass
  File "Launcher.java", line 349, in sun.misc.Launcher$AppClassLoader.loadClass
  File "ClassLoader.java", line 424, in java.lang.ClassLoader.loadClass
  File "URLClassLoader.java", line 382, in java.net.URLClassLoader.findClass
java.lang.ClassNotFoundException: java.lang.ClassNotFoundException: org.apache.hive.service.cli.thrift.TCLIService$Iface

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "org.jpype.JPypeContext.java", line 330, in org.jpype.JPypeContext.callMethod
  File "Method.java", line 498, in java.lang.reflect.Method.invoke
  File "DelegatingMethodAccessorImpl.java", line 43, in sun.reflect.DelegatingMethodAccessorImpl.invoke
  File "NativeMethodAccessorImpl.java", line 62, in sun.reflect.NativeMethodAccessorImpl.invoke
  File "NativeMethodAccessorImpl.java", line -2, in sun.reflect.NativeMethodAccessorImpl.invoke0
  File "DriverManager.java", line 247, in java.sql.DriverManager.getConnection
  File "DriverManager.java", line 664, in java.sql.DriverManager.getConnection
  File "HiveDriver.java", line 105, in org.apache.hive.jdbc.HiveDriver.connect
Exception: Java Exception

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "D:/Learning Dir/PycharmProjects/Python/HdInsight-Hive/test.py", line 39, in <module>
    "hive-jdbc-1.2.1.jar")
  File "D:\Learning Dir\PycharmProjects\Python\venv\lib\site-packages\jaydebeapi\__init__.py", line 412, in connect
    jconn = _jdbc_connect(jclassname, url, driver_args, jars, libs)
  File "D:\Learning Dir\PycharmProjects\Python\venv\lib\site-packages\jaydebeapi\__init__.py", line 230, in _jdbc_connect_jpype
    return jpype.java.sql.DriverManager.getConnection(url, *dargs)
java.lang.NoClassDefFoundError: java.lang.NoClassDefFoundError: org/apache/hive/service/cli/thrift/TCLIService$Iface

I am not sure what the problem is.

I also tried PyHive, as shown below:

from pyhive import hive
conn = hive.connect('hn0-shaktiman-po.ttl4q3khoz5uvb1d4jopix3kbg.cx.internal.cloudapp.net', port=10000,auth='NOSASL')
cursor = conn.cursor()
cursor.execute('SHOW DATABASES')
cursor.fetchall()

But I still get an error:

"D:\Learning Dir\PycharmProjects\Python\venv\Scripts\python.exe" "D:/Learning Dir/PycharmProjects/Python/HdInsight-Hive/test2.py"
failed to resolve sockaddr for hn0-shaktiman-po.ttl4q3khoz5uvb1d4jopix3kbg.cx.internal.cloudapp.net:10000
Traceback (most recent call last):
  File "D:\Learning Dir\PycharmProjects\Python\venv\lib\site-packages\thrift\transport\TSocket.py", line 99, in open
    addrs = self._resolveAddr()
  File "D:\Learning Dir\PycharmProjects\Python\venv\lib\site-packages\thrift\transport\TSocket.py", line 42, in _resolveAddr
    socket.AI_PASSIVE | socket.AI_ADDRCONFIG)
  File "D:\Installation\Python\Python38-32\lib\socket.py", line 752, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno 11001] getaddrinfo failed
Traceback (most recent call last):
  File "D:\Learning Dir\PycharmProjects\Python\venv\lib\site-packages\thrift\transport\TSocket.py", line 99, in open
    addrs = self._resolveAddr()
  File "D:\Learning Dir\PycharmProjects\Python\venv\lib\site-packages\thrift\transport\TSocket.py", line 42, in _resolveAddr
    socket.AI_PASSIVE | socket.AI_ADDRCONFIG)
  File "D:\Installation\Python\Python38-32\lib\socket.py", line 752, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno 11001] getaddrinfo failed

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:/Learning Dir/PycharmProjects/Python/HdInsight-Hive/test2.py", line 2, in <module>
    conn = hive.connect('hn0-shaktiman-po.ttl4q3khoz5uvb1d4jopix3kbg.cx.internal.cloudapp.net', port=10000,auth='NOSASL')
  File "D:\Learning Dir\PycharmProjects\Python\venv\lib\site-packages\pyhive\hive.py", line 94, in connect
    return Connection(*args, **kwargs)
  File "D:\Learning Dir\PycharmProjects\Python\venv\lib\site-packages\pyhive\hive.py", line 192, in __init__
    self._transport.open()
  File "D:\Learning Dir\PycharmProjects\Python\venv\lib\site-packages\thrift\transport\TTransport.py", line 155, in open
    return self.__trans.open()
  File "D:\Learning Dir\PycharmProjects\Python\venv\lib\site-packages\thrift\transport\TSocket.py", line 103, in open
    raise TTransportException(type=TTransportException.NOT_OPEN, message=msg, inner=gai)
thrift.transport.TTransport.TTransportException: failed to resolve sockaddr for hn0-shaktiman-po.ttl4q3khoz5uvb1d4jopix3kbg.cx.internal.cloudapp.net:10000

Also, a few blogs suggested changing the HiveServer2 transport mode from "http" to "binary". I tried that, but it did not help either.

I would appreciate it if someone could suggest working code or a solution. Thanks in advance.

1 Answer

This looks like a configuration/network issue to me.

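  1. Verify connectivity from the host where the application is submitted to the HDI cluster (you can skip this if you submit from a head node inside HDI). Try using the IP address instead of hn0-shaktiman-po.ttl4q3khoz5uvb1d4jopix3kbg.cx.internal.cloudapp.net. You can get the IP address by running curl ifconfig.me inside the HDI cluster.
  2. Also test the connection with telnet. Try port 10001.
  3. Try changing the value of hive.server2.transport.mode in Ambari from http to binary.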
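
If telnet is not available on the client, the first two checks can be approximated from Python itself. The following is only a minimal sketch using the standard socket module: check_endpoint is a hypothetical helper name, not part of PyHive or thrift, and it tests only DNS resolution and TCP reachability, not whether HiveServer2 accepts the session.

```python
import socket


def check_endpoint(host, port, timeout=5.0):
    """Return (resolved_ip, reachable) for host:port.

    resolved_ip is None when the name cannot be resolved; reachable is
    True only if a TCP connection could be opened within the timeout.
    (check_endpoint is a hypothetical helper written for this answer.)
    """
    try:
        # Name lookup; a failure here corresponds to the
        # "getaddrinfo failed" error in the PyHive traceback.
        ip = socket.gethostbyname(host)
    except socket.gaierror:
        return None, False
    try:
        # Rough equivalent of `telnet host port`: open a TCP connection
        # and close it again immediately.
        with socket.create_connection((ip, port), timeout=timeout):
            return ip, True
    except OSError:
        # Refused, timed out, or unreachable.
        return ip, False
```

For example, calling check_endpoint with the head node name from the question and port 10000, then port 10001, shows which step fails. A result of (None, False) means the name does not resolve from that machine, which is what the traceback in the question suggests; an IP with False means the name resolves but the port is blocked or nothing is listening. In a default Hive configuration, 10000 is normally the binary Thrift port and 10001 the HTTP one, but the cluster's actual values should be confirmed in Ambari.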
Answered 2020-10-21T06:07:07.423