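Hi all. I want to connect to a Hive database in HDInsight using Python. I have followed several blogs and a few Stack Overflow posts, but with no luck. Below are my attempts using the pyhive and JayDeBeApi libraries.
Using JayDeBeApi
I added the hive-jdbc-1.2.1, httpclient-4.4, and httpcore-4.4.4 jars to the current working directory, and I installed thrift with pip install thrift. The code snippet is: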
import jaydebeapi
conn = jaydebeapi.connect("org.apache.hive.jdbc.HiveDriver",
                          "jdbc:hive2://shaktiman.database.windows.net:443/;ssl=true;transportMode=http;httpPath=/hive2",
                          ['admin', 'Abcdeertyoiu@1234'],
                          "hive-jdbc-1.2.1.jar")
cursor = conn.cursor()
cursor.execute("select * from default.hivesampletable limit 50")
print(cursor.description) # prints the result set's schema
results = cursor.fetchall()
But I get the following error:
Traceback (most recent call last):
File "ClassLoader.java", line 357, in java.lang.ClassLoader.loadClass
File "Launcher.java", line 349, in sun.misc.Launcher$AppClassLoader.loadClass
File "ClassLoader.java", line 424, in java.lang.ClassLoader.loadClass
File "URLClassLoader.java", line 382, in java.net.URLClassLoader.findClass
java.lang.ClassNotFoundException: java.lang.ClassNotFoundException: org.apache.hive.service.cli.thrift.TCLIService$Iface
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "org.jpype.JPypeContext.java", line 330, in org.jpype.JPypeContext.callMethod
File "Method.java", line 498, in java.lang.reflect.Method.invoke
File "DelegatingMethodAccessorImpl.java", line 43, in sun.reflect.DelegatingMethodAccessorImpl.invoke
File "NativeMethodAccessorImpl.java", line 62, in sun.reflect.NativeMethodAccessorImpl.invoke
File "NativeMethodAccessorImpl.java", line -2, in sun.reflect.NativeMethodAccessorImpl.invoke0
File "DriverManager.java", line 247, in java.sql.DriverManager.getConnection
File "DriverManager.java", line 664, in java.sql.DriverManager.getConnection
File "HiveDriver.java", line 105, in org.apache.hive.jdbc.HiveDriver.connect
Exception: Java Exception
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "D:/Learning Dir/PycharmProjects/Python/HdInsight-Hive/test.py", line 39, in <module>
"hive-jdbc-1.2.1.jar")
File "D:\Learning Dir\PycharmProjects\Python\venv\lib\site-packages\jaydebeapi\__init__.py", line 412, in connect
jconn = _jdbc_connect(jclassname, url, driver_args, jars, libs)
File "D:\Learning Dir\PycharmProjects\Python\venv\lib\site-packages\jaydebeapi\__init__.py", line 230, in _jdbc_connect_jpype
return jpype.java.sql.DriverManager.getConnection(url, *dargs)
java.lang.NoClassDefFoundError: java.lang.NoClassDefFoundError: org/apache/hive/service/cli/thrift/TCLIService$Iface
I am not sure what the problem is.
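One thing I have not ruled out: only hive-jdbc-1.2.1.jar is passed to jaydebeapi.connect, so the other jars in the working directory may never reach the JVM classpath, and the missing TCLIService class may live in a jar I have not added at all (possibly the hive-service jar, but that is a guess). As far as I can tell, the jars argument of jaydebeapi.connect accepts either a single file name or a sequence of file names. Below is a minimal sketch of collecting every jar in a directory so the whole list could be passed instead; collect_jars is a hypothetical helper name, not part of any library.

```python
from pathlib import Path


def collect_jars(directory="."):
    """Return the sorted paths of every .jar file directly inside `directory`."""
    return sorted(str(p) for p in Path(directory).glob("*.jar"))


if __name__ == "__main__":
    jars = collect_jars(".")
    print(jars)
    # Assumption: the resulting list would replace the single jar name, e.g.
    # jaydebeapi.connect(driver_class, url, [user, password], jars)
```

Printing the list first makes it easy to see which jars are actually present before the JVM starts.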
I also tried PyHive, as shown below:
from pyhive import hive
conn = hive.connect('hn0-shaktiman-po.ttl4q3khoz5uvb1d4jopix3kbg.cx.internal.cloudapp.net', port=10000,auth='NOSASL')
cursor = conn.cursor()
cursor.execute('SHOW DATABASES')
cursor.fetchall()
But I still get an error:
"D:\Learning Dir\PycharmProjects\Python\venv\Scripts\python.exe" "D:/Learning Dir/PycharmProjects/Python/HdInsight-Hive/test2.py"
failed to resolve sockaddr for hn0-shaktiman-po.ttl4q3khoz5uvb1d4jopix3kbg.cx.internal.cloudapp.net:10000
Traceback (most recent call last):
File "D:\Learning Dir\PycharmProjects\Python\venv\lib\site-packages\thrift\transport\TSocket.py", line 99, in open
addrs = self._resolveAddr()
File "D:\Learning Dir\PycharmProjects\Python\venv\lib\site-packages\thrift\transport\TSocket.py", line 42, in _resolveAddr
socket.AI_PASSIVE | socket.AI_ADDRCONFIG)
File "D:\Installation\Python\Python38-32\lib\socket.py", line 752, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno 11001] getaddrinfo failed
Traceback (most recent call last):
File "D:\Learning Dir\PycharmProjects\Python\venv\lib\site-packages\thrift\transport\TSocket.py", line 99, in open
addrs = self._resolveAddr()
File "D:\Learning Dir\PycharmProjects\Python\venv\lib\site-packages\thrift\transport\TSocket.py", line 42, in _resolveAddr
socket.AI_PASSIVE | socket.AI_ADDRCONFIG)
File "D:\Installation\Python\Python38-32\lib\socket.py", line 752, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno 11001] getaddrinfo failed
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:/Learning Dir/PycharmProjects/Python/HdInsight-Hive/test2.py", line 2, in <module>
conn = hive.connect('hn0-shaktiman-po.ttl4q3khoz5uvb1d4jopix3kbg.cx.internal.cloudapp.net', port=10000,auth='NOSASL')
File "D:\Learning Dir\PycharmProjects\Python\venv\lib\site-packages\pyhive\hive.py", line 94, in connect
return Connection(*args, **kwargs)
File "D:\Learning Dir\PycharmProjects\Python\venv\lib\site-packages\pyhive\hive.py", line 192, in __init__
self._transport.open()
File "D:\Learning Dir\PycharmProjects\Python\venv\lib\site-packages\thrift\transport\TTransport.py", line 155, in open
return self.__trans.open()
File "D:\Learning Dir\PycharmProjects\Python\venv\lib\site-packages\thrift\transport\TSocket.py", line 103, in open
raise TTransportException(type=TTransportException.NOT_OPEN, message=msg, inner=gai)
thrift.transport.TTransport.TTransportException: failed to resolve sockaddr for hn0-shaktiman-po.ttl4q3khoz5uvb1d4jopix3kbg.cx.internal.cloudapp.net:10000
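The getaddrinfo failure above suggests that the internal head node name cannot be resolved from my machine at all, which would mean the connection fails before PyHive or Hive is ever involved. Below is a minimal sketch of a standard-library check for that; can_resolve is a hypothetical helper name, and the default port simply mirrors the one used in my PyHive call.

```python
import socket


def can_resolve(host, port=10000):
    """Return True if `host` resolves to at least one address, else False."""
    try:
        return len(socket.getaddrinfo(host, port)) > 0
    except socket.gaierror:
        # Same exception type that appears in the traceback above.
        return False


# Example usage (the host name is the one from my PyHive attempt):
# can_resolve("hn0-shaktiman-po.ttl4q3khoz5uvb1d4jopix3kbg.cx.internal.cloudapp.net")
```

If this returns False, the host name is probably only visible from inside the cluster's virtual network, though that is my assumption rather than something I have confirmed.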
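In addition, a few blogs suggested changing the hiveserver2 transport mode from "http" to "binary". I tried that, but it did not help either.
I would appreciate it if someone could suggest working code or a solution. Thanks in advance.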