I am trying to connect to Hive using Python (PyHive Lib) to read some data and then I further connects it to hive Flask to show in Dashboard.
It all works fine for few calls to hive, however soon after that I am getting following error.
Traceback (most recent call last):
File "libs/hive.py", line 63, in <module>
cur = h.connect().cursor()
File "libs/hive.py", line 45, in connect
kerberos_service_name='hive')
File "/home1/igns/git/emsr/.venv/lib/python2.7/site-packages/pyhive/hive.py", line 94, in connect
return Connection(*args, **kwargs)
File "/home1/igns/git/emsr/.venv/lib/python2.7/site-packages/pyhive/hive.py", line 192, in __init__
self._transport.open()
File "/home1/igns/git/emsr/.venv/lib/python2.7/site-packages/thrift_sasl/__init__.py", line 79, in open
message=("Could not start SASL: %s" % self.sasl.getError()))
thrift.transport.TTransport.TTransportException: Could not start SASL: Error in sasl_client_start (-1) SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (No Kerberos credentials available (default cache: FILE:/tmp/krb5cc_cdc995595290_51CD7j))
Following is my code
from pyhive import hive
class Hive(object):
def connect(self):
return hive.connect(host='hive.hadoop-prod.abc.com',
port=10000,
database='temp',
username='gaurang.shah',
auth='KERBEROS',
kerberos_service_name='hive')
if __name__ == '__main__':
h = Hive()
cur = h.connect().cursor()
cur.execute("select * from temp.migration limit 1")
res = cur.fetchall()
print res
Calling Script
source .venv/bin/activate
for i in {1..50}
do
python get_hive_data.py
sleep 300
done
Observation When it's working I can see hive in service principal when I do klist however, I don't when I see above error message.
Klist when it's working
Ticket cache: FILE:/tmp/krb5cc_cdc995595290_XyMnhu
Default principal: gaurang.shah@ABC.COM
Valid starting Expires Service principal
12/04/2018 14:37:28 12/05/2018 00:37:28 krbtgt/ABC.COM@ABC.COM
renew until 12/05/2018 14:37:24
12/04/2018 14:39:06 12/05/2018 00:37:28 hive/hive_server.ABC.COM@ABC.COM
renew until 12/05/2018 14:37:24
Klist when it's not working
Ticket cache: FILE:/tmp/krb5cc_cdc995595290_XyMnhu
Default principal: gaurang.shah@ABC.COM
Valid starting Expires Service principal
12/04/2018 14:37:28 12/05/2018 00:37:28 krbtgt/ABC.COM@ABC.COM
renew until 12/05/2018 14:37:24
Update:
So I don't think it's after certain call however, I think it's after certain time. ( I think one hour). I changed the sleep time to 3600 sec and just after first call I started getting error.
This is weird as, ticket for hive/hive_server.ABC.COM@ABC.COM
was valid for 7 days