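I am trying to connect to Athena using pyathenajdbc.connect(). My AWS credentials are set up with multi-factor authentication. When I don't include the AWS token in the connection string, I get the following error.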

athena_conn = connect(access_key=AWS_KEY_ID, secret_key=AWS_SECRET, s3_staging_dir='s3://abc-pqr-xyz/processed/athena-outputs/',region_name=REGION)

ERROR: pyathenajdbc.error.DatabaseError: The security token included in the request is invalid. (Service: AmazonAthena; Status Code: 400; Error Code: UnrecognizedClientException; Request ID: 0d488c0b-1eed-11e7-bad8-711e54af6b73)

When I do include the AWS token in the connection string, I get the following error:

athena_conn = connect(access_key=AWS_KEY_ID, secret_key=AWS_SECRET, token=AWS_SESSION_TOKEN, s3_staging_dir='s3://abc-pqr-xyz/processed/athena-outputs/',region_name=REGION)

ERROR: pyathenajdbc.error.DatabaseError: The security token included in the request is invalid. (Service: AmazonAthena; Status Code: 400; Error Code: UnrecognizedClientException; Request ID: 91751051-1eed-11e7-8347-153dfe3d84a6)

Does anyone know what is wrong here?

Here is my full code.

from pyathenajdbc import connect
from pyathenajdbc.util import as_pandas
from boto3 import Session
import jpype
jvm_path = jpype.getDefaultJVMPath()

_current_credentials = Session().get_credentials()
AWS_KEY_ID = _current_credentials.access_key
AWS_SECRET = _current_credentials.secret_key
AWS_SESSION_TOKEN = _current_credentials.token
REGION = "us-east-2"

#athena_conn = connect(access_key=AWS_KEY_ID, secret_key=AWS_SECRET, s3_staging_dir='s3://abc-pqr-xyz/processed/athena-outputs/',region_name=REGION)

athena_conn = connect(access_key=AWS_KEY_ID, secret_key=AWS_SECRET, token=AWS_SESSION_TOKEN, s3_staging_dir='s3://abc-pqr-xyz/processed/athena-outputs/',region_name=REGION)

cursor = athena_conn.cursor()
query = 'SELECT * FROM xyz.ABC  limit 1;'
cursor.execute(query)
df = as_pandas(cursor)
print(df)

3 Answers

from pyathenajdbc import connect
from pyathenajdbc.util import as_pandas
from boto3 import Session
import os

_current_credentials = Session().get_credentials()

os.environ['AWS_ACCESS_KEY_ID'] = _current_credentials.access_key
os.environ['AWS_SECRET_ACCESS_KEY'] = _current_credentials.secret_key
os.environ['AWS_SESSION_TOKEN'] = _current_credentials.token


athena_conn = connect(s3_staging_dir='s3://your-bucket/',
           region_name='us-west-2',
           aws_credentials_provider_class='com.amazonaws.athena.jdbc.shaded.com.amazonaws.auth.EnvironmentVariableCredentialsProvider')

cursor = athena_conn.cursor()
query = 'SELECT * FROM schema.table_name limit 1;'
cursor.execute(query)
df = as_pandas(cursor)
print(df)
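
One caveat with the snippet above: os.environ only accepts strings, so assigning the token raises a TypeError when the session holds long-term keys and the token is None. A minimal sketch of a guard follows. The name export_credentials is a hypothetical helper, not part of pyathenajdbc or boto3; the environment variable names are the ones already used above.

```python
import os


def export_credentials(access_key, secret_key, token=None, env=None):
    """Copy AWS credentials into an environment mapping (os.environ by default).

    Hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env
    env["AWS_ACCESS_KEY_ID"] = access_key
    env["AWS_SECRET_ACCESS_KEY"] = secret_key
    if token:
        env["AWS_SESSION_TOKEN"] = token
    else:
        # Long-term keys have no token; drop any stale one left in the mapping.
        env.pop("AWS_SESSION_TOKEN", None)
    return env
```

Passing a plain dict as env makes it easy to check what would be exported before touching the real environment.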
Answered 2017-04-15T08:20:44.977

Assuming you have a config file under ~/.aws that defines a region, you can use Session().region_name.

The following works fine (no need to import os):

from pyathenajdbc import connect
from pyathenajdbc.util import as_pandas
from boto3 import Session
import jpype
jvm_path = jpype.getDefaultJVMPath()

_current_credentials = Session().get_credentials()
AWS_KEY_ID = _current_credentials.access_key
AWS_SECRET = _current_credentials.secret_key
REGION = Session().region_name

athena_conn = connect(access_key=AWS_KEY_ID,
               secret_key=AWS_SECRET,
               s3_staging_dir='path_to_staging_dir',
               region_name=REGION)

cursor = athena_conn.cursor()

query = 'SELECT current_date;'

cursor.execute(query)
df = as_pandas(cursor)
print(df)
Answered 2017-05-10T19:42:55.697

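This problem is not straightforward, but my guess is that it is related to your credentials. You should investigate a bit: try printing your keys and verify that they are valid.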
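
Printing raw keys leaves secrets in your terminal or logs, so it may help to mask them first. A minimal sketch follows; mask and describe_credentials are hypothetical helper names. The only AWS-specific fact it relies on is that temporary access key IDs issued by STS start with ASIA, while long-term IAM user keys start with AKIA.

```python
def mask(value, visible=4):
    """Show only the last few characters so a key can be printed safely."""
    if not value:
        return "<missing>"
    return "*" * max(len(value) - visible, 0) + value[-visible:]


def describe_credentials(access_key, secret_key, token):
    """Return a list of human-readable findings about a credential triple."""
    findings = [
        "access key: " + mask(access_key),
        "secret key: " + mask(secret_key),
        "session token: " + ("present" if token else "absent"),
    ]
    # ASIA is the documented prefix for temporary (STS) access keys,
    # which are only valid together with their session token.
    if access_key and access_key.startswith("ASIA") and not token:
        findings.append("temporary access key without a session token")
    return findings
```

You could call describe_credentials with the three values taken from Session().get_credentials() and print each finding on its own line.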

Here is an alternative way I use to load the credentials:

import configparser
import os

aws_config_file = '~/.aws/config'

Config = configparser.ConfigParser()
Config.read(os.path.expanduser(aws_config_file))

access_key_id = Config['default']['aws_access_key_id']
secret_key_id = Config['default']['aws_secret_access_key']
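
If the profile holds temporary credentials (as is typical with MFA), it may also carry an aws_session_token, which the snippet above does not read. A sketch that reads all three values from an INI-style file follows; read_profile is a hypothetical helper, and which file actually holds the keys (~/.aws/credentials or ~/.aws/config) depends on your setup.

```python
import configparser
import os


def read_profile(path, profile="default"):
    """Read an access key, secret key and optional session token from an INI file."""
    parser = configparser.ConfigParser()
    # read() silently skips missing files, so check that something was loaded.
    if not parser.read(os.path.expanduser(path)):
        raise FileNotFoundError(path)
    section = parser[profile]  # raises KeyError if the profile is absent
    return {
        "access_key": section["aws_access_key_id"],
        "secret_key": section["aws_secret_access_key"],
        # Absent for long-term keys, so fall back to None.
        "token": section.get("aws_session_token", fallback=None),
    }
```

The returned token can then be passed to connect() only when it is not None.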

Otherwise, just to make sure the problem is not related to the JDBC driver, paste the output of the following commands:

import pyathenajdbc 

print(pyathenajdbc.ATHENA_CONNECTION_STRING)
print(pyathenajdbc.ATHENA_DRIVER_CLASS_NAME)
print(pyathenajdbc.ATHENA_DRIVER_DOWNLOAD_URL)
print(pyathenajdbc.ATHENA_JAR)
Answered 2017-04-12T12:42:42.063