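I am running a Python script that queries an AWS S3 bucket with S3 Select. I read a variable from a txt file and want to pass it into the S3 Select query. I would also like to count how many times the imported variable occurs (within a specified column) by querying an entire S3 directory rather than a single file.

This is what I have so far: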

import boto3
from boto3.session import Session

with open('txtfile.txt', 'r') as myfile:
    variable = myfile.read()

ACCESS_KEY='accessKey'
SECRET_KEY='secretKey'

session = Session(aws_access_key_id=ACCESS_KEY, aws_secret_access_key=SECRET_KEY)
s3b = session.client('s3')

r = s3b.select_object_content(
    Bucket='s3BucketName',
    Key='directory/fileName',
    ExpressionType='SQL',
    Expression="'select count(*)from S3Object s where s.columnName = %s;', [variable]",
    InputSerialization={'CSV': {"FileHeaderInfo": "Use"}},
    OutputSerialization={'CSV': {}},
)

for event in r['Payload']:
    if 'Records' in event:
        records = event['Records']['Payload'].decode('utf-8')
        print(records)
    elif 'Stats' in event:
        statsDetails = event['Stats']['Details']
        print("Stats details bytesScanned: ", statsDetails['BytesScanned'])

When I run this script, I get the following error:

Traceback (most recent call last):
  File "s3_query.py", line 20, in <module>
    OutputSerialization={'CSV': {}},
  File "/root/anaconda3/lib/python3.6/site-packages/botocore/client.py", line 314, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/root/anaconda3/lib/python3.6/site-packages/botocore/client.py", line 612, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (ParseUnexpectedToken) when calling the SelectObjectContent operation: Unexpected token found COMMA:',' at line 1, column 67.

1 Answer

This line looks odd:

Expression="'select count(*)from S3Object s where s.columnName = %s;', [variable]"

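This is neither valid SQL nor valid Python string formatting. The whole right-hand side is a single string literal, so %s is never substituted, and the inner quotes, the comma, and [variable] are sent to S3 Select as part of the query. That is where the unexpected COMMA token comes from.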
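A quick way to confirm this is to inspect the value Python actually builds, without calling AWS at all. This is only an illustrative check; the value assigned to `variable` is a made-up stand-in for whatever is read from txtfile.txt.

```python
# Hypothetical stand-in for the value read from txtfile.txt.
variable = "someValue"

# Exactly the expression from the question: one double-quoted string literal.
expression = "'select count(*)from S3Object s where s.columnName = %s;', [variable]"

# The result is a plain str; no formatting has happened.
print(type(expression).__name__)

# The placeholder and the trailing ", [variable]" are still literally in the text,
# so S3 Select receives a quoted string followed by a comma.
print("%s" in expression)
print(expression.endswith(", [variable]"))
```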

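You should probably substitute the value with the % operator instead, quoting it as a SQL string literal and stripping the trailing newline that read() leaves on it:

Expression="select count(*) from S3Object s where s.columnName = '%s'" % variable.strip()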
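The question also asks for a count across a whole directory. S3 Select runs against one object per call, so one possible approach is to list the keys under a prefix and add up the per-object counts. The sketch below assumes CSV objects with a header row, as in the question. The helper names (`build_count_expression`, `count_in_object`, `count_under_prefix`) are hypothetical, and doubling single quotes is the standard SQL escape, assumed here to be accepted by S3 Select.

```python
def build_count_expression(column, value):
    """Build a COUNT(*) query comparing `column` to `value` as a string literal."""
    # read() keeps the trailing newline from the text file, so strip whitespace.
    cleaned = value.strip()
    # Double any single quote so the value cannot end the SQL literal early
    # (assumption: S3 Select follows the standard SQL escaping rule).
    escaped = cleaned.replace("'", "''")
    return "select count(*) from S3Object s where s.{} = '{}'".format(column, escaped)


def count_in_object(client, bucket, key, expression):
    """Run the query against one object and return the count as an int."""
    response = client.select_object_content(
        Bucket=bucket,
        Key=key,
        ExpressionType='SQL',
        Expression=expression,
        InputSerialization={'CSV': {'FileHeaderInfo': 'Use'}},
        OutputSerialization={'CSV': {}},
    )
    chunks = []
    for event in response['Payload']:
        # Only Records events carry result bytes; Stats and End events are skipped.
        if 'Records' in event:
            chunks.append(event['Records']['Payload'])
    return int(b''.join(chunks).decode('utf-8').strip())


def count_under_prefix(client, bucket, prefix, column, value):
    """Sum the per-object counts for every key under `prefix`."""
    expression = build_count_expression(column, value)
    total = 0
    paginator = client.get_paginator('list_objects_v2')
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get('Contents', []):
            # Skip zero-byte "folder" placeholder keys.
            if obj['Key'].endswith('/'):
                continue
            total += count_in_object(client, bucket, obj['Key'], expression)
    return total
```

With the client from the question this would be called as count_under_prefix(s3b, 's3BucketName', 'directory/', 'columnName', variable). Each object is a separate request, so a large prefix means many calls.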
answered 2018-08-10T02:23:22.867