Has anyone gotten "S3 Select" (https://aws.amazon.com/blogs/aws/s3-glacier-select/, https://aws.amazon.com/about-aws/whats-new/2018/04/amazon-s3-select-is-now-generally-available/) to work with boto3 (or even the CLI or another SDK)? I get the cryptic InternalError shown below:

Running on EC2 with an IAM role:

[ec2-user@ip-blah bin]$ ./python
Python 2.7.13 (default, Jan 31 2018, 00:17:36)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-11)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import boto3
>>> s3 = boto3.client('s3')
>>> r = s3.select_object_content(
...         Bucket='mybucketname',
...         Key='mypath/file.txt',
...         ExpressionType='SQL',
...         Expression="select count(*) from s3object s",
...         InputSerialization = {'CSV': {"FileHeaderInfo": "Use"}},
...         OutputSerialization = {'CSV': {}},
... )
Traceback (most recent call last):
  File "<stdin>", line 7, in <module>
  File "/home/ec2-user/venv/local/lib/python2.7/site-packages/botocore/client.py", line 314, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/home/ec2-user/venv/local/lib/python2.7/site-packages/botocore/client.py", line 612, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (InternalError) when calling the SelectObjectContent operation (reached max retries: 4): We encountered an internal error. Please try again.

1 Answer


My guesses:

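  • Check the S3 permissions.
  • If needed, adjust 'RecordDelimiter', 'FieldDelimiter', and 'QuoteCharacter' in InputSerialization.
  • Check the structure of the CSV file (the number of headers matches the number of data columns, special characters are escaped, whitespace, \n is used as the newline, and so on).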

  • Try ... Expression="SELECT * FROM S3Object s", InputSerialization={'CSV': {}}, OutputSerialization={'CSV': {}}, ...
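
To make the last two suggestions concrete, here is a minimal sketch of a stripped-down call that also reads back the result. The helper names (build_select_kwargs, collect_records, run_select) are made up for illustration and are not part of boto3. The sketch assumes the response carries an event stream under 'Payload' whose 'Records' events hold raw bytes, as described in the boto3 documentation. I can't promise it gets rid of the InternalError, but it makes it easy to start from the simplest serialization and add CSV options one at a time. Also note that the boto3 documentation lists the FileHeaderInfo values in uppercase ('USE', 'IGNORE', 'NONE').

```python
def build_select_kwargs(bucket, key, expression="SELECT * FROM S3Object s",
                        csv_options=None):
    """Build keyword arguments for s3.select_object_content.

    csv_options may hold 'FileHeaderInfo', 'RecordDelimiter',
    'FieldDelimiter' or 'QuoteCharacter'; leaving it empty keeps the
    service defaults (hypothetical helper, not a boto3 function).
    """
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": expression,
        "InputSerialization": {"CSV": dict(csv_options or {})},
        "OutputSerialization": {"CSV": {}},
    }


def collect_records(response):
    """Join the 'Records' chunks of the event stream into one string."""
    chunks = []
    for event in response["Payload"]:
        # Other event types (for example 'Stats' and 'End') are skipped.
        if "Records" in event:
            chunks.append(event["Records"]["Payload"])
    return b"".join(chunks).decode("utf-8")


def run_select(bucket, key, **options):
    """Run the query against S3; needs boto3, credentials and a real object."""
    import boto3  # imported lazily so the helpers above work without it

    s3 = boto3.client("s3")
    response = s3.select_object_content(
        **build_select_kwargs(bucket, key, **options)
    )
    return collect_records(response)
```

For example, run_select('mybucketname', 'mypath/file.txt') tries the simplest form first, and run_select('mybucketname', 'mypath/file.txt', csv_options={'FieldDelimiter': ';'}) then overrides a single CSV setting.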

Hope this helps!

answered 2018-04-18T11:04:42.843