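AWS Wrangler provides a convenient interface for consuming S3 objects as pandas DataFrames. I want to use it instead of a boto3 client, resource, or session when fetching objects. I also need to use SSL verification.
The following boto3 client code works with the SSL Aries root certificate (!):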
import awswrangler as wr
import boto3
import botocore.config
import os
aries_cert = os.environ['ARIES_CERT']
s3_session = boto3.Session(
    aws_access_key_id=os.environ['AWS_ACCESS_KEY_ID'],
    aws_secret_access_key=os.environ["AWS_SECRET_ACCESS_KEY"],
    region_name="us-east-1"
)
s3_client = s3_session.client(
    service_name="s3",
    endpoint_url="https://MY-ENDPOINT.com",
    use_ssl=True,
    verify=aries_cert,
    aws_access_key_id=os.getenv('AWS_ACCESS_KEY_ID'),
    aws_secret_access_key=os.getenv('AWS_SECRET_ACCESS_KEY'),
    config=botocore.config.Config(
        read_timeout=600,
        connect_timeout=600,
        retries={"max_attempts": 3}
    )
)
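# `path` is defined elsewhere. Strip the scheme before splitting:
# 's3://bucket/key'.split('/', 1) would otherwise yield 's3:' as the bucket.
path = path[len('s3://'):] if path.startswith('s3://') else path
bucket, prefix = path.split('/', 1)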
obj = s3_client.get_object(Bucket=bucket, Key=prefix)
# Do stuff with `obj['Body'].read()`
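The bucket/key handling above can be isolated into a small helper. This is only a minimal sketch: `split_s3_path` is a hypothetical name (not part of boto3 or awswrangler), and it assumes paths look like `s3://bucket/key` or `bucket/key`.

```python
from typing import Tuple


def split_s3_path(path: str) -> Tuple[str, str]:
    """Split 's3://bucket/key' or 'bucket/key' into (bucket, key)."""
    # Drop the scheme first so the partition below sees 'bucket/key'.
    if path.startswith("s3://"):
        path = path[len("s3://"):]
    # Everything before the first '/' is the bucket, the rest is the key.
    bucket, _, key = path.partition("/")
    if not bucket or not key:
        raise ValueError(f"expected 'bucket/key', got {path!r}")
    return bucket, key
```

The result could then be passed on as `s3_client.get_object(Bucket=bucket, Key=key)`.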
This AWS Wrangler code also works (without the TLS (SSL?) client certificate):
import awswrangler as wr
import boto3
import botocore
import os
wr.config.s3_endpoint_url = "https://MY-ENDPOINT.com"
session = boto3.Session(
    aws_access_key_id=os.environ['AWS_ACCESS_KEY_ID'],
    aws_secret_access_key=os.environ["AWS_SECRET_ACCESS_KEY"],
    region_name="us-east-1"
)
path = f's3://{path}' if not path.startswith('s3://') else path
df = wr.s3.read_parquet(
    path=path,
    dataset=True,
    boto3_session=session
)
But when I include the TLS (SSL?) client certificate, the read fails:
wr.config.botocore_config = botocore.config.Config(
    retries={"max_attempts": 3},
    connect_timeout=600,
    read_timeout=600,
    client_cert=os.getenv("ARIES_CERT")
)
df = wr.s3.read_parquet(
    path=path,
    dataset=True,
    boto3_session=session
)
Error message:
SSLError: SSL validation failed for https://MY-ENDPOINT.com/MY-BUCKET?list-type=2&prefix=MY-PREFIX-BLAH-BLAH.parquet%2F&max-keys=1000&encoding-type=url [SSL] PEM lib (_ssl.c:3524)
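The `PEM lib` part of the message is an OpenSSL error raised while loading a PEM file, so it may help to check what the file behind `ARIES_CERT` actually contains. botocore's `client_cert` option expects a client certificate (a path to the certificate, or a `(cert, key)` tuple when the key is in a separate file), whereas `verify` takes a CA bundle. The sketch below only inspects PEM block headers and does not validate anything; `pem_block_types` and `looks_like_client_cert` are hypothetical helper names.

```python
import re
from typing import List

# Matches headers such as "-----BEGIN CERTIFICATE-----" or
# "-----BEGIN RSA PRIVATE KEY-----" and captures the block type.
_PEM_HEADER = re.compile(r"-----BEGIN ([A-Z0-9 ]+)-----")


def pem_block_types(pem_text: str) -> List[str]:
    """Return the type of every PEM block found, in order of appearance."""
    return _PEM_HEADER.findall(pem_text)


def looks_like_client_cert(pem_text: str) -> bool:
    """True if the text holds both a certificate and a private key block."""
    types = pem_block_types(pem_text)
    has_cert = "CERTIFICATE" in types
    has_key = any(t.endswith("PRIVATE KEY") for t in types)
    return has_cert and has_key
```

If `pem_block_types(open(os.environ["ARIES_CERT"]).read())` contains only `CERTIFICATE` entries, the file is probably a root/CA certificate rather than a client certificate with its key, which would be consistent with it working as `verify=` but not as `client_cert=`.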
Any idea what is going on here? I have not found the AWS Wrangler documentation, or the boto3 and botocore documentation, very helpful:
https://aws-data-wrangler.readthedocs.io/en/latest/tutorials/002%20-%20Sessions.html
https://aws-data-wrangler.readthedocs.io/en/latest/tutorials/021%20-%20Global%20Configurations.html#21---Global-Configurations
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/core/session.html
https://botocore.amazonaws.com/v1/documentation/api/latest/reference/config.html
https://botocore.amazonaws.com/v1/documentation/api/latest/tutorial/index.html
As has been asked before, any intuition on how to use boto3 clients, resources, and sessions in different contexts would be much appreciated.
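On the certificate side, one avenue that may be worth checking (an assumption, not a confirmed fix): when no explicit `verify` argument is passed, botocore falls back to the CA bundle named by the `AWS_CA_BUNDLE` environment variable. Since the working boto3 snippet passes the root certificate through `verify`, pointing that variable at the same file before any client is created might give the awswrangler calls the same CA bundle. `configure_ca_bundle` below is a hypothetical helper name.

```python
import os
from typing import MutableMapping, Optional


def configure_ca_bundle(
    cert_path: Optional[str],
    environ: MutableMapping[str, str] = os.environ,
) -> bool:
    """Point botocore at a CA bundle via AWS_CA_BUNDLE.

    Returns True if the variable was set, False if no path was given.
    """
    if not cert_path:
        return False
    # botocore consults this variable when `verify` is not passed explicitly.
    environ["AWS_CA_BUNDLE"] = cert_path
    return True
```

It would be called as `configure_ca_bundle(os.getenv("ARIES_CERT"))` before `boto3.Session(...)` and `wr.s3.read_parquet(...)`, with `client_cert` left out of `wr.config.botocore_config`.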