I was able to run your code snippet with minimal changes, as shown below:
import boto3

session = boto3.Session(profile_name='syumaK')
# Any clients created from this session will use credentials
# from the [syumaK] section of ~/.aws/credentials.

# Document
s3BucketName = 'syumaK-bucket'
documentName = 'document.pdf'

# Amazon Textract client
textract_client = session.client('textract')

# Call Amazon Textract (pass the variables, not their names as strings)
response = textract_client.start_document_text_detection(
    DocumentLocation={
        'S3Object': {
            'Bucket': s3BucketName,
            'Name': documentName
        }
    },
    JobTag='Receipt',
    NotificationChannel={
        'SNSTopicArn': 'arn:aws:sns:us-east-1:192xxxxxxxx:AmazonTextractTopic',
        'RoleArn': 'arn:aws:iam::192xxxxxxxx:role/AWSTextractRole'
    }
)

print(response)
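A successful call returns the ID of the job that was started (Textract publishes the completion status to the SNS topic later, when the job finishes). The output should look similar to: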
{'JobId': '83111ff3cf665225dde114f796972c3cbf9b663170e778c291666c7eaf57c5d0', 'ResponseMetadata': {'RequestId': 'b3ce4fc9-9a6e-4b9a-ba4b-c849b9763405', 'HTTPStatusCode': 200, 'HTTPHeaders': {'date': 'Sat, 11 Apr 2020 07:48:07 GMT', 'content-type': 'application/x-amz-json-1.1', 'content-length': '76', 'connection': 'keep-alive', 'x-amzn-requestid': 'b3ce4fc9-9a6e-4b9a-ba4b-c849b9763405'}, 'RetryAttempts': 0}}
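The JobId in that response is what you use to fetch the detected text later. As a rough sketch (not part of the original snippet), the helper below polls get_document_text_detection until the job leaves IN_PROGRESS and then collects the LINE blocks across all result pages. The name wait_for_lines and the delay and max_attempts parameters are my own inventions for illustration, and any final status other than SUCCEEDED is simply treated as an error here.

```python
import time


def wait_for_lines(client, job_id, delay=5, max_attempts=60):
    """Poll a Textract text-detection job and return its detected lines.

    Hypothetical helper: `client` is a boto3 Textract client (or any object
    exposing get_document_text_detection with the same response shape).
    """
    # Poll until the job is no longer IN_PROGRESS, or give up.
    for _ in range(max_attempts):
        page = client.get_document_text_detection(JobId=job_id)
        if page['JobStatus'] != 'IN_PROGRESS':
            break
        time.sleep(delay)
    else:
        raise TimeoutError(f'Job {job_id} is still in progress')

    # Assumption: anything other than SUCCEEDED is reported as a failure.
    if page['JobStatus'] != 'SUCCEEDED':
        raise RuntimeError(f"Job {job_id} ended with status {page['JobStatus']}")

    # Results are paginated; follow NextToken until it is absent.
    lines = []
    while True:
        lines += [b['Text'] for b in page.get('Blocks', [])
                  if b['BlockType'] == 'LINE']
        token = page.get('NextToken')
        if not token:
            return lines
        page = client.get_document_text_detection(JobId=job_id, NextToken=token)
```

With the snippet above you would call it as wait_for_lines(textract_client, response['JobId']). Polling is only the simplest thing to show; since a NotificationChannel is configured, reacting to the SNS message is probably the better choice for real workloads.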
Here are a few tips in case anyone else runs into this problem:
- Make sure you use valid values for SNSTopicArn and RoleArn
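A quick local check can catch a malformed ARN before the request is sent. The sketch below is only a shape check under simplifying assumptions: the regular expressions and the function name check_notification_channel are mine, they expect a 12-digit account ID, and they cannot tell you whether the topic or role actually exists or has the right permissions.

```python
import re

# Simplified patterns (assumption): they validate the general shape of the
# ARN only, not that the resource exists or is accessible.
SNS_TOPIC_ARN = re.compile(
    r'^arn:aws[a-z-]*:sns:[a-z0-9-]+:\d{12}:[A-Za-z0-9_-]{1,256}(\.fifo)?$')
IAM_ROLE_ARN = re.compile(
    r'^arn:aws[a-z-]*:iam::\d{12}:role/[\w+=,.@/-]+$')


def check_notification_channel(channel):
    """Return a list of problems found in a NotificationChannel dict."""
    problems = []
    if not SNS_TOPIC_ARN.match(channel.get('SNSTopicArn', '')):
        problems.append('SNSTopicArn does not look like an SNS topic ARN')
    if not IAM_ROLE_ARN.match(channel.get('RoleArn', '')):
        problems.append('RoleArn does not look like an IAM role ARN')
    return problems
```

An empty list means both values at least look right. The masked account ID shown in this answer (192xxxxxxxx) would be flagged, which is expected because it is a placeholder.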
If the document to be processed is inside an S3 folder, include the folder in the object name, as shown below:
DocumentLocation={
    'S3Object': {
        'Bucket': s3BucketName,
        'Name': 'folder/' + documentName
    }
},
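If you build this argument in more than one place, a small helper keeps the key construction consistent. This is just a sketch; build_document_location and its folder parameter are hypothetical names, not part of boto3. It relies on the fact that an S3 "folder" is simply a prefix of the object key.

```python
def build_document_location(bucket, name, folder=''):
    """Build a DocumentLocation dict for an object under an optional folder."""
    # Drop leading/trailing slashes so 'folder', 'folder/' and '/folder/'
    # all produce the same key.
    prefix = folder.strip('/')
    key = f'{prefix}/{name}' if prefix else name
    return {'S3Object': {'Bucket': bucket, 'Name': key}}
```

You would then pass DocumentLocation=build_document_location(s3BucketName, documentName, 'folder') to start_document_text_detection.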
Hope this helps.