Questions tagged [amazon-textract]



163 questions

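0 votes

3 answers

1753 views

python - AWS: starting document analysis with Textract does not work

I am working on a school project in which I am supposed to use Textract to run document analysis on forms and pass the output to A2I, where an algorithm determines whether the form is approved, rejected, or needs review. This Textract Lambda function should be triggered after a document is uploaded to S3. However, when I follow this documentation, I get a syntax error: https://docs.aws.amazon.com/textract/latest/dg/API_StartDocumentAnalysis.html

My code is:

The code is not finished yet, but I am already getting a syntax error.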
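The question's own code did not survive on this page, so the following is only a minimal sketch of the flow it describes: an S3 upload event arrives in a Lambda function, which starts an asynchronous Textract analysis. The helper name `build_start_analysis_request` and the choice of `FORMS` as the only feature type are assumptions made for illustration; `start_document_analysis`, `DocumentLocation`, and `FeatureTypes` come from the boto3 Textract client.

```python
from urllib.parse import unquote_plus


def build_start_analysis_request(event, feature_types=("FORMS",)):
    """Build keyword arguments for StartDocumentAnalysis from an S3 event.

    Hypothetical helper: only the first record of the event is used.
    """
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    # Object keys in S3 event notifications are URL-encoded
    # (for example, a space arrives as "+"), so decode before use.
    key = unquote_plus(record["object"]["key"])
    return {
        "DocumentLocation": {"S3Object": {"Bucket": bucket, "Name": key}},
        "FeatureTypes": list(feature_types),
    }


def lambda_handler(event, context):
    # Imported lazily so the helper above can be exercised without AWS access.
    import boto3

    textract = boto3.client("textract")
    response = textract.start_document_analysis(
        **build_start_analysis_request(event)
    )
    # The job runs asynchronously; the JobId is needed to fetch results later.
    return {"JobId": response["JobId"]}
```

Keeping the request construction in a plain function makes it possible to check the dictionary shape locally before deploying the handler.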

2020-07-14T14:48:14.197

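0 votes

0 answers

355 views

python - Getting a KeyError in AWS Textract code. What should I do?

This is the error I get from the logs:

I am trying to run form analysis through Textract to extract data from a form and save it to S3 as a .csv file. My code is as follows:

The documentation I am following is: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/textract.html#Textract.Client.get_document_analysis

Any advice would be greatly appreciated! Thanks.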
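The error text and code are missing here, so the cause of the KeyError is unknown. One pattern worth ruling out is reading `Blocks` from a `get_document_analysis` response before the job has finished, or reading only the first page of results. The sketch below assumes a hypothetical callable `get_page(next_token)` that wraps the real client call; `JobStatus`, `Blocks`, and `NextToken` are fields of the GetDocumentAnalysis response.

```python
def collect_blocks(get_page):
    """Gather Blocks from every page of a finished Textract analysis job.

    get_page(next_token) is a hypothetical callable that returns one
    GetDocumentAnalysis response dict; next_token is None for the first page.
    """
    blocks, token = [], None
    while True:
        page = get_page(token)
        status = page.get("JobStatus")
        if status != "SUCCEEDED":
            # A job that is still IN_PROGRESS (or has FAILED) may carry no
            # Blocks, so stop here instead of indexing into the response.
            raise RuntimeError(f"job not ready, JobStatus={status!r}")
        blocks.extend(page.get("Blocks", []))
        token = page.get("NextToken")
        if not token:
            return blocks


# With boto3, get_page could be written along these lines (job_id assumed):
#
#     def get_page(token):
#         kwargs = {"JobId": job_id}
#         if token:
#             kwargs["NextToken"] = token
#         return textract.get_document_analysis(**kwargs)
```

This sketch treats only `SUCCEEDED` as ready; a caller that wants to accept partial results would need to handle `PARTIAL_SUCCESS` separately.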

python amazon-web-services aws-lambda amazon-textract

2020-07-15T06:24:21.553

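0 votes

1 answer

450 views

python - How to use AWS Textract with Python

I have tested almost every Amazon Textract code sample I could find on the Internet, but I cannot get any of them to work. I can upload files to and download files from S3 with my Python client, so the credentials should be fine. Many of the errors point to some kind of region problem, but I have tried every possible combination.

Here is one of the last test calls:

It seems simple enough, but it produces an error:

Any ideas what is wrong, and does anyone have a working example? (I know the tabs in the sample code are not correct.)

I have also tested a lot of permission settings in AWS. The credentials are in the hidden file created by the AWS SDK.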

python amazon-textract

2020-07-28T17:15:06.383

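0 votes

1 answer

2098 views

python - InvalidS3ObjectException: Unable to get object metadata from S3?

I am trying to use Amazon Textract to read multiple multi-page PDF files with the StartDocumentTextDetection method, as follows:

When I simply retrieve the response object from S3, I can see that it prints as:

Accordingly, I later use s3_file.key to access the object. But I get the following error, which I cannot figure out:

InvalidS3ObjectException: An error occurred (InvalidS3ObjectException) when calling the StartDocumentTextDetection operation: Unable to get object metadata from S3. Check object key, region and/or access permissions.

So far, I have:

Checked the region: the boto3 session, the bucket, and the AWS config settings are all set to us-east-2.
Checked the key: it cannot be wrong, because I pass it directly from the object response.
Checked permissions: in the IAM console, they are set to AmazonS3FullAccess and AmazonTextractFullAccess.

What could be going wrong here?

[Edit] I did rename the files so that they no longer contain \\, but it still does not seem to work, which is strange.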
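The error message names three suspects: key, region, and permissions. As a minimal sketch of checking the region programmatically rather than by eye, the helper below compares the bucket's region with the region the Textract client is configured for. The function names are made up for illustration; `get_bucket_location` and `client.meta.region_name` are existing boto3 features. A matching region does not rule out the other two causes.

```python
def bucket_region(s3_client, bucket):
    """Return the region a bucket lives in, using S3 GetBucketLocation."""
    constraint = s3_client.get_bucket_location(Bucket=bucket)["LocationConstraint"]
    # GetBucketLocation reports buckets in us-east-1 with a null constraint.
    return constraint or "us-east-1"


def same_region(s3_client, textract_client, bucket):
    """True when the Textract client targets the region that holds the bucket."""
    return bucket_region(s3_client, bucket) == textract_client.meta.region_name


# Possible usage, assuming credentials are configured and "my-bucket" exists:
#
#     import boto3
#     s3 = boto3.client("s3")
#     textract = boto3.client("textract", region_name="us-east-2")
#     print(same_region(s3, textract, "my-bucket"))
```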

python amazon-web-services amazon-s3 boto3 amazon-textract

2020-08-31T15:27:56.737

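0 votes

1 answer

125 views

amazon-web-services - Amazon Textract is not reading selected checkbox fields

I am trying to read the attached PDF file with Textract, but it does not read the checkboxes as key-value pair fields. It just reads them as raw data. For example, I am interested in the value of question 10a on page 3. I expect the key to be "10a. Per: (Choose only one)*" and the value to be the checkbox selection state. But it only reads it as raw text, and I cannot find out whether 10a is checked or unchecked.

Has anyone faced this issue? Please let me know.

I have attached a screenshot of the AWS Textract output and a link to the PDF.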


PDF file
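For context, when form analysis (the `FORMS` feature type) does recognize a checkbox, the response links a `KEY_VALUE_SET` key block to a value block whose child is a `SELECTION_ELEMENT` carrying a `SelectionStatus`. The sketch below shows one way to walk those relationships; the function names are illustrative, and whether Textract detects a particular checkbox in a particular PDF is not something this code can influence.

```python
def _related(block, by_id, rel_type):
    """Return the blocks referenced by one relationship type (CHILD or VALUE)."""
    ids = []
    for rel in block.get("Relationships", []):
        if rel["Type"] == rel_type:
            ids.extend(rel["Ids"])
    return [by_id[i] for i in ids if i in by_id]


def _text(block, by_id):
    """Join the words under a block; checkboxes contribute their status."""
    parts = []
    for child in _related(block, by_id, "CHILD"):
        if child["BlockType"] == "WORD":
            parts.append(child["Text"])
        elif child["BlockType"] == "SELECTION_ELEMENT":
            # SelectionStatus is SELECTED or NOT_SELECTED.
            parts.append(child["SelectionStatus"])
    return " ".join(parts)


def extract_form_fields(blocks):
    """Map each detected form key to its value text or checkbox status."""
    by_id = {b["Id"]: b for b in blocks}
    fields = {}
    for block in blocks:
        is_key = (
            block["BlockType"] == "KEY_VALUE_SET"
            and "KEY" in block.get("EntityTypes", [])
        )
        if is_key:
            values = _related(block, by_id, "VALUE")
            fields[_text(block, by_id)] = " ".join(_text(v, by_id) for v in values)
    return fields
```

If a key such as 10a never appears in the returned dictionary, the checkbox was not detected as a form field, and only the raw `LINE` and `WORD` blocks are available for it.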

amazon-web-services pdf checkbox amazon-textract

2020-09-12T08:40:38.470

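0 votes

0 answers

769 views

amazon-web-services - Internal error: Unable to get object metadata from S3. Check object key, region and/or access permissions in aws Textract awssdk.core

I am trying to run a document analysis request using an S3 bucket, but it gives me an internal error. I am extracting table values from documents. Here is my code. Note that I am using the AWS SDK for .NET.

Error message:

Internal error: Unable to get object metadata from S3. Check object key, region and/or access permissions in aws Textract awssdk.core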

amazon-web-services amazon-s3 aws-sdk amazon-textract

2020-09-23T07:15:21.327

0 votes

1 answer

2866 views

python - Using Textract for OCR locally

I want to extract text from images using Python. (The Tesseract library does not work for me because it requires installation.)

I have found the boto3 library and Textract, but I'm having trouble working with them. I'm still new to this. Can you tell me what I need to do in order to run my script correctly?

This is my code:

When I run this code, I get:

I have also tried this:

But I get this error:

I'm a noob at this, so any help would be good. How can I read text from my image or PDF file?

I have also added this block of code, but the error is still Unable to locate credentials.
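A minimal sketch of sending a local image to Textract as raw bytes, with no S3 bucket involved. The helper names are made up for illustration; `detect_document_text` and its `Document={"Bytes": ...}` parameter are part of the boto3 Textract client. Note that Textract is a hosted service, so the call still needs AWS credentials even though the file is local.

```python
from pathlib import Path


def build_detect_text_request(image_path):
    """Keyword arguments for DetectDocumentText using local file bytes."""
    return {"Document": {"Bytes": Path(image_path).read_bytes()}}


def lines_from_response(response):
    """Pull the text of every LINE block out of a Textract response."""
    return [
        block["Text"]
        for block in response.get("Blocks", [])
        if block["BlockType"] == "LINE"
    ]


# Possible usage (file name and region are placeholders). boto3 looks for
# credentials in the environment variables AWS_ACCESS_KEY_ID and
# AWS_SECRET_ACCESS_KEY, or in the ~/.aws/credentials file; "Unable to locate
# credentials" means it found neither.
#
#     import boto3
#     textract = boto3.client("textract", region_name="us-east-1")
#     response = textract.detect_document_text(
#         **build_detect_text_request("scan.png")
#     )
#     print("\n".join(lines_from_response(response)))
```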

python amazon-web-services amazon-textract

2020-09-24T10:57:46.323

0 votes

0 answers

292 views

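amazon-web-services - How do I specify the key-value pairs I need when calling the AWS Textract API?

Does Textract have predefined key-value pairs, or is it possible to define the key-value pairs myself?

For example, it can extract key-value pairs such as a name. But can I ask it to do the same for a new key, "ABCD"?

If so, how?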

amazon-web-services ocr key-value text-extraction amazon-textract

2020-09-27T16:08:03.913

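0 votes

1 answer

116 views

amazon-textract - Unable to start a human loop with Augmented AI - error in start_human_loop

I am trying to trigger a human workflow from a piece of Python code. This includes a human review of Textract output.

The code snippet is as follows:

When I run it, an exception occurs: botocore.exceptions.ParamValidationError: Parameter validation failed: Invalid type for parameter HumanLoopInput

Can anyone help me with an example of HumanLoopInput? Its configuration is already done in the analyze_document() function (HumanLoopConfig). Is there any other way to avoid this exception?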
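As a hedged illustration of the parameter shape: in the boto3 `sagemaker-a2i-runtime` client, `start_human_loop` expects `HumanLoopInput` to be a dictionary with a single `InputContent` entry, and that entry is a JSON-encoded string. Passing a bare string or an arbitrary dict is one way to get an "Invalid type" validation error. The helper name and the payload keys below are hypothetical; what the payload must contain depends on the worker task template of the flow definition.

```python
import json


def build_human_loop_request(loop_name, flow_definition_arn, payload):
    """Keyword arguments for StartHumanLoop (sagemaker-a2i-runtime client)."""
    return {
        "HumanLoopName": loop_name,
        "FlowDefinitionArn": flow_definition_arn,
        # HumanLoopInput is a dict; InputContent inside it is a JSON string,
        # not a nested dict.
        "HumanLoopInput": {"InputContent": json.dumps(payload)},
    }


# Possible usage (the ARN and payload are placeholders):
#
#     import boto3
#     a2i = boto3.client("sagemaker-a2i-runtime")
#     a2i.start_human_loop(
#         **build_human_loop_request(
#             "review-0001",
#             "arn:aws:sagemaker:...:flow-definition/...",
#             {"taskObject": "s3://my-bucket/form.png"},
#         )
#     )
```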

amazon-textract

2020-10-05T08:18:59.717

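0 votes

1 answer

779 views

python - Processing a PDF with AWS Textract

I want to read text from PDF files using the Textract OCR service. My problem is that I want to do it locally, without an S3 bucket. I tested it with image files and it works fine, but it does not work with PDF files.

Here is the code where I get the error:

Error:

Code 2:

Error:

Code 3:

Error:

What should I do? Is there a way to make Textract work on PDF documents without S3?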
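Since images work for the asker and PDFs do not, one small building block is telling the two apart from the file contents rather than the extension, so that the script can route PDFs differently (for example, rasterizing pages to images first, or uploading to S3 for the asynchronous API). The sketch below only inspects well-known file signatures; the function name is made up, and which formats the byte-based Textract calls accept is documented by AWS and not assumed here.

```python
def sniff_document_type(data):
    """Classify raw file bytes as pdf, png, jpeg, or unknown by signature."""
    if data.startswith(b"%PDF-"):
        return "pdf"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    return "unknown"


# Possible usage (file name is a placeholder):
#
#     with open("input.bin", "rb") as handle:
#         kind = sniff_document_type(handle.read(16))
#     if kind == "pdf":
#         ...  # convert pages to images, or use the S3-based async calls
```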

python ocr amazon-textract

2020-10-08T10:52:57.547
