I extracted the data returned by an AWS Textract function. The response from this Textract function has the following structure:
{
  "AnalyzeDocumentModelVersion": "string",
  "Blocks": [
    {
      "BlockType": "string",
      "ColumnIndex": number,
      "ColumnSpan": number,
      "Confidence": number,
      "EntityTypes": [ "string" ],
      "Geometry": {
        "BoundingBox": {
          "Height": number,
          "Left": number,
          "Top": number,
          "Width": number
        },
        "Polygon": [
          {
            "X": number,
            "Y": number
          }
        ]
      },
      "Id": "string",
      "Page": number,
      "Relationships": [
        {
          "Ids": [ "string" ],
          "Type": "string"
        }
      ],
      "RowIndex": number,
      "RowSpan": number,
      "SelectionStatus": "string",
      "Text": "string"
    }
  ],
  "DocumentMetadata": {
    "Pages": number
  },
  "JobStatus": "string",
  "NextToken": "string",
  "StatusMessage": "string",
  "Warnings": [
    {
      "ErrorCode": "string",
      "Pages": [ number ]
    }
  ]
}
I extracted the blocks from this data with the following code:
var d = null;
...<Some Code Here>...
d = data.Blocks;
console.log(d);
This outputs an array of JSON objects. Here is a sample of the extracted output:
[...{ BlockType: 'WORD',
Confidence: 99.7286376953125,
Text: '2000.00',
Geometry: { BoundingBox: [Object], Polygon: [Array] },
Id: '<ID here>',
Page: 1 }, ...]
I want to extract only the Text field and have that as the sole output. How should I get started?
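To make the goal concrete, here is a minimal sketch of the kind of result I am after, assuming the Blocks array has the shape shown above. The sample `blocks` data and the helper name `extractText` are hypothetical and only for illustration; they are not part of the AWS SDK. It filters on `BlockType` because blocks such as PAGE carry no `Text`, and because LINE blocks would otherwise repeat the text of their WORD blocks.

```javascript
// Hypothetical sample data shaped like the Blocks array shown above.
const blocks = [
  { BlockType: 'PAGE', Id: 'p1', Page: 1 },
  { BlockType: 'LINE', Text: 'Total 2000.00', Id: 'l1', Page: 1 },
  { BlockType: 'WORD', Text: 'Total', Id: 'w1', Page: 1 },
  { BlockType: 'WORD', Text: '2000.00', Id: 'w2', Page: 1 },
];

// extractText is a hypothetical helper, not an AWS SDK function.
// It keeps only blocks of the requested type that actually have a
// string Text field, then returns just those Text values.
function extractText(blockList, blockType = 'WORD') {
  return (blockList || [])
    .filter((b) => b.BlockType === blockType && typeof b.Text === 'string')
    .map((b) => b.Text);
}

const words = extractText(blocks);
console.log(words); // [ 'Total', '2000.00' ]
```

With the real response, the call would presumably be `extractText(data.Blocks)`, or `extractText(data.Blocks, 'LINE')` to get whole lines instead of individual words.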