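If the page is rendered clearly enough, most text extractors should preserve that structure, but layout is a fickle thing and can branch into many different errors.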
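Here it correctly reproduced the "reaar" typo, but failed on the first row, the one dated 05.05.1983.
On a second pass over the same input, the failures are different: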
3 29.06.1983 Part of Ground Floor of 05.05.1983 GM315727
2 (part of) Conavon Court 25 years from
1.3.1983
4 31.01.1984 Part of Third Floor Conavon 30.12.1983 GM335793
4 (part of) Court 25 years from
12.8.1983
5 19.04.1984 I?art of Basement Floor of 23.01.1984 GM342693
l (part of), 2 Conavon C:ourt 25 years from
(part of), 3 20.01.1984
(part Of ) , 4
(part of)
NOTE: The Lease also grants a right of way for the purpose only of
loading and unloading and reserves a right of way in case of emergency
only from the boiler house adjacent hereto
6 14.06.1984 Part of Third Floor Conavon 31.10.1983 GM347623
3 (part of) Court 25 years from
31.10.1983
7 14.06.1984 Part of the Third Floor 31.10.1983 GM347623
3 (part: of}, 4 Conavon Court 25 years from
(part of) 31.10.1983
8 01.10.1984 "The Italian Stallion'' 17.08.1984 GM357142
4 (part of) Conavon Court (Basement) 25 years from
20.1.1984
NOTE: The Lease also grants a right of way for the purpose only of
loading and unloading and a right of access through the security door
at the reaar of the building
9 06.07.2016 3rd floor 14-16 Blackfriars 28.06.2016
4 (part of}, 5 Streec 5 years from
(part of) 25/06/2016
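That is the beauty of OCR: the per-character success rate can differ on every run, so experience suggests taking the best of three estimates. Run it 3 different ways and compare the results character by character, keeping whatever the runs agree on.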
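The character-by-character comparison could be sketched roughly as below. This is only a minimal illustration, not a finished tool: `vote_line` is a hypothetical helper name, the three readings are made-up variations on one line of the output above, and it assumes the runs produce the same number of characters for a line. When they do not, it falls back to picking the most common whole line rather than trying to align them.

```python
from collections import Counter


def vote_line(candidates):
    """Return the per-character majority across several OCR readings of one line.

    Hypothetical helper: assumes the readings line up position by position.
    """
    # If the readings differ in length the positions no longer correspond,
    # so fall back to a majority vote on the whole line instead.
    if len({len(c) for c in candidates}) != 1:
        return Counter(candidates).most_common(1)[0][0]

    voted = []
    for chars in zip(*candidates):
        # Most frequent character at this position; on a tie the
        # character from the earliest reading wins.
        voted.append(Counter(chars).most_common(1)[0][0])
    return "".join(voted)


# Three made-up readings of the same line, each wrong in a different place.
run_a = "4 (part of}, 5 Street 5 years from"
run_b = "4 (part of), 5 Streec 5 years from"
run_c = "4 (part of), 5 Street 5 years from"

print(vote_line([run_a, run_b, run_c]))
# prints: 4 (part of), 5 Street 5 years from
```

A vote like this only helps when the runs fail in different places. An error that all three runs share, or a genuine typo in the source such as "reaar", passes through unchanged.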