您可以从左到右和从上到下提取文本,使用--psm 6
它告诉 Pytesseract 假设一个统一的文本块。预处理也很重要,因此我们阈值以获得所需的黑色前景文本和白色背景的二进制图像。在此处查找其他 Pytesseract 配置选项。阈值化后,这是我们放入 Pytesseract 的图像
这是输出
Limit Balance
Sep 29, 2015 $17,750.0 Oct 01, 2018 $0.00 Oct 02, 2018
0
Account Condition: Paid account/zero Account #: Delinquency 30 Days = $0.00 | 60 Days =$0.00 90+ Days =$0.00 | Derog =00
balance 4636676005495602 Counter (Past
seven years)
Payment Status: This is an account in good Responsibility: Individual
standing
Account Type: Credit Card Account Term: REV
# Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
2016 0 0 0
2017 0 0 0 0 0 0 0 0 0 0 0 0
2018 0 0 0 0 0 0 0 0 0 B
> BMW FINANCIAL SERVICES /
2602980
Open Date Original Amount Credit Status Date Chargeoff Amount Past Due Last Paid Date Balance Date Current
Limit Balance
Sep 19, 2015 $27,189.00 Jul01, 2017 $0.00 Jul 21, 2017 Jul 24, 2017
Account Condition: Paid account/zero Account #: 4002206279 Delinquency 30 Days = $0.00 | 60 Days =$0.00 90+ Days =$0.00 | Derog =00
balance Counter (Past
seven years)
Payment Status: This is an account in good Responsibility: Individual
standing
Account Type: Auto Lease Account Term: 036
# Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
2015 Cc Cc Cc Cc
2016 Cc Cc Cc Cc Cc Cc Cc Cc Cc Cc Cc Cc
2017 Cc Cc Cc Cc Cc Cc B
> LEXUS FINANCIAL SERVIC /
1624210
Open Date Original Amount Credit Status Date Chargeoff Amount Past Due Last Paid Date Balance Date Current
Limit Balance
Mar 07, 2015 $40,342.00 Jul01, 2016 $0.00 Jul 05, 2016 Jul 31, 2016
Account Condition: Paid account/zero Account #: Delinquency 30 Days = $0.00 | 60 Days =$0.00 90+ Days =$0.00 | Derog =00
balance 70403662535410001 Counter (Past
seven years)
Payment Status: This is an account in good Responsibility: Individual
standing
Account Type: Auto Loan Account Term: 072
# Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
2014
2015 Cc Cc Cc Cc Cc Cc Cc Cc Cc Cc
2016 Cc Cc Cc Cc Cc Cc B
> AES/SUNTRUST BANK / 9997195
Open Date Original Amount Credit Status Date Chargeoff Amount Past Due Last Paid Date Balance Date Current
Limit Balance
Sep 19, 2008 $12,500.00 Apr 01, 2016 $0.00 Apr 21, 2016 Apr 30, 2016
Account Condition: Paid account/zero Account #: Delinquency 30 Days = $0.00 | 60 Days =$0.00 90+ Days =$0.00 | Derog =00
balance 5046237209PA00001 Counter (Past
seven years)
Payment Status: This is an account in good Responsibility: Signer
standing
Account Type: Education Loan Account Term: 300
# Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
2014 Cc Cc Cc Cc Cc Cc Cc Cc Cc
2015 Cc Cc Cc Cc Cc Cc Cc Cc Cc Cc Cc Cc
2016 Cc Cc Cc B
> BARCLAYS BANK DELAWARE /
1223850
Open Date Original Amount Credit Status Date Chargeoff Amount Past Due Last Paid Date Balance Date Current
Limit Balance
Apr 04, 2013 $3,500.00 Apr 01, 2016 $0.00 Oct 06, 2014 Apr 05, 2016
Account Condition: Paid account/zero Account #: 000176863399109 Delinquency 30 Days = $0.00 | 60 Days =$0.00 90+ Days =$0.00 | Derog =00
balance Counter (Past
seven years)
Payment Status: This is an account in good Responsibility: Individual
standing
Account Type: Credit Card Account Term: REV
# Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
2014 Cc Cc Cc Cc Cc Cc Cc Cc 0
2015 0 0 0 0 0 0 0 0 0 0 0 0
2016 0 0 0 B
> AMERICAN HONDA FINANCE /
1605190
import cv2
import pytesseract
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
image = cv2.imread('1.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
data = pytesseract.image_to_string(thresh, lang='eng',config='--psm 6')
print(data)