I am trying to read captchas generated by SimpleCaptcha:

[image: captcha]

I managed to remove the gradient and the colors:

import cv2 as cv
import numpy as np
import PyDIP as dip
import pytesseract

img = cv.imread('capt.jpg')
img = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
img = cv.adaptiveThreshold(img, 255, cv.ADAPTIVE_THRESH_GAUSSIAN_C,
                           cv.THRESH_BINARY, 11, 2)

[image: captcha, black and white]

However, I cannot remove the curved lines or fill in the letters.

I have tried the code from there and got this:

lines = dip.PathOpening(img, length=400, mode={'constrained'})
img = img-lines
img = 255 - img 

[image: without lines]

lines = np.array(lines)

[image: detected lines]

Using another approach, I get:

# image is the previous img

gray = cv.cvtColor(image,cv.COLOR_BGR2GRAY)
thresh = cv.threshold(gray, 0, 255, cv.THRESH_BINARY_INV + cv.THRESH_OTSU)[1]

horizontal_kernel = cv.getStructuringElement(cv.MORPH_RECT, (25,1))
detected_lines = cv.morphologyEx(thresh, cv.MORPH_OPEN, horizontal_kernel, iterations=2)
cnts = cv.findContours(detected_lines, cv.RETR_EXTERNAL, cv.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    cv.drawContours(image, [c], -1, (255,255,255), 2)

repair_kernel = cv.getStructuringElement(cv.MORPH_RECT, (1,6))
result = 255 - cv.morphologyEx(255 - image, cv.MORPH_CLOSE, repair_kernel, iterations=1)

# results:

[image: without lines]

# detected_lines:

[image: detected lines]

I then try to read the text:

captcha_val=pytesseract.image_to_string(img)
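A variant of that call I could also try, as a minimal sketch only: pass Tesseract a page segmentation mode for a single text line and, optionally, a character whitelist. The helper names `build_tesseract_config` and `read_captcha` are hypothetical, and the whitelist alphabet is an assumption about what the captcha contains, not something I have confirmed for SimpleCaptcha. Some Tesseract 4.0 builds reportedly ignore the whitelist with the LSTM engine, so this may not help on every install.

```python
def build_tesseract_config(psm=7, whitelist=None):
    """Build a Tesseract config string (hypothetical helper)."""
    # --psm 7 tells Tesseract to treat the image as a single text line.
    parts = [f"--psm {psm}"]
    if whitelist:
        # Restrict recognition to the given characters (assumed alphabet).
        parts.append(f"-c tessedit_char_whitelist={whitelist}")
    return " ".join(parts)


def read_captcha(img, whitelist=None):
    """Run OCR on a cleaned-up captcha image and return the stripped text."""
    # Imported lazily so the config helper works without Tesseract installed.
    import pytesseract

    config = build_tesseract_config(whitelist=whitelist)
    return pytesseract.image_to_string(img, config=config).strip()
```

With the `img` from above this would be called as `captcha_val = read_captcha(img, whitelist="abcdefghijklmnopqrstuvwxyz0123456789")`.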

This is where the captcha comes from. Any help?
