I am trying to read a captcha generated by SimpleCaptcha:
I managed to remove the gradient and the colors:
import cv2 as cv
import numpy as np
import PyDIP as dip
import pytesseract
img = cv.imread('capt.jpg')
img = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
img = cv.adaptiveThreshold(img, 255, cv.ADAPTIVE_THRESH_GAUSSIAN_C,
                           cv.THRESH_BINARY, 11, 2)
However, I cannot remove the curve or fill in the letters.
I have already tried the code from there and got this:
lines = dip.PathOpening(img, length=400, mode={'constrained'})
img = img-lines
img = 255 - img
lines = np.array(lines)
Using another method, I get:
# image is the previous img
gray = cv.cvtColor(image,cv.COLOR_BGR2GRAY)
thresh = cv.threshold(gray, 0, 255, cv.THRESH_BINARY_INV + cv.THRESH_OTSU)[1]
horizontal_kernel = cv.getStructuringElement(cv.MORPH_RECT, (25,1))
detected_lines = cv.morphologyEx(thresh, cv.MORPH_OPEN, horizontal_kernel, iterations=2)
cnts = cv.findContours(detected_lines, cv.RETR_EXTERNAL, cv.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    cv.drawContours(image, [c], -1, (255,255,255), 2)
repair_kernel = cv.getStructuringElement(cv.MORPH_RECT, (1,6))
result = 255 - cv.morphologyEx(255 - image, cv.MORPH_CLOSE, repair_kernel, iterations=1)
# results:
# detected_lines:
I am trying to read the text with:
captcha_val=pytesseract.image_to_string(img)
This is where the captcha comes from. Any help would be appreciated.