ocr - How can I use pytesseract to recognize text of unknown orientation?

Question

I have an image that looks like this:

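I want to detect and recognize the text in this image with pytesseract, but the latest pytesseract 0.3.8 gives me empty output for it. I guess this is because the national ID card in the image is at an angle, which gives us non-horizontal text. Is there any way to rotate and crop the national ID card out of this image using pytesseract? Or can pytesseract automatically recognize curved or arbitrarily oriented text in an image? I tried the code discussed in this post: How to enhance Tesseract automatic text rotation capability for OCR?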

Here is the code I tried:

import pytesseract
import cv2
from PIL import Image
import matplotlib.pyplot as plt
import numpy as np
import re
import math
#from skimage.transform import rotate

# Rotate the image by `angle` degrees (positive = counter-clockwise in OpenCV),
# enlarging the canvas so that no corner is cut off.
def rotate(image: np.ndarray, angle, background_color):
    old_height, old_width = image.shape[:2]  # shape is (rows, cols)
    angle_radian = math.radians(angle)
    new_width = abs(np.sin(angle_radian) * old_height) + abs(np.cos(angle_radian) * old_width)
    new_height = abs(np.sin(angle_radian) * old_width) + abs(np.cos(angle_radian) * old_height)
    image_center = tuple(np.array(image.shape[1::-1]) / 2)
    rot_mat = cv2.getRotationMatrix2D(image_center, angle, 1.0)
    # shift the result so that it is centered in the enlarged canvas
    rot_mat[0, 2] += (new_width - old_width) / 2
    rot_mat[1, 2] += (new_height - old_height) / 2
    return cv2.warpAffine(image, rot_mat, (int(round(new_width)), int(round(new_height))), borderValue=background_color)

image = cv2.imread('tests/t3.png')
while True:
    osd_rotated_image = pytesseract.image_to_osd(image)

    # use a regex to extract the rotation angle (as a string) from the OSD report
    angle_rotated_image = re.search(r'(?<=Rotate: )\d+', osd_rotated_image).group(0)
    print(angle_rotated_image)
    if (angle_rotated_image == '0'):
        plt.imshow(image)
        # break the loop once we get the correctly deskewed image
        break
    elif (angle_rotated_image == '90'):
        image = rotate(image,90,(255,255,255)) # rotate(image,angle,background_color)
        continue
    elif (angle_rotated_image == '180'):
        image = rotate(image,180,(255,255,255))
        continue
    elif (angle_rotated_image == '270'):
        image = rotate(image,90,(255,255,255))
        continue
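
For context, the loop above reads the `Rotate:` field from the plain-text report that `pytesseract.image_to_osd` returns. Tesseract's orientation detection reports page orientation only in 90-degree steps (0, 90, 180, 270), so it has no value to offer for a card tilted by some other angle inside the page. The sketch below parses such a report with the standard library only. `parse_osd` and `rotation_angle` are hypothetical helper names, and the numbers in `SAMPLE_OSD` are made up for illustration rather than taken from the image in question.

```python
import re

# Illustrative example of the text returned by pytesseract.image_to_osd(image).
# The values are invented for this sketch.
SAMPLE_OSD = """Page number: 0
Orientation in degrees: 270
Rotate: 90
Orientation confidence: 21.27
Script: Latin
Script confidence: 4.14
"""


def parse_osd(osd_text: str) -> dict:
    """Turn the 'Key: value' lines of an OSD report into a dict of strings."""
    fields = {}
    for line in osd_text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that contain no colon
            fields[key.strip()] = value.strip()
    return fields


def rotation_angle(osd_text: str) -> int:
    """Return the 'Rotate' value in degrees; raise ValueError if it is missing."""
    match = re.search(r"^Rotate:\s*(\d+)", osd_text, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no 'Rotate' field in OSD output")
    return int(match.group(1))
```

Raising an explicit error when the field is missing avoids the `AttributeError` that `re.search(...).group(0)` produces when the pattern does not match.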

My code rotates the whole image rather than the NID card inside it, so the incorrect output looks like this:

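I want to recognize all the English text on the NID card. If that is not possible, then at least I want to reliably recognize the NID number in an image of any unknown orientation using pytesseract. I know that paddleocr and easyocr can handle images like this, but I want to know whether pytesseract text recognition can be made to work on such images. If so, how can I do it? Can I also recognize all the words in this image, for example Bangla, English, and English numerals, using pytesseract? Thanks.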
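
One possible way to approach the "rotate and crop the card" step is to do it with OpenCV before calling pytesseract, since pytesseract is a wrapper around the Tesseract engine and does not itself crop or deskew regions. The following is a minimal sketch, not a tested solution for the image in question. It assumes that the card is brighter than its background and is the largest bright region in the picture. `order_corners`, `crop_card`, and the `min_area_ratio` parameter are hypothetical names introduced here for illustration.

```python
import numpy as np


def order_corners(pts) -> np.ndarray:
    """Order 4 (x, y) points as top-left, top-right, bottom-right, bottom-left."""
    pts = np.asarray(pts, dtype=np.float32)
    s = pts.sum(axis=1)               # x + y: smallest at top-left, largest at bottom-right
    d = np.diff(pts, axis=1).ravel()  # y - x: smallest at top-right, largest at bottom-left
    return np.array(
        [pts[np.argmin(s)], pts[np.argmin(d)], pts[np.argmax(s)], pts[np.argmax(d)]],
        dtype=np.float32,
    )


def crop_card(image: np.ndarray, min_area_ratio: float = 0.05):
    """Return an axis-aligned crop of the largest bright region, or None.

    Assumption: `image` is a BGR array (as returned by cv2.imread) and the
    card is brighter than the background.
    """
    import cv2  # imported here so that order_corners can be used without OpenCV

    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    blur = cv2.GaussianBlur(gray, (5, 5), 0)
    _, mask = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    # [-2] picks the contour list in both OpenCV 3 and OpenCV 4.
    contours = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
    if not contours:
        return None
    largest = max(contours, key=cv2.contourArea)
    if cv2.contourArea(largest) < min_area_ratio * image.shape[0] * image.shape[1]:
        return None  # nothing large enough to be the card

    # Fit a rotated rectangle around the region and get its 4 corners.
    tl, tr, br, bl = order_corners(cv2.boxPoints(cv2.minAreaRect(largest)))
    width = int(round(max(np.linalg.norm(tr - tl), np.linalg.norm(br - bl))))
    height = int(round(max(np.linalg.norm(bl - tl), np.linalg.norm(br - tr))))

    # Map the rotated rectangle onto an upright width x height canvas.
    src = np.array([tl, tr, br, bl], dtype=np.float32)
    dst = np.array(
        [[0, 0], [width - 1, 0], [width - 1, height - 1], [0, height - 1]],
        dtype=np.float32,
    )
    matrix = cv2.getPerspectiveTransform(src, dst)
    return cv2.warpPerspective(image, matrix, (width, height))
```

The cropped result could then be passed to `pytesseract.image_to_string`. Because the corner ordering is relative to the image, the card may still come out sideways or upside down, which is the 90-degree case that the `image_to_osd` loop is suited to. Whether Otsu thresholding isolates the card at all depends on the photo, so the thresholding step may need to be swapped for edge detection or another segmentation method.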
