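I want to use OpenCV to split an image into individual characters in order to train a Tesseract model.

I am using version 3.1.0 (because of a MacPorts upgrade, meh...), and the Python documentation is still not very clear or complete.

Here is what I do:

  1. Binarize the image so it is ready for contour detection
  2. Find the contours (this works; at least I get a non-empty result)
  3. For each contour:

    1. Create a mask (this is probably where it fails; the mask is all zeros)
    2. Use the mask to extract that part of the original image (this should work, but since the mask fails it does not yet)

The new version of OpenCV also has somewhat different syntax, which sometimes makes this even trickier.

Here is my code: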

def characterSplit(img):
    """
    Splits the characters in an image using contours, ready to be labelled and saved for training with Tesseract
    """

    # Apply Thresholding to binarize Image
    img = cv2.GaussianBlur(img, (3,3), 0)
    img = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 75, 10)
    img = cv2.bitwise_not(img)

    # Find Contours
    contours = cv2.findContours(img, cv2.RETR_EXTERNAL , cv2.CHAIN_APPROX_TC89_KCOS, offset=(0,0))[1]

    # Iterate through the contours
    for c in xrange(len(contours)):
        mask = numpy.zeros(img.size)

        cv2.drawContours(mask, contours, c, (0, 255, 0), cv2.FILLED)    # Mask is zeros - It might fail here!

        # Where the result will be stored
        res = numpy.zeros(mask.size)

        # Make a Boolean-type numpy array for the Mask
        amsk = mask != 0

        # I use this to copy a part of the image using the generated mask.
        # The result is zeros because the mask is also zeros
        numpy.copyto(res, img.flatten(), where = amsk)

        ## (... Reshape, crop and save the result ...)

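As far as I know, the mask should be the same size as the original image. But should it also have the same shape? For example, my image is 640x74, but the way I create the mask matrix makes it 1x47360. Maybe that is why it fails... (although no error is thrown)

Any help is appreciated!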

1 Answer

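I ended up doing what Miki suggested in the comments: I used cv::connectedComponents to do the character splitting. Here is the corresponding code, for anyone who is interested: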

import os

import cv2
import numpy


def characterSplit(img, outputFolder=''):
    # Splits the image (an 8-bit single-channel OpenCV/numpy array) into distinct
    # characters and writes each one as an image file into the specified folder.

    # Blurring the image with a Gaussian before thresholding is important
    img = cv2.GaussianBlur(img, (3,3), 0)
    img = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 75, 10)
    img = cv2.bitwise_not(img)

    # Pass connectivity and label type by keyword; ltype must be CV_32S or CV_16U
    output = cv2.connectedComponentsWithStats(img, connectivity=8, ltype=cv2.CV_32S)
    n_labels = output[0]
    labels = output[1]
    stats = output[2]

    # Label 0 is the background, so the characters start at label 1
    for c in range(1, n_labels):
        # Mask is a boolean-type numpy array with True in the corresponding region
        mask = labels == c

        # Same shape and dtype as img, so it can be written out directly
        res = numpy.zeros_like(img)
        numpy.copyto(res, img, where=mask)

        # The rectangle that bounds the region is stored in:
        # stats[c][0:4] -> [x, y, w, h]

        cv2.imwrite(os.path.join(outputFolder, "region_{}.jpg".format(c)), res)
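
The code above writes each region at full image size. If you want each character cropped to its bounding box instead, the [x, y, w, h] values in stats can be used for slicing. This is only a minimal sketch: crop_components is a hypothetical helper name, and it assumes img, labels and stats are the thresholded image and the outputs of cv2.connectedComponentsWithStats as in the function above.

```python
import numpy


def crop_components(img, labels, stats):
    """Return one cropped image per component, skipping label 0 (background).

    Hypothetical helper: `img` is assumed to be the thresholded image, and
    `labels` / `stats` the label map and stats array returned by
    cv2.connectedComponentsWithStats for that image.
    """
    crops = []
    for c in range(1, stats.shape[0]):
        # Columns 0..3 of stats are left, top, width, height.
        x, y, w, h = stats[c][0:4]

        # Slice the bounding box out of the image and out of the label map.
        window = img[y:y + h, x:x + w]
        inside = labels[y:y + h, x:x + w] == c

        # Keep only this component's pixels; neighbours that overlap the
        # bounding box are blanked out.
        crops.append(numpy.where(inside, window, 0).astype(img.dtype))
    return crops
```

Each element of the returned list could then be passed to cv2.imwrite in place of res.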

Hope this helps!

Answered 2016-12-01T18:28:14.840