
What is the best way to detect the corners of an invoice/receipt/sheet-of-paper in a photo? This is to be used for subsequent perspective correction, before OCR.

My current approach has been:

RGB > Gray > Canny Edge Detection with thresholding > Dilate(1) > Remove small objects(6) > clear border objects > pick largest blob based on Convex Area > [corner detection - not implemented]

I can't help but think there must be a more robust 'intelligent'/statistical approach to handle this type of segmentation. I don't have a lot of training examples, but I could probably get 100 images together.

Broader context:

I'm using MATLAB to prototype, and planning to implement the system in OpenCV and Tesseract-OCR. This is the first of a number of image processing problems I need to solve for this specific application, so I'm looking to roll my own solution and re-familiarize myself with image processing algorithms.

Here are some sample images that I'd like the algorithm to handle. If you'd like to take up the challenge, the large images are at http://madteckhead.com/tmp

case 1
(source: madteckhead.com)

case 2
(source: madteckhead.com)

case 3
(source: madteckhead.com)

case 4
(source: madteckhead.com)

In the best case this gives:

case 1 - canny
(source: madteckhead.com)

case 1 - post canny
(source: madteckhead.com)

case 1 - largest blob
(source: madteckhead.com)

However it fails easily on other cases:

case 2 - canny
(source: madteckhead.com)

case 2 - post canny
(source: madteckhead.com)

case 2 - largest blob
(source: madteckhead.com)

Thanks in advance for all the great ideas! I love SO!

EDIT: Hough Transform Progress

Q: What algorithm would cluster the Hough lines to find corners? Following advice from the answers, I was able to use the Hough transform, pick lines, and filter them. My current approach is rather crude: I've assumed the invoice will always be less than 15° out of alignment with the image. I end up with reasonable lines when this holds (see below), but I'm not entirely sure of a suitable algorithm to cluster the lines (or vote) to extrapolate the corners. The Hough lines are not continuous, and in the noisy images there can be parallel lines, so some form of distance-from-line-origin metric is required. Any ideas?

case 1 case 2 case 3 case 4
(source: madteckhead.com)
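One workable scheme, sketched below, assuming the lines come from cv2.HoughLines in (rho, theta) form: greedily merge lines whose (rho, theta) fall within a tolerance of an existing cluster's mean, then intersect every pair of merged lines whose angles differ enough to be non-parallel. The tolerances and the sample lines here are illustrative, not tuned values.

```python
import numpy as np

def cluster_lines(lines, rho_tol=20.0, theta_tol=np.radians(10)):
    """Greedily merge (rho, theta) lines: a line joins the first cluster
    whose running mean it is within tolerance of, else starts a new one."""
    clusters = []
    for rho, theta in lines:
        for c in clusters:
            m_rho, m_theta = np.mean(c, axis=0)
            if abs(rho - m_rho) < rho_tol and abs(theta - m_theta) < theta_tol:
                c.append((rho, theta))
                break
        else:
            clusters.append([(rho, theta)])
    return [tuple(np.mean(c, axis=0)) for c in clusters]

def intersect(l1, l2):
    """Intersection of two lines given in polar form rho = x*cos(t) + y*sin(t)."""
    (r1, t1), (r2, t2) = l1, l2
    A = np.array([[np.cos(t1), np.sin(t1)],
                  [np.cos(t2), np.sin(t2)]])
    if abs(np.linalg.det(A)) < 1e-9:          # near-parallel: no stable corner
        return None
    return tuple(np.linalg.solve(A, np.array([r1, r2])))

# illustrative detections for an axis-aligned sheet: two near-duplicate
# left edges, a right edge, and top/bottom edges
lines = [(10.0, 0.0), (12.0, 0.02), (110.0, 0.0),
         (20.0, np.pi / 2), (220.0, np.pi / 2)]
merged = cluster_lines(lines)
# only intersect pairs whose angles differ enough to form a corner
corners = [intersect(a, b)
           for i, a in enumerate(merged) for b in merged[i + 1:]
           if abs(a[1] - b[1]) > np.radians(45)]
```

Note the sketch ignores the theta wrap-around at pi, which a real implementation must handle: a near-vertical line can come back as theta ≈ 0 or theta ≈ pi with negated rho.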


8 Answers


I'm a friend of Martin's, who was working on this earlier this year. It was my first ever coding project and ended in a bit of a rush, so the code needs some err... decoding... I'll give a few tips based on what I've seen you doing already, and then sort out my code on my day off tomorrow.

First tip: OpenCV and Python are awesome, move to them as soon as possible. :D

Instead of removing small objects and/or noise, lower the Canny constraints so it accepts more edges, then find the largest closed contour (in OpenCV use findcontour() with some simple parameters; I think I used CV_RETR_LIST). It might still struggle on a white sheet of paper, but it was definitely giving the best results.

For the Houghline2() transform, try CV_HOUGH_STANDARD as opposed to CV_HOUGH_PROBABILISTIC: it gives rho and theta values, defining each line in polar coordinates, so you can then group the lines within a certain tolerance.

My grouping worked as a lookup table: each line output by the Hough transform gives a rho and theta pair. If those values were within, say, 5% of a pair already in the table, the line was discarded; if they were outside that 5%, a new entry was added to the table.

You can then analyse parallel lines, or the distance between lines, much more easily.
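A minimal Python sketch of that lookup-table grouping (the 5% figure and the relative tolerance are taken from the description above; an absolute tolerance on theta may behave better for values near 0):

```python
def group_lines(hough_lines, tol=0.05):
    """Lookup-table grouping: a (rho, theta) pair is discarded when both
    values fall within `tol` (as a fraction) of an entry already in the
    table; otherwise it becomes a new table entry."""
    table = []
    for rho, theta in hough_lines:
        close = any(abs(rho - r) <= tol * abs(r) and
                    abs(theta - t) <= tol * abs(t)
                    for r, t in table)
        if not close:
            table.append((rho, theta))
    return table

# two near-duplicate detections of one edge, plus two distinct edges
groups = group_lines([(100.0, 1.00), (103.0, 1.02), (200.0, 1.00), (100.0, 2.00)])
```

Here the second line falls within 5% of the first in both rho and theta and is dropped, leaving three table entries.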

Hope this helps.

answered 2011-07-10T22:47:23.550

Here's what I came up with after a bit of experimentation:

import cv2, numpy as np
import sys

def get_new(old):
    # fresh white canvas with the same shape as the input
    return np.full(old.shape, 255, np.uint8)

if __name__ == '__main__':
    orig = cv2.imread(sys.argv[1])

    # these constants are carefully picked
    MORPH = 9
    CANNY = 84
    HOUGH = 25

    img = cv2.cvtColor(orig, cv2.COLOR_BGR2GRAY)
    cv2.GaussianBlur(img, (3,3), 0, img)


    # this is to recognize white on white
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT,(MORPH,MORPH))
    dilated = cv2.dilate(img, kernel)

    edges = cv2.Canny(dilated, 0, CANNY, apertureSize=3)

    lines = cv2.HoughLinesP(edges, 1, np.pi/180, HOUGH)
    for x1, y1, x2, y2 in lines.reshape(-1, 4):
        cv2.line(edges, (x1, y1), (x2, y2), (255, 0, 0), 2, 8)

    # finding contours
    # finding contours; [-2] keeps this working across OpenCV 3.x/4.x
    contours = cv2.findContours(edges.copy(), cv2.RETR_EXTERNAL,
                                cv2.CHAIN_APPROX_TC89_KCOS)[-2]
    contours = [c for c in contours if cv2.arcLength(c, False) > 100]
    contours = [c for c in contours if cv2.contourArea(c) > 10000]

    # simplify contours down to polygons
    rects = []
    for cont in contours:
        rect = cv2.approxPolyDP(cont, 40, True).copy().reshape(-1, 2)
        rects.append(rect)

    # that's basically it
    cv2.drawContours(orig, rects,-1,(0,255,0),1)

    # show only contours
    new = get_new(img)
    cv2.drawContours(new, rects,-1,(0,255,0),1)
    cv2.GaussianBlur(new, (9,9), 0, new)
    new = cv2.Canny(new, 0, CANNY, apertureSize=3)

    cv2.namedWindow('result', cv2.WINDOW_NORMAL)
    cv2.imshow('result', orig)
    cv2.waitKey(0)
    cv2.imshow('result', dilated)
    cv2.waitKey(0)
    cv2.imshow('result', edges)
    cv2.waitKey(0)
    cv2.imshow('result', new)
    cv2.waitKey(0)

    cv2.destroyAllWindows()

It's not perfect, but it at least works on all the samples:

1 2 3 4

answered 2013-09-20T02:42:09.933

A student group at my university recently demoed an iPhone app (and a Python OpenCV app) that they'd written to do exactly this. As I remember, the steps were something like:

  • A median filter to completely remove the text on the paper (this was handwritten text on white paper with fairly good lighting, and may not work with printed text; it worked very well). The reason is that it makes the corner detection much easier.
  • A Hough transform for lines
  • Find the peaks in the Hough transform accumulator space and draw each line across the entire image.
  • Analyse the lines and remove any that are very close to each other and at a similar angle (cluster the lines into one). This is necessary because the Hough transform isn't perfect, as it works in a discrete sample space.
  • Find pairs of lines that are roughly parallel and that intersect other pairs, to see which lines form quadrilaterals.
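The first step, wiping out the writing with a median filter, can be illustrated with a naive pure-NumPy sketch (in practice cv2.medianBlur does the same job far faster); the "page" here is a synthetic stand-in with single-pixel specks playing the role of pen strokes:

```python
import numpy as np

def median_filter(img, k=3):
    """Naive k x k median filter (edge pixels are left unchanged)."""
    out = img.copy()
    r = k // 2
    for y in range(r, img.shape[0] - r):
        for x in range(r, img.shape[1] - r):
            out[y, x] = np.median(img[y - r:y + r + 1, x - r:x + r + 1])
    return out

# white "page" with isolated dark specks standing in for text strokes
page = np.full((40, 40), 255, np.uint8)
page[10, 10] = page[20, 25] = page[30, 5] = 0
cleaned = median_filter(page, 3)   # specks vanish: every 3x3 median is white
```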

This seemed to work fairly well, and they were able to take a photo of a piece of paper or book, perform the corner detection, and then map the document in the image onto a flat plane in almost real time (it was a single OpenCV function doing the mapping). There was no OCR when I saw it working.

answered 2011-07-02T08:03:24.057

Instead of starting from edge detection, you could use corner detection.

The Marvin Framework provides an implementation of the Moravec algorithm for this purpose. You could find the corners of the paper as a starting point. Below is the output of the Moravec algorithm:

(image: Moravec algorithm output)
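Moravec isn't built into OpenCV, but the response it computes is simple; here is a pure-NumPy sketch (the window size and the synthetic test image are illustrative). For each pixel it takes the minimum sum-of-squared-differences between the local window and the same window shifted by one pixel in each of the 8 directions, so edges (which look unchanged under a shift along the edge) score 0 while corners score high:

```python
import numpy as np

def moravec(img, window=3):
    """Minimal Moravec corner response: per pixel, the minimum SSD over
    the 8 one-pixel shifts of the surrounding window."""
    img = img.astype(np.float64)
    h, w = img.shape
    r = window // 2
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
              (0, 1), (1, -1), (1, 0), (1, 1)]
    resp = np.zeros((h, w))
    for y in range(r + 1, h - r - 1):
        for x in range(r + 1, w - r - 1):
            win = img[y - r:y + r + 1, x - r:x + r + 1]
            ssds = [np.sum((img[y + dy - r:y + dy + r + 1,
                                x + dx - r:x + dx + r + 1] - win) ** 2)
                    for dy, dx in shifts]
            resp[y, x] = min(ssds)
    return resp

sheet = np.zeros((20, 20))
sheet[5:15, 5:15] = 255.0          # bright "sheet of paper" on dark ground
resp = moravec(sheet)              # high only at the sheet's four corners
```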

answered 2013-05-25T15:42:55.170

You can also use MSER (Maximally Stable Extremal Regions) on the result of a Sobel operator to find stable regions of the image. For each region returned by MSER you can apply convex hull and polygon approximation to obtain results like this:

However, this kind of detection is more useful for live detection than for a single picture, and it doesn't always return the best result.

(image: results)

answered 2015-08-17T10:36:57.837

After edge detection, use a Hough transform. Then put those points, with their labels, into an SVM (support vector machine): if the examples have smooth lines, the SVM will have no difficulty separating the necessary parts of the example from the rest. My advice for the SVM would be to use features such as connectivity and length. That is, if points are connected and long, they are likely to be an edge of the receipt. Then you can eliminate all the other points.
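A toy sketch of that idea with scikit-learn; the feature values, the two features themselves (segment length and a connectivity count), and the choice of a linear kernel are all made up for illustration — in a real system these would be measured from the Hough segments:

```python
import numpy as np
from sklearn.svm import SVC

# hypothetical per-segment features: (length in px, number of connected segments)
X = np.array([[300, 4], [280, 3], [320, 5],   # long, connected: likely sheet edges
              [20, 0], [35, 1], [15, 0]],     # short, isolated: text strokes / noise
             dtype=float)
y = np.array([1, 1, 1, 0, 0, 0])              # 1 = edge of the receipt

clf = SVC(kernel='linear').fit(X, y)
pred = clf.predict(np.array([[290.0, 4.0], [25.0, 0.0]]))
```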

answered 2011-07-02T22:17:31.147

Here is @Vanuan's code in C++:

cv::cvtColor(mat, mat, CV_BGR2GRAY);
cv::GaussianBlur(mat, mat, cv::Size(3,3), 0);
cv::Mat kernel = cv::getStructuringElement(cv::MORPH_RECT, cv::Size(9,9));
cv::Mat dilated;
cv::dilate(mat, dilated, kernel);

cv::Mat edges;
cv::Canny(dilated, edges, 0, 84, 3);

std::vector<cv::Vec4i> lines;
lines.clear();
cv::HoughLinesP(edges, lines, 1, CV_PI/180, 25);
std::vector<cv::Vec4i>::iterator it = lines.begin();
for(; it!=lines.end(); ++it) {
    cv::Vec4i l = *it;
    cv::line(edges, cv::Point(l[0], l[1]), cv::Point(l[2], l[3]), cv::Scalar(255,0,0), 2, 8);
}
std::vector< std::vector<cv::Point> > contours;
cv::findContours(edges, contours, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_TC89_KCOS);
std::vector< std::vector<cv::Point> > contoursCleaned;
for (int i=0; i < contours.size(); i++) {
    if (cv::arcLength(contours[i], false) > 100)
        contoursCleaned.push_back(contours[i]);
}
std::vector<std::vector<cv::Point> > contoursArea;

for (int i=0; i < contoursCleaned.size(); i++) {
    if (cv::contourArea(contoursCleaned[i]) > 10000){
        contoursArea.push_back(contoursCleaned[i]);
    }
}
std::vector<std::vector<cv::Point> > contoursDraw (contoursArea.size());
for (int i=0; i < contoursArea.size(); i++){
    cv::approxPolyDP(cv::Mat(contoursArea[i]), contoursDraw[i], 40, true);
}
cv::Mat drawing = cv::Mat::zeros( mat.size(), CV_8UC3 );
cv::drawContours(drawing, contoursDraw, -1, cv::Scalar(0,255,0),1);
answered 2013-09-29T21:10:29.237
  1. Convert to Lab color space

  2. Segment into 2 clusters with k-means

  3. Then use contours or Hough lines on one of the clusters (the interior one)
answered 2014-10-29T06:37:17.447