我在这里问了一个类似的问题,但更多地集中在 tesseract 上。
我有一个示例图像,如下所示。我想将白色正方形设为我的感兴趣区域,然后裁剪掉该部分(正方形)并用它创建一个新图像。我将使用不同的图像,因此正方形不会在所有图像中始终位于同一位置。所以我需要以某种方式检测正方形的边缘。
我可以执行哪些预处理方法来获得结果?
使用您的测试图像,我能够通过简单的腐蚀操作消除所有噪音。
之后,寻找角像素的简单迭代是微不足道的,我在Mat
这个答案中谈到了这一点。出于测试目的,我们可以在这些点之间绘制绿线以显示原始图像中我们感兴趣的区域:
最后,我在原始图像中设置了 ROI 并裁剪了该部分。
最终结果如下图所示:
我编写了一个示例代码,使用OpenCV的C++ 接口执行此任务。我对您将这段代码翻译成 Python 的技能充满信心。如果您做不到,请忘记代码并坚持我在此答案中分享的路线图。
#include <cv.h>
#include <highgui.h>
int main(int argc, char* argv[])
{
cv::Mat img = cv::imread(argv[1]);
std::cout << "Original image size: " << img.size() << std::endl;
// Convert RGB Mat to GRAY
cv::Mat gray;
cv::cvtColor(img, gray, CV_BGR2GRAY);
std::cout << "Gray image size: " << gray.size() << std::endl;
// Erode image to remove unwanted noises
int erosion_size = 5;
cv::Mat element = cv::getStructuringElement(cv::MORPH_CROSS,
cv::Size(2 * erosion_size + 1, 2 * erosion_size + 1),
cv::Point(erosion_size, erosion_size) );
cv::erode(gray, gray, element);
// Scan the image searching for points and store them in a vector
std::vector<cv::Point> points;
cv::Mat_<uchar>::iterator it = gray.begin<uchar>();
cv::Mat_<uchar>::iterator end = gray.end<uchar>();
for (; it != end; it++)
{
if (*it)
points.push_back(it.pos());
}
// From the points, figure out the size of the ROI
int left, right, top, bottom;
for (int i = 0; i < points.size(); i++)
{
if (i == 0) // initialize corner values
{
left = right = points[i].x;
top = bottom = points[i].y;
}
if (points[i].x < left)
left = points[i].x;
if (points[i].x > right)
right = points[i].x;
if (points[i].y < top)
top = points[i].y;
if (points[i].y > bottom)
bottom = points[i].y;
}
std::vector<cv::Point> box_points;
box_points.push_back(cv::Point(left, top));
box_points.push_back(cv::Point(left, bottom));
box_points.push_back(cv::Point(right, bottom));
box_points.push_back(cv::Point(right, top));
// Compute minimal bounding box for the ROI
// Note: for some unknown reason, width/height of the box are switched.
cv::RotatedRect box = cv::minAreaRect(cv::Mat(box_points));
std::cout << "box w:" << box.size.width << " h:" << box.size.height << std::endl;
// Draw bounding box in the original image (debugging purposes)
//cv::Point2f vertices[4];
//box.points(vertices);
//for (int i = 0; i < 4; ++i)
//{
// cv::line(img, vertices[i], vertices[(i + 1) % 4], cv::Scalar(0, 255, 0), 1, CV_AA);
//}
//cv::imshow("Original", img);
//cv::waitKey(0);
// Set the ROI to the area defined by the box
// Note: because the width/height of the box are switched,
// they were switched manually in the code below:
cv::Rect roi;
roi.x = box.center.x - (box.size.height / 2);
roi.y = box.center.y - (box.size.width / 2);
roi.width = box.size.height;
roi.height = box.size.width;
std::cout << "roi @ " << roi.x << "," << roi.y << " " << roi.width << "x" << roi.height << std::endl;
// Crop the original image to the defined ROI
cv::Mat crop = img(roi);
// Display cropped ROI
cv::imshow("Cropped ROI", crop);
cv::waitKey(0);
return 0;
}
看到文本是唯一的大斑点,其他一切都只比像素大一点,一个简单的形态开口就足够了
之后,白色矩形应该是图像中唯一剩下的东西。您可以使用 opencvs findcontours、用于 opencv 的 CvBlobs 库或使用 imagemagick -crop 函数找到它
这是您的图像,其中应用了 2 个腐蚀步骤,然后应用了 2 个膨胀步骤: 您可以简单地将这个图像插入到 opencv findContours 函数中,如Squares 教程示例中所示以获取位置
#objective:
#1)compress large images to less than 1000x1000
#2)identify region of interests
#3)save rois in top to bottom order
import cv2
import os
def get_contour_precedence(contour, cols):
tolerance_factor = 10
origin = cv2.boundingRect(contour)
return ((origin[1] // tolerance_factor) * tolerance_factor) * cols + origin[0]
# Load image, grayscale, Gaussian blur, adaptive threshold
image = cv2.imread('./images/sample_0.jpg')
#compress the image if image size is >than 1000x1000
height, width, color = image.shape #unpacking tuple (height, width, colour) returned by image.shape
while(width > 1000):
height = height/2
width = width/2
print(int(height), int(width))
height = int(height)
width = int(width)
image = cv2.resize(image, (width, height))
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (9,9), 0)
thresh = cv2.adaptiveThreshold(gray,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY_INV,11,30)
# Dilate to combine adjacent text contours
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (9,9))
ret,thresh3 = cv2.threshold(image,127,255,cv2.THRESH_BINARY_INV)
dilate = cv2.dilate(thresh, kernel, iterations=4)
# Find contours, highlight text areas, and extract ROIs
cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
#cnts = cv2.findContours(thresh3, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
#ORDER CONTOURS top to bottom
cnts.sort(key=lambda x:get_contour_precedence(x, image.shape[1]))
#delete previous roi images in folder roi to avoid
dir = './roi/'
for f in os.listdir(dir):
os.remove(os.path.join(dir, f))
ROI_number = 0
for c in cnts:
area = cv2.contourArea(c)
if area > 10000:
x,y,w,h = cv2.boundingRect(c)
#cv2.rectangle(image, (x, y), (x + w, y + h), (36,255,12), 3)
cv2.rectangle(image, (x, y), (x + w, y + h), (100,100,100), 1)
#use below code to write roi when results are good
ROI = image[y:y+h, x:x+w]
cv2.imwrite('roi/ROI_{}.jpg'.format(ROI_number), ROI)
ROI_number += 1
cv2.imshow('thresh', thresh)
cv2.imshow('dilate', dilate)
cv2.imshow('image', image)
cv2.waitKey()