ios - iOS UIImage Binarization for OCR - handling images with varying luminance

Question

I had a C++ binarization routine that I used as a preprocessing step for OCR. However, I found that it slanted the text unnecessarily. While searching for alternatives, I found GPUImage to be of great value, and it solved the slanting issue.

I am using GPUImage code like this to binarize my input images before applying OCR.

However, a single threshold value does not cover the range of images I get. Here are two samples from my input images:

[image]

I can't handle both with the same threshold value. A low value seems fine for the latter, and a higher value is fine for the first one.

The second image seems to be especially complex, because I never get all the characters binarized correctly, regardless of the threshold value I set. My C++ binarization routine, on the other hand, seems to get it right, but I don't have much insight into it to experiment with, unlike the simple threshold value in GPUImage.

How should I handle that?

UPDATE:

I tried GPUImageAverageLuminanceThresholdFilter with the default multiplier of 1. It works fine with the first image, but the second image continues to be a problem.
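To see why a single threshold derived from the average luminance can fail on unevenly lit input, here is a minimal pure-Swift sketch. It is not GPUImage code: `averageLuminanceBinarize` is a hypothetical helper, the pixel values are made up, and it assumes the threshold is the mean luminance times the multiplier.

```swift
// Hypothetical helper (not part of GPUImage): binarize a grayscale buffer
// with one global threshold derived from the mean luminance.
// Assumption: threshold = mean luminance * multiplier.
func averageLuminanceBinarize(_ pixels: [Double], multiplier: Double = 1.0) -> [Bool] {
    guard !pixels.isEmpty else { return [] }
    let mean = pixels.reduce(0, +) / Double(pixels.count)
    let threshold = mean * multiplier
    return pixels.map { $0 >= threshold }  // true = white, false = black
}

// Made-up row: a bright half (paper 0.9, ink 0.6) next to a dim half (paper 0.4, ink 0.1).
let row = [0.9, 0.6, 0.9, 0.4, 0.1, 0.4]
print(averageLuminanceBinarize(row))
// prints [true, true, true, false, false, false]
```

The mean is 0.55, so the ink on the bright side (0.6) turns white and the paper on the dim side (0.4) turns black. No single multiplier separates ink from paper in both halves, which matches the behaviour described for the two sample images.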

Some more diverse inputs for binarization:

[image]

UPDATE II:

After going through this answer by Brad, I tried GPUImageAdaptiveThresholdFilter (also incorporating GPUImagePicture, because earlier I was only applying it to a UIImage).

With this, the second image was binarized perfectly. However, the first one has a lot of noise after binarization when I set the blur size to 3.0, and OCR then adds extra characters. With a lower blur size, the second image loses precision.

Here it is:

+ (UIImage *)binarize:(UIImage *)sourceImage
{
    UIImage *grayScaledImg = [self toGrayscale:sourceImage];
    GPUImagePicture *imageSource = [[GPUImagePicture alloc] initWithImage:grayScaledImg];
    GPUImageAdaptiveThresholdFilter *stillImageFilter = [[GPUImageAdaptiveThresholdFilter alloc] init];
    stillImageFilter.blurSize = 3.0;

    // Feed the picture through the adaptive threshold filter.
    [imageSource addTarget:stillImageFilter];
    [imageSource processImage];

    UIImage *imageWithAppliedThreshold = [stillImageFilter imageFromCurrentlyProcessedOutput];
    // UIImage *destImage = [thresholdFilter imageByFilteringImage:grayScaledImg];
    return imageWithAppliedThreshold;
}

score 2 · Accepted Answer

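For the preprocessing step, you need adaptive thresholding here.

I got these results using OpenCV's grayscale conversion and adaptive threshold methods. Adding low-pass noise filtering (Gaussian or median) should make it work like a charm.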
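As an illustration of the median variant of that low-pass step, here is a small self-contained Swift sketch. `medianFilter3x3` is a hypothetical helper written for this example, not an OpenCV or GPUImage call, and it assumes a rectangular grayscale image stored as rows of values in 0...1.

```swift
// Hypothetical helper: 3x3 median filter over a grayscale image stored as rows.
// Assumes every row has the same length. Border pixels are copied unchanged
// to keep the sketch short.
func medianFilter3x3(_ image: [[Double]]) -> [[Double]] {
    let height = image.count
    guard height >= 3, let width = image.first?.count, width >= 3 else { return image }
    var output = image
    for y in 1..<(height - 1) {
        for x in 1..<(width - 1) {
            // Collect the 9 samples around (x, y).
            var window: [Double] = []
            for dy in -1...1 {
                for dx in -1...1 {
                    window.append(image[y + dy][x + dx])
                }
            }
            window.sort()
            output[y][x] = window[4]  // middle of the 9 sorted samples
        }
    }
    return output
}

// A single dark speck on light paper is removed.
let noisy: [[Double]] = [
    [0.8, 0.8, 0.8],
    [0.8, 0.0, 0.8],
    [0.8, 0.8, 0.8],
]
print(medianFilter3x3(noisy)[1][1])  // prints 0.8
```

A median filter removes isolated specks without averaging them into their neighbours, which is why it is a common choice before thresholding. Whether it helps on these particular images would need to be tested.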

[image]

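I used provisia (a UI that helps you process images quickly) to find the block size I needed: 43 for the images you provided here. If you take the photo from closer or farther away, the block size may change. If you want a generic algorithm, you need to develop one that searches for the best size (searching until digits are detected).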
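To make the role of the block size concrete, here is a minimal pure-Swift sketch of mean adaptive thresholding. It is only loosely modeled on the idea described above and is not the OpenCV implementation: `adaptiveMeanThreshold`, the `offset` parameter, and the sample values are assumptions made for illustration.

```swift
// Hypothetical helper: a pixel becomes white when it is brighter than the mean
// of the blockSize x blockSize window around it, minus `offset`.
// Pixels are integers in 0...255; blockSize must be odd.
func adaptiveMeanThreshold(_ image: [[Int]], blockSize: Int, offset: Int) -> [[Bool]] {
    let height = image.count
    guard height > 0, let width = image.first?.count, width > 0,
          blockSize > 0, blockSize % 2 == 1 else { return [] }

    // Integral image: sums[y][x] holds the sum of all pixels above and left of (x, y).
    var sums = [[Int]](repeating: [Int](repeating: 0, count: width + 1), count: height + 1)
    for y in 0..<height {
        for x in 0..<width {
            sums[y + 1][x + 1] = image[y][x] + sums[y][x + 1] + sums[y + 1][x] - sums[y][x]
        }
    }

    let radius = blockSize / 2
    var result = [[Bool]](repeating: [Bool](repeating: false, count: width), count: height)
    for y in 0..<height {
        for x in 0..<width {
            // Clamp the window to the image bounds.
            let x0 = max(x - radius, 0), x1 = min(x + radius, width - 1)
            let y0 = max(y - radius, 0), y1 = min(y + radius, height - 1)
            let count = (x1 - x0 + 1) * (y1 - y0 + 1)
            let total = sums[y1 + 1][x1 + 1] - sums[y0][x1 + 1] - sums[y1 + 1][x0] + sums[y0][x0]
            // pixel > mean - offset, multiplied through by count to stay in integers.
            result[y][x] = image[y][x] * count > total - offset * count
        }
    }
    return result
}

// Made-up row: paper fades from 240 to 120, with ink at indices 2 and 6.
let row = [[240, 220, 120, 200, 180, 160, 60, 140, 120]]
print(adaptiveMeanThreshold(row, blockSize: 3, offset: 15)[0])
// prints [true, true, false, true, true, true, false, true, true]
```

The last paper pixel (120) has the same value as the first ink pixel (120), so no global threshold could separate them, while the local mean does. A block size of 3 only suits this tiny example. On real photos the window has to be larger than the stroke width, which is why the value depends on how close the camera was.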

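EDIT: I just saw the last image. It is too small to process. Even if you apply the best preprocessing algorithm, you will not detect those digits. Upsampling is not a solution, because it would introduce noise.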

score 1 · Accepted Answer

I finally explored this myself, and here is my result with the GPUImage filter:

+ (UIImage *) doBinarize:(UIImage *)sourceImage
{
    // First, convert the image to grayscale using a UIKit drawing routine.
    UIImage * grayScaledImg = [self grayImage:sourceImage];
    GPUImagePicture *imageSource = [[GPUImagePicture alloc] initWithImage:grayScaledImg];
    GPUImageAdaptiveThresholdFilter *stillImageFilter = [[GPUImageAdaptiveThresholdFilter alloc] init];
    stillImageFilter.blurSize = 8.0;

    [imageSource addTarget:stillImageFilter];
    [imageSource processImage];

    UIImage *retImage = [stillImageFilter imageFromCurrentlyProcessedOutput];
    return retImage;
}

+ (UIImage *) grayImage :(UIImage *)inputImage
{    
    // Create a graphics context.
    UIGraphicsBeginImageContextWithOptions(inputImage.size, NO, 1.0);
    CGRect imageRect = CGRectMake(0, 0, inputImage.size.width, inputImage.size.height);

    // Fill with white first: the luminosity blend needs an opaque white backdrop.
    [[UIColor whiteColor] setFill];
    UIRectFill(imageRect);

    // Draw the image with the luminosity blend mode.
    // On top of the white background, this gives a grayscale image.
    [inputImage drawInRect:imageRect blendMode:kCGBlendModeLuminosity alpha:1.0];

    // Get the resulting image.
    UIImage *outputImage = UIGraphicsGetImageFromCurrentImageContext();
    UIGraphicsEndImageContext();

    return outputImage;
}

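I achieved nearly 90% with this. I am sure there must be better options, but I experimented with blurSize as far as I could, and 8.0 is the value that works for most of my input images.

To anyone else trying this, good luck with your attempt!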

score 0 · Accepted Answer

Swift 3

Solution 1

extension UIImage {

    // Grayscale the image, then binarize it with GPUImage's adaptive threshold filter.
    func doBinarize() -> UIImage? {
        guard let grayScaledImg = self.grayImage(),
              let imageSource = GPUImagePicture(image: grayScaledImg) else {
            print("unable to prepare the image source")
            return nil
        }
        let stillImageFilter = GPUImageAdaptiveThresholdFilter()
        stillImageFilter.blurRadiusInPixels = 8.0

        imageSource.addTarget(stillImageFilter)
        // Request the capture before processing, so the output framebuffer is kept.
        stillImageFilter.useNextFrameForImageCapture()
        imageSource.processImage()

        guard let retImage = stillImageFilter.imageFromCurrentFramebuffer(with: UIImageOrientation.up) else {
            print("unable to obtain UIImage from filter")
            return nil
        }
        return retImage
    }

    // Convert to grayscale by drawing with the luminosity blend mode over white.
    func grayImage() -> UIImage? {
        UIGraphicsBeginImageContextWithOptions(self.size, false, 1.0)
        defer { UIGraphicsEndImageContext() }
        let imageRect = CGRect(x: 0, y: 0, width: self.size.width, height: self.size.height)

        // The luminosity blend needs an opaque white backdrop to produce gray.
        UIColor.white.setFill()
        UIRectFill(imageRect)
        self.draw(in: imageRect, blendMode: .luminosity, alpha: 1.0)

        return UIGraphicsGetImageFromCurrentImageContext()
    }
}

The result will be:

[image]

Solution 2

Use GPUImageLuminanceThresholdFilter to get a 100% black-and-white result with no gray:

   let stillImageFilter = GPUImageLuminanceThresholdFilter() 
   stillImageFilter.threshold = 0.9

For example, I needed to detect a camera flash, and this worked for me.
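A rough pure-Swift sketch of what a fixed luminance threshold does per pixel is shown here. The `RGB` struct and `luminanceThreshold` function are hypothetical names for this example, and the luminance weights are an assumption chosen for illustration; the filter's actual shader may use different ones.

```swift
// Hypothetical types for illustration only (not GPUImage API).
struct RGB {
    let r: Double
    let g: Double
    let b: Double
}

// Weighted luminance of each pixel, followed by a hard cut at `threshold`.
// Assumption: these weights are illustrative; the library's shader may differ.
func luminanceThreshold(_ pixels: [RGB], threshold: Double) -> [Double] {
    return pixels.map { p in
        let luminance = 0.2125 * p.r + 0.7154 * p.g + 0.0721 * p.b
        return luminance >= threshold ? 1.0 : 0.0  // pure white or pure black
    }
}

let pixels = [
    RGB(r: 1.0, g: 1.0, b: 1.0),  // saturated highlight, e.g. a flash
    RGB(r: 0.8, g: 0.8, b: 0.8),  // light gray
    RGB(r: 0.0, g: 1.0, b: 0.0),  // pure green
]
print(luminanceThreshold(pixels, threshold: 0.9))
// prints [1.0, 0.0, 0.0]
```

With a threshold as high as 0.9, only near-saturated pixels survive, which suits flash detection. For text under uneven lighting, the same fixed cut has the limitation described in the question.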
