python - Improving OCR on an image without scaling (using PIL, pixbuf)?

Question

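I'm trying to run OCR on screenshots. After the screenshot is taken (of the area on the desktop where you clicked), it goes into a pixbuf, and the pixbuf's contents are passed to pytesseract. But after going through the pixbuf the image quality is bad: it is skewed (I tried saving it to a directory instead of using the pixbuf, and looked at it).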

import gi
gi.require_version("Gdk", "3.0")
from gi.repository import Gdk
from PIL import Image
import pytesseract

def takeScreenshot(self, x, y, width = 150, height = 30):
    self.width=width
    self.height=height
    window = Gdk.get_default_root_window()
    #x, y, width, height = window.get_geometry()

    #print("The size of the root window is {} x {}".format(width, height))

    # get_from_drawable() was deprecated. See:
    # https://developer.gnome.org/gtk3/stable/ch24s02.html#id-1.6.3.4.7
    pixbufObj = Gdk.pixbuf_get_from_window(window, x, y, width, height)
    height = pixbufObj.get_height()
    width = pixbufObj.get_width()
    image = Image.frombuffer("RGB", (width, height),
                             pixbufObj.get_pixels(), 'raw', 'RGB', 0, 1)
    image = image.resize((width*20,height*20), Image.ANTIALIAS)
    #image.save("saved.png")
    print(pytesseract.image_to_string(image))

    print("takenScreenshot:",x,y)

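When I save the image to a directory, the quality is fine and recognition works well.
I tried without Image.ANTIALIAS: no difference.

(Purpose of scaling by 20: I tried the recognition code on an image saved to a directory, and without scaling the recognition quality was poor.)

[Image: the bad picture]

The problem is that the image is skewed.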

score 2 · Accepted Answer

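I don't know if you're still looking for a solution, but I ran into the same problem with skewed images. It's some kind of padding issue with GdkPixbuf. Basically, the height and width of the image should always be divisible by 8. So this is what I did before taking the screenshot: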

# Round up to the next multiple of 8; values already divisible by 8 stay unchanged.
width = width + ((-width) % 8)
height = height + ((-height) % 8)

After doing this, the screenshot should work.

You can read more about the issue here.
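One plausible reading of the padding issue above is row stride: each row in the pixbuf buffer may be followed by a few padding bytes, so a reader that assumes tightly packed rows drifts a little further on every row, which looks like skew. The sketch below only illustrates that idea with synthetic bytes. `strip_row_padding` is a hypothetical helper, and the 4-byte row alignment is an assumption made for the illustration, not something stated in the answer.

```python
def strip_row_padding(pixels, width, height, rowstride, channels=3):
    """Return tightly packed pixel bytes, dropping any per-row padding.

    Hypothetical helper: `rowstride` is the number of bytes from the start
    of one row to the start of the next, which can exceed width * channels.
    """
    row_bytes = width * channels
    if rowstride < row_bytes:
        raise ValueError("rowstride is smaller than one row of pixels")
    return b"".join(
        bytes(pixels[y * rowstride : y * rowstride + row_bytes])
        for y in range(height)
    )


# Synthetic illustration using the question's default size (150 x 30, RGB).
width, height, channels = 150, 30, 3
row_bytes = width * channels              # 450 bytes of pixel data per row
rowstride = (row_bytes + 3) // 4 * 4      # assumed 4-byte row alignment

# Build a fake padded buffer: row y is filled with the byte value y.
padded = bytearray()
for y in range(height):
    padded += bytes([y]) * row_bytes + b"\x00" * (rowstride - row_bytes)

packed = strip_row_padding(bytes(padded), width, height, rowstride)
```

With a real pixbuf, the values would presumably come from `pixbufObj.get_pixels()`, `pixbufObj.get_rowstride()` and `pixbufObj.get_n_channels()`, and the packed bytes could be handed to `Image.frombytes("RGB", (width, height), packed)`. Passing the rowstride as the stride argument of `Image.frombuffer` (the `0` in the question's call) may achieve the same thing without copying.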

score 2 · Accepted Answer

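That kind of extreme scaling is generally bad for OCR, especially in full color and with special processing (antialiasing).

I would:

Reduce the upscaling (maybe none at all?), or use NEAREST
Convert to grayscale right after loading (to avoid the artifacts you are seeing):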
```
image = image.convert('L')
```
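The two suggestions above could be combined roughly as follows. This is a minimal sketch, not a tested OCR recipe: `prepare_for_ocr` and its default `scale` of 2 are hypothetical choices for illustration, and the best factor likely depends on the font size in the screenshot.

```python
from PIL import Image


def prepare_for_ocr(image, scale=2):
    """Convert to grayscale, then upscale by a small integer factor.

    Hypothetical helper: NEAREST resampling repeats pixels instead of
    blending them, so it adds no antialiasing artifacts.
    """
    gray = image.convert("L")
    if scale > 1:
        gray = gray.resize(
            (gray.width * scale, gray.height * scale), Image.NEAREST
        )
    return gray


# Stand-in for a 150 x 30 screenshot region (plain white RGB image).
screenshot = Image.new("RGB", (150, 30), (255, 255, 255))
prepared = prepare_for_ocr(screenshot, scale=2)
```

The prepared image can then be passed to `pytesseract.image_to_string` in place of the 20x antialiased one. Setting `scale=1` skips the resize entirely, which matches the "none at all" option.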
