python-3.x - 如何使用 VIPS 进行图像标准化？

Question

我想标准化一组图像的曝光和调色板。对于上下文，这是用于在医学图像的图像分类中训练神经网络。我也在为数十万张图像执行此操作，因此效率非常重要。

到目前为止，我一直在使用 VIPS，特别是 PyVIPS，并且更喜欢使用该库的解决方案。在找到这个答案并查看文档后，我尝试了

x = pyvips.Image.new_from_file('test.ndpi')
x = x.hist_norm()
x.write_to_file('test_normalized.tiff')

但这似乎总是会产生纯白色的图像。

score 2 · Accepted Answer

您需要hist_equal进行直方图均衡。

主要文档在这里：

https://libvips.github.io/libvips/API/current/libvips-histogram.html

但是，对于大型幻灯片图像，这将非常慢。它需要扫描整个幻灯片一次以构建直方图，然后再次扫描以使其均衡。找到低分辨率层的直方图会快得多，然后用它来均衡高分辨率层。

例如：

#!/usr/bin/env python3

import sys
import pyvips

# open the slide image and get the number of layers ... we are not fetching 
# pixels, so this is quick
x = pyvips.Image.new_from_file(sys.argv[1])
levels = int(x.get("openslide.level-count"))

# find the histogram of the highest level ... again, this should be quick
x = pyvips.Image.new_from_file(sys.argv[1], 
                               level=levels - 1)
hist = x.hist_find()

# from that, compute the transform for histogram equalisation
equalise = hist.hist_cum().hist_norm()

# and use that on the full-res image
x = pyvips.Image.new_from_file(sys.argv[1])

x = x.maplut(equalise)

x.write_to_file(sys.argv[2])

另一个因素是直方图均衡是非线性的，因此会扭曲亮度关系。它还可以扭曲颜色关系并使噪声和压缩伪影看起来很疯狂。我在这里的图像上尝试了该程序：

$ ~/try/equal.py bild.ndpi[level=7] y.jpg

条纹来自幻灯片扫描仪，而丑陋的条纹来自压缩。

我想我会从低分辨率级别找到图像最大值和最小值，然后使用它们对像素值进行简单的线性拉伸。

就像是：

x = pyvips.Image.new_from_file(sys.argv[1])
levels = int(x.get("openslide.level-count"))
x = pyvips.Image.new_from_file(sys.argv[1],
                               level=levels - 1)
mn = x.min()
mx = x.max()
x = pyvips.Image.new_from_file(sys.argv[1])
x = (x - mn) * (256 / (mx - mn))
x.write_to_file(sys.argv[2])

你发现Regionpyvips的新功能了吗？它可以更快地生成用于训练的补丁，在某些情况下可以快 100 倍：

https://github.com/libvips/pyvips/issues/100#issuecomment-493960943

python-3.x - 如何使用 VIPS 进行图像标准化？

1 回答 1

Related

Reference