python - How to speed up processing a large number of patches in an image?

Question

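I wrote a function to process an image: I extract many patches and process each one with the same function (func) to produce a new image. However, this is very slow because of the two loops, the cost of func, the number of patches, and the patch size. I don't know how to speed this code up.

The function is as follows.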

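# code1
import numpy as np

def filter(img, func, ksize, strides=1):
    # strides is accepted but not used; every patch position is visited
    height, width = img.shape
    f_height, f_width = ksize
    new_height = height - f_height + 1
    new_width = width - f_width + 1

    new_img = np.zeros((new_height, new_width))

    # slide a (f_height, f_width) window over the image and apply func to each patch
    for i in range(new_height):
        for j in range(new_width):
            patch = img[i:i+f_height, j:j+f_width]
            new_img[i, j] = func(patch)

    return new_img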

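func can be arbitrary and time-consuming. Here is one example. The function below computes the center pixel of the patch divided by the median of the patch. However, I do not want pixels whose value is 255 to be included in the median (255 is the default value for invalid pixels), so I use a NumPy masked array. The masked array makes the code several times slower, and I don't know how to optimize it.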

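# code2
def relative_median_and_center_diff(patch, in_the_boundary, rectangle, center_point):
    # in_the_boundary and rectangle are not used in this function
    mask = patch == 255          # mask out invalid pixels
    mask[center_point] = True    # also exclude the center pixel from the median
    masked_patch = np.ma.array(patch, mask=mask)
    count = masked_patch.count() # number of unmasked pixels
    if count <= 1:
        return 0
    else:
        return patch[center_point] / (np.ma.median(masked_patch) + 1)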
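
One possible way to avoid the masked-array overhead in code2, sketched under the assumption that only the median of the valid pixels is needed: select the valid pixels with a boolean index and call plain np.median on them. The name relative_median_fast and the invalid parameter are hypothetical, and the unused in_the_boundary and rectangle arguments are dropped here.

```python
import numpy as np

def relative_median_fast(patch, center_point, invalid=255):
    """Center pixel divided by (median of valid neighbours + 1).

    Hypothetical rewrite of code2 that uses boolean indexing
    instead of np.ma; center_point is a (row, col) tuple.
    """
    keep = patch != invalid          # True where the pixel is valid
    keep[center_point] = False       # leave the center out of the median
    valid = patch[keep]              # 1-D array of the remaining pixels
    if valid.size <= 1:
        return 0
    return patch[center_point] / (np.median(valid) + 1)
```

This should return the same values as code2, since np.median and np.ma.median both average the two middle elements for an even count. How much time it saves depends on the patch size and is worth measuring.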

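Ideas I have tried or come up with:

Before the loop, I used some NumPy-based functions to extract the patches, expecting this to be faster than patch = img[i:i+f_height,j:j+f_width]. I found patch-extraction functions in "Extracting patches of a certain size from the image in python efficiently". First I tried view_as_windows from skimage.util.shape; the code changed as shown below. It takes more time than code1. I also tried sklearn.feature_extraction.image.extract_patches_2d and found it faster than code3 but still slower than code1. (Can anyone tell me why?)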

# code3
import numpy as np
from skimage.util.shape import view_as_windows

def filter(img, func, ksize, strides=1):
    height, width = img.shape
    f_height, f_width = ksize
    new_height = height - f_height + 1
    new_width = width - f_width + 1

    new_img = np.zeros((new_height, new_width))

    # patches has shape (new_height, new_width, f_height, f_width)
    patches = view_as_windows(img, (f_height, f_width))

    # func is still called once per patch
    for i in range(new_height):
        for j in range(new_width):
            patch = patches[i, j]
            new_img[i, j] = func(patch)

    return new_img

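This operation is a bit like a convolution or a filter, except for func. I wonder how those libraries handle this; could you give me some pointers?
Can the two loops be avoided in this case? That might speed up the code.
I have a GPU. Can I change the code to run on the GPU and process the patches in parallel to make it faster?
Rewrite the code in C. This is the last thing I want to do, because it could get messy.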
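
As a rough sketch of how the two loops might be avoided for the specific func in code2: build all windows at once and reduce over the window axis with np.nanmedian. This swaps in numpy.lib.stride_tricks.sliding_window_view (NumPy 1.20 or later) in place of view_as_windows, assumes center_point is the geometric center of the patch, and uses hypothetical names (relative_median_filter, invalid). It does not generalize to an arbitrary func, and it materializes a float copy of every window, so memory grows with the number of patches times the patch size.

```python
import warnings

import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def relative_median_filter(img, ksize, invalid=255):
    f_height, f_width = ksize
    # Read-only strided view of shape (out_h, out_w, f_height, f_width)
    windows = sliding_window_view(img, (f_height, f_width))
    out_h, out_w = windows.shape[:2]
    # Flatten each window into one row; astype(float) makes a writable copy
    flat = windows.reshape(out_h, out_w, -1).astype(float)

    # Assumption: the center point is the middle pixel of the patch
    center_idx = (f_height // 2) * f_width + (f_width // 2)
    center = flat[:, :, center_idx].copy()

    # Mark invalid pixels and the center as NaN so nanmedian skips them
    flat[flat == invalid] = np.nan
    flat[:, :, center_idx] = np.nan
    count = np.sum(~np.isnan(flat), axis=-1)

    with warnings.catch_warnings():
        # Windows with no valid pixel produce an "All-NaN slice" warning
        warnings.simplefilter("ignore", RuntimeWarning)
        med = np.nanmedian(flat, axis=-1)

    # Same rule as code2: 0 where at most one valid pixel remains
    result = np.zeros((out_h, out_w))
    ok = count > 1
    result[ok] = center[ok] / (med[ok] + 1)
    return result
```

Calling relative_median_filter(img, (3, 3)) on a 2-D array returns an array of shape (height - 2, width - 2), matching new_img in code1.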

Can you give me some ideas or suggestions?

score 0 · Accepted Answer

If your computer has more than one CPU, you can distribute the work by submitting it to a ThreadPoolExecutor.

Your code should look something like this:

from concurrent.futures import ThreadPoolExecutor, as_completed
from multiprocessing import cpu_count

executor = ThreadPoolExecutor(max_workers=cpu_count())
future_to_item = {}

# submit one task per work item; func, data and args are your own
future = executor.submit(func, data, *args)
future_to_item[future] = data

for future in as_completed(future_to_item):
    result = future.result()
    # do something when you get the result

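I use ThreadPoolExecutor for image processing all the time.

Since we only have the function and don't know how your program works as a whole, have a look at the concurrency documentation for Python so you get a better idea of how to integrate this into your code: https://docs.python.org/3/library/concurrent.futures.html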
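
A minimal sketch of how the executor idea could be applied to the loop in code1, assuming the output rows can be processed independently: split the rows into contiguous chunks and give each chunk to a worker. The helper names filter_rows and parallel_filter are hypothetical. Threads only help when func spends most of its time in code that releases the GIL (many NumPy routines do); for a pure-Python func, ProcessPoolExecutor from the same module has the same interface, but would need a picklable top-level function in place of the lambda.

```python
import os
from concurrent.futures import ThreadPoolExecutor

import numpy as np

def filter_rows(img, func, ksize, row_start, row_stop):
    """Apply func to every patch whose top row lies in [row_start, row_stop)."""
    f_height, f_width = ksize
    new_width = img.shape[1] - f_width + 1
    out = np.zeros((row_stop - row_start, new_width))
    for i in range(row_start, row_stop):
        for j in range(new_width):
            out[i - row_start, j] = func(img[i:i + f_height, j:j + f_width])
    return out

def parallel_filter(img, func, ksize, workers=None):
    workers = workers or os.cpu_count() or 1
    new_height = img.shape[0] - ksize[0] + 1
    # Split the output rows into roughly equal contiguous chunks
    bounds = np.linspace(0, new_height, workers + 1).astype(int)
    with ThreadPoolExecutor(max_workers=workers) as executor:
        # map() keeps the chunk order, so the parts can be stacked directly
        parts = executor.map(
            lambda b: filter_rows(img, func, ksize, b[0], b[1]),
            zip(bounds[:-1], bounds[1:]),
        )
        return np.vstack(list(parts))
```

For example, parallel_filter(img, np.mean, (3, 3)) computes a 3x3 mean over every patch position and returns the same array that code1 would.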
