python - ROI augmentation in keras: scipy.ndimage transformations

Question

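I have an image with regions of interest. I want to apply random transformations to this image while keeping the regions of interest correct.

My code takes a list of boxes in the format [x_min, y_min, x_max, y_max]. It then converts each box into a list of its vertices, [up_left, up_right, down_right, down_left]. This is a list of vectors, so I can apply the transformation to the vectors.

The next step is to find the new [x_min, y_min, x_max, y_max] in the list of transformed vertices.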
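The two conversions described above can be sketched roughly as follows. This is a minimal illustration, not the code from the pastebin: the bodies of boxes_to_vertices and vertices_to_boxes are assumptions (only the names appear in my code), and "up" is assumed to mean the smaller y value, as in image coordinates.

```python
import numpy as np

def boxes_to_vertices(boxes):
    """Turn [x_min, y_min, x_max, y_max] boxes into their four corner points.

    Assumed layout of the result: shape (n_boxes, 4, 2), corners ordered
    [up_left, up_right, down_right, down_left], each corner stored as (x, y).
    """
    boxes = np.asarray(boxes, dtype=float).reshape((-1, 4))
    x_min, y_min, x_max, y_max = boxes.T
    up_left = np.stack([x_min, y_min], axis=-1)
    up_right = np.stack([x_max, y_min], axis=-1)
    down_right = np.stack([x_max, y_max], axis=-1)
    down_left = np.stack([x_min, y_max], axis=-1)
    return np.stack([up_left, up_right, down_right, down_left], axis=1)

def vertices_to_boxes(vertices):
    """Find the axis-aligned box that encloses each group of four vertices."""
    vertices = np.asarray(vertices, dtype=float).reshape((-1, 4, 2))
    mins = vertices.min(axis=1)  # (x_min, y_min) per box
    maxs = vertices.max(axis=1)  # (x_max, y_max) per box
    return np.concatenate([mins, maxs], axis=1)

boxes = [[0.1, 0.2, 0.5, 0.6]]
vertices = boxes_to_vertices(boxes)
recovered = vertices_to_boxes(vertices)
```

Taking the minimum and maximum over the transformed corners means that a rotated box grows to the enclosing axis-aligned rectangle, which is the behaviour I want for the new [x_min, y_min, x_max, y_max].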

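My first application was rotation, and that works fine:

Here is the corresponding code. The first part is taken from the keras codebase; scroll down to the NEW CODE comment. If I get the code to work, I would be interested in integrating it into keras, so I am trying to fit my code into their image preprocessing infrastructure: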

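import numpy as np

# transform_matrix_offset_center and apply_transform are the existing helpers
# from the keras image preprocessing code that this function is added to.
def random_rotation_with_boxes(x, boxes, rg, row_axis=1, col_axis=2, channel_axis=0,
                    fill_mode='nearest', cval=0.):
    """Performs a random rotation of a Numpy image tensor.
       Also rotates the corresponding bounding boxes.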

    # Arguments
        x: Input tensor. Must be 3D.
        boxes: a list of bounding boxes [xmin, ymin, xmax, ymax], values in [0,1].
        rg: Rotation range, in degrees.
        row_axis: Index of axis for rows in the input tensor.
        col_axis: Index of axis for columns in the input tensor.
        channel_axis: Index of axis for channels in the input tensor.
        fill_mode: Points outside the boundaries of the input
            are filled according to the given mode
            (one of `{'constant', 'nearest', 'reflect', 'wrap'}`).
        cval: Value used for points outside the boundaries
            of the input if `mode='constant'`.

    # Returns
        Rotated Numpy image tensor,
        the rotated bounding boxes,
        and the rotated vertices.
    """

    # sample parameter for augmentation
    theta = np.pi / 180 * np.random.uniform(-rg, rg)

    # apply to image
    rotation_matrix = np.array([[np.cos(theta), -np.sin(theta), 0],
                                [np.sin(theta), np.cos(theta), 0],
                                [0, 0, 1]])

    h, w = x.shape[row_axis], x.shape[col_axis]
    transform_matrix = transform_matrix_offset_center(rotation_matrix, h, w)
    x = apply_transform(x, transform_matrix, channel_axis, fill_mode, cval)
    
    
    # -------------------------------------------------
    # NEW CODE FROM HERE
    # -------------------------------------------------
    # apply to vertices
    vertices = boxes_to_vertices(boxes)
    vertices = vertices.reshape((-1, 2))

    # apply offset to have pivot point at [0.5, 0.5]
    vertices -= [0.5, 0.5]

    # apply rotation, we only need the rotation part of the matrix
    vertices = np.dot(vertices, rotation_matrix[:2, :2])
    vertices += [0.5, 0.5]

    boxes = vertices_to_boxes(vertices)

    return x, boxes, vertices

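As you can see, they use scipy.ndimage to apply the transformation to the image.

My bounding boxes have coordinates in [0, 1], and the center is [0.5, 0.5]. The rotation needs to be applied around [0.5, 0.5] as the pivot point. It would be possible to use homogeneous coordinates and matrices to shift, rotate and shift back the vectors. That is what they do for the image. There is an existing function transform_matrix_offset_center, but it offsets to float(width)/2 + 0.5. The +0.5 makes it unsuitable for my coordinates in [0, 1], so I shift the vectors myself.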
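For reference, the shift, transform, shift-back idea in homogeneous coordinates could look like the sketch below. The helper names about_pivot and apply_homogeneous are made up for this illustration and are not part of keras. Note that this sketch multiplies the matrix from the left onto column vectors, which is not the same convention as np.dot(vertices, matrix) in my code.

```python
import numpy as np

def about_pivot(matrix, pivot=(0.5, 0.5)):
    """Wrap a 3x3 homogeneous matrix so that it acts around `pivot`.

    Hypothetical helper: shift the pivot to the origin, apply `matrix`,
    then shift back.
    """
    px, py = pivot
    to_origin = np.array([[1, 0, -px],
                          [0, 1, -py],
                          [0, 0, 1]], dtype=float)
    back = np.array([[1, 0, px],
                     [0, 1, py],
                     [0, 0, 1]], dtype=float)
    return back @ matrix @ to_origin

def apply_homogeneous(matrix, points):
    """Apply a 3x3 homogeneous matrix to an array of (x, y) points."""
    points = np.asarray(points, dtype=float).reshape((-1, 2))
    ones = np.ones((points.shape[0], 1))
    homogeneous = np.hstack([points, ones])  # shape (n, 3)
    # Column-vector convention: matrix @ point.
    return (matrix @ homogeneous.T).T[:, :2]

theta = np.pi / 2
rotation_matrix = np.array([[np.cos(theta), -np.sin(theta), 0],
                            [np.sin(theta), np.cos(theta), 0],
                            [0, 0, 1]])
pivoted = about_pivot(rotation_matrix)
moved = apply_homogeneous(pivoted, [[0.5, 0.5], [1.0, 0.5]])
```

With this construction the pivot [0.5, 0.5] stays where it is, and every other point moves around it, which is the same effect as subtracting and adding [0.5, 0.5] by hand.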

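For rotation, this code works fine. I assumed it would be generally applicable.

But for zooming, it fails in a strange way. The code is almost identical: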

vertices -= [0.5, 0.5]

# apply zoom, we only need the zoom part of the matrix
vertices = np.dot(vertices, zoom_matrix[:2, :2])
vertices += [0.5, 0.5]

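The output looks like this:

There seem to be several problems:

The shifting is broken. In image 1, the ROI and the corresponding part of the image barely overlap.
The coordinates seem to be switched. In image 2, the ROI and the image appear to be scaled differently along the x and y axes.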

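I tried switching the axes by using (zoom_matrix[:2, :2].T)[::-1, ::-1]. That leads to this:

Now the scale factor is broken? I have tried many variations of this matrix multiplication: transposing, mirroring, changing the scale factors, and so on. I can't seem to get it right.

And in any case, I think the original code should be correct. After all, it works for rotation. At this point I am wondering whether this is a peculiarity of scipy's ndimage resampling.

Is this an error in my math, or is something missing to truly emulate scipy's ndimage resampling?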

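I have put the full source code on pastebin. I only changed small parts; most of it is code from keras: https://pastebin.com/tsHnLLgy

The code that uses the new augmentations and creates these images is here: https://nbviewer.jupyter.org/gist/lhk/b8f30e9f30c5d395b99188a53524c53e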

Update:

If the zoom factors are inverted, the transformation works. For zooming, this operation is simple and can be written as:

# vertices is an array of shape [number of vertices, 2]
vertices *= [1/zx, 1/zy]

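This corresponds to applying the inverse transformation to the vertices. In the context of image resampling, that might make sense. An image could be resampled like this:

Create a coordinate vector for each pixel.
Apply the inverse transformation to each vector.
Interpolate the original image to find the value the vector now points at.
Write this value to the output image at the original position.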
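The four steps above can be sketched in plain NumPy as follows. This is only an illustration of the idea under simplifying assumptions (a 2D image, a 2x2 matrix acting on (row, col) vectors, nearest-neighbour lookup instead of real interpolation); the function resample_with_inverse is hypothetical and is not how scipy or keras implement it.

```python
import numpy as np

def resample_with_inverse(image, forward_matrix):
    """Resample a 2D image by looking up each output pixel in the input.

    `forward_matrix` is a 2x2 matrix acting on (row, col) column vectors.
    Hypothetical helper for illustration only.
    """
    h, w = image.shape
    inverse = np.linalg.inv(forward_matrix)

    # Step 1: one (row, col) coordinate vector per output pixel.
    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    out_rows, out_cols = rows.ravel(), cols.ravel()
    out_coords = np.stack([out_rows, out_cols]).astype(float)

    # Step 2: apply the inverse transformation to every vector.
    src = inverse @ out_coords

    # Step 3: read the original image at those positions
    # (nearest neighbour here, a real implementation would interpolate).
    src = np.rint(src).astype(int)
    inside = (src[0] >= 0) & (src[0] < h) & (src[1] >= 0) & (src[1] < w)

    # Step 4: write each value to the output at the original pixel position.
    output = np.zeros_like(image)
    output[out_rows[inside], out_cols[inside]] = image[src[0][inside], src[1][inside]]
    return output

image = np.zeros((4, 4))
image[1, 1] = 1.0
zoomed = resample_with_inverse(image, np.diag([2.0, 2.0]))
```

In this sketch the bright pixel moves from (1, 1) to (2, 2), so the content follows the forward matrix while the lookup uses its inverse. If a resampling routine instead uses the matrix it is given directly for the lookup in step 2, which is how I read the scipy.ndimage.affine_transform documentation, the image content would move by the inverse of that matrix, and that would be consistent with the vertices needing 1/zx and 1/zy.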

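But for rotation, I did not invert the matrix, and the operation works fine.

The practical question of how to fix this seems to be answered. But I don't understand why.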
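One possible lead on the why, offered as an observation rather than a confirmed explanation: np.dot(vertices, matrix) treats each vertex as a row vector, which is the same as multiplying the column vector by the transposed matrix. For a rotation matrix the transpose equals the inverse, so the rotation code may have been applying the inverse all along, while for a diagonal zoom matrix the transpose changes nothing. The sketch below checks this numerically; the sample angle, zoom factors and vertices are made up for illustration.

```python
import numpy as np

theta = np.deg2rad(30)
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta), np.cos(theta)]])
zoom = np.diag([2.0, 0.5])  # made-up zoom factors zx=2.0, zy=0.5
vertices = np.array([[0.3, -0.2],
                     [0.1, 0.4]])  # already shifted so the pivot is the origin

# Rotation: the row-vector product used in the question ...
row_rotated = np.dot(vertices, rotation)
# ... equals the inverse rotation applied to column vectors.
inverse_rotated = (np.linalg.inv(rotation) @ vertices.T).T
rotation_matches_inverse = np.allclose(row_rotated, inverse_rotated)

# Zoom: the same row-vector product equals the forward zoom, not the inverse.
row_zoomed = np.dot(vertices, zoom)
forward_zoomed = (zoom @ vertices.T).T
inverse_zoomed = (np.linalg.inv(zoom) @ vertices.T).T
zoom_matches_forward = np.allclose(row_zoomed, forward_zoomed)
zoom_matches_inverse = np.allclose(row_zoomed, inverse_zoomed)
```

Here rotation_matches_inverse and zoom_matches_forward come out True and zoom_matches_inverse comes out False. This does not address the possible mix-up between (x, y) and (row, col) ordering, which I have not verified.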
