python - 从可能不完整的候选列表构建二维网格

Question

问题

我需要使用一组候选位置（X和中的值Y）构建一个二维网格。但是，可能存在应该过滤掉的误报候选，以及误报（在给定周围位置值的情况下，需要为预期位置创建位置）。可以预期网格的行和列是直的，并且旋转（如果有的话）很小。

此外，我没有关于 (0, 0) 网格位置的可靠信息。不过我知道：

grid_size = (4, 4)

expected_distance = 105

（除外距离只是对网格点间距的粗略估计，应允许在 10% 的范围内变化）。

示例数据

这是理想的数据，没有误报，也没有误报。该算法需要能够处理删除多个数据点并添加错误的数据点。

X = np.array([61.43283582, 61.56626506, 62.5026738,   65.4028777, 167.03030303, 167.93965517, 170.82191781, 171.37974684, 272.02884615, 272.91089109, 274.1031746, 274.22891566, 378.81553398, 379.39534884, 380.68181818, 382.67164179])

Y = np.array([55.14427861, 160.30120482, 368.80213904, 263.12230216, 55.1030303, 263.64655172, 162.67123288, 371.36708861, 55.59615385, 264.64356436, 368.20634921, 158.37349398, 54.33980583, 160.55813953,  371.72727273,  266.68656716])

代码

以下函数评估候选并返回两个字典。

第一个具有每个候选位置（作为 2 长度元组）作为键和值是相邻位置右侧和下方的 2 长度元组（使用图像显示方式的逻辑）。这些邻居本身要么是 2 长度的元组坐标，要么是None.

第二个字典是第一个字典的反向查找，这样每个候选人（职位）都有一个支持它的其他候选人职位的列表。

import numpy as np
from collections import defaultdict

def get_neighbour_grid(X, Y, expect_dist=(105, 105)):

    t1 = (expect_dist[0] + expect_dist[1]) / 2.0 * 0.9
    t2 = t1 * 1.222

    def neighbours(x, y):

        nRight = None
        ideal = x + expect_dist[0]
        D = np.sqrt((X - ideal)**2 + (Y - y)**2)
        candidate = (X[D.argmin()], Y[D.argmin()])
        if candidate != (x, y) and x + t2 > candidate[0] > x + t1:
            nRight = candidate

        nBelow = None
        ideal = y + expect_dist[0]
        D = np.sqrt((X - x)**2 + (Y - ideal)**2)
        candidate = (X[D.argmin()], Y[D.argmin()])
        if candidate != (x, y) and y + t2 > candidate[1] > y + t1:
            nBelow = candidate

        return nRight, nBelow

    right_below_neighbours = dict()
    def _default_val(*args):
        return list()
    reverse_lookup = defaultdict(_default_val)

    for pos in np.arange(X.size):

        pos_tuple = (X[pos], Y[pos])
        n  = neighbours(*pos_tuple)
        right_below_neighbours[pos_tuple] = n
        reverse_lookup[n[0]].append(pos_tuple)
        reverse_lookup[n[1]].append(pos_tuple)

    return right_below_neighbours, reverse_lookup

这是我卡住的地方：

如何使用这些字典和/或X构建Y最受支持的网格？

我有一个想法，从 2 个邻居支持的最右下方的候选者开始，并使用reverse_lookup字典迭代地创建网格。但是这种设计有几个缺陷，最明显的是我不能指望检测到较低、最右边的候选人及其支持的邻居。

那个代码，虽然它不会运行，因为当我意识到它有多大问题时我放弃了它（pre_grid = right_below_neighbours）：

def build_grid(pre_grid, reverse_lookup, grid_shape=(4, 4)):

    def _default_val(*args):
        return 0

    grid_pos_support = defaultdict(_default_val)
    unsupported = 0

    for l, b in pre_grid.values():

        if l is not None:
            grid_pos_support[l] += 1
        else:
            unsupported += 1
        if b is not None:
            grid_pos_support[b] += 1
        else:
            unsupported += 1

    well_supported = list()
    for pos in grid_pos_support:
        if grid_pos_support[pos] >= 2:
            well_supported.append(pos)

    well_A = np.asarray(well_supported)
    ur_pos = well_A[well_A.sum(axis=1).argmax()]

    grid = np.zeros(grid_shape + (2,), dtype=np.float)
    grid[-1,-1,:] = ur_pos

    def _iter_build_grid(pos, ref_pos=None):

        isX = pre_grid[tuple(pos)][0] == ref_pos
        if ref_pos is not None:
            oldCoord = map(lambda x: x[0], np.where(grid == ref_pos)[:-1])
            myCoord = (oldCoord[0] - int(isX), oldCoord[1] - int(not isiX))

        for p in reverse_lookup[tuple(pos)]:

            _iter_build_grid(p, pos)

    _iter_build_grid(ur_pos)

    return grid

第一部分可能很有用，因为它总结了每个位置的支持。它还显示了我需要的最终输出（grid）：

一个 3D 数组，第一个维度是网格的形状，第三个维度是长度为 2（每个位置的 x 坐标和 y 坐标）。

回顾

所以我意识到我的尝试是如何无用的，但我不知道如何对所有候选人进行全局评估，并使用候选人的 x 和 y 值放置最受支持的网格。因为这是一个非常复杂的问题，我真的不希望有人给出一个完整的解决方案（虽然它会很棒），但是任何关于可以使用什么类型的算法或 numpy/scipy 函数的提示都会不胜感激。

最后，很抱歉这是一个有点冗长的问题。

编辑

绘制我想要发生的事情：

它应该如何工作的草图

星/点是X并Y绘制了两个修改，我删除了第一个位置并添加了一个错误的位置，以使其成为所寻求算法的完整示例。

换句话说，我想要的是映射红圈位置的新坐标值（写在它们旁边的坐标值），以便我可以从新坐标（例如(1, 1) -> (170.82191781, 162.67123288)）中获取旧坐标。我还希望丢弃不接近真实点描述的理想网格的点（如图所示），最后使用理想网格参数（大致）“填充”空的理想网格位置（蓝色圆圈(0, 0) -> (55, 55)）。

解决方案

我使用@skymandr 提供的代码来获取理想的参数，然后执行以下操作（不是最漂亮的代码，但它有效）。这意味着我不再使用get_neighbour_grid-function。：

def build_grid(X, Y, x_offset, y_offset, dx, dy, grid_shape=(16,24),
    square_distance_threshold=None):

    if square_distance_threshold is None:
        square_distance_threshold = ((dx + dy) / 2.0 * 0.05) ** 2

    grid = np.zeros(grid_shape + (2,), dtype=np.float)

    D = np.zeros(grid_shape)
    for i in range(grid_shape[0]):
        for j in range(grid_shape[1]):
            D[i,j] = i * (1 + 1.0 / (grid_shape[0] + 1)) + j

    rD = D.ravel().copy()
    rD.sort()

    def find_valid(x, y):

        d = (X - x) ** 2 + (Y - y) ** 2
        valid = d < square_distance_threshold
        if valid.any():
            pos = d == d[valid].min()
            if pos.sum() == 1:
                return X[pos], Y[pos]

        return x, y

    x = x_offset
    y = y_offset
    first_loop = True

    for v in rD:
        #get new position
        coord = np.where(D == v)

        #generate a reference position already passed
        if coord[0][0] > 0:
            old_coord = (coord[0] - 1, coord[1])
        elif coord[1][0] > 0:
            old_coord = (coord[0], coord[1] - 1)

        if not first_loop:
            #calculate ideal step
            x, y = grid[old_coord].ravel()
            x += (coord[0] - old_coord[0]) * dx
            y += (coord[1] - old_coord[1]) * dy

        #modify with observed point close to ideal if exists
        x, y = find_valid(x, y)

        #put in grid
        #print coord, grid[coord].shape
        grid[coord] = np.array((x, y)).reshape(grid[coord].shape)

        first_loop = False


    return grid

它提出了另一个问题：如何很好地沿着 2D 数组的对角线进行迭代，但我认为这本身就值得提出一个问题：More numpy way of iterate through the 'orthogonal'对角线的 2D 数组

编辑

更新了解决方案代码以更好地处理更大的网格大小，以便它使用已传递的相邻网格位置作为所有位置的理想坐标的参考。仍然必须找到一种方法来实现从链接问题中遍历网格的更好方法。

score 5 · Accepted Answer

这是一个相当简单且便宜的解决方案，尽管我不知道它有多强大。

首先，这是一种更好地估计间距的方法：

leeway = 1.10

XX = X.reshape((1, X.size))
dX = np.abs(XX - XX.T).reshape((1, X.size ** 2))
dxs = dX[np.where(np.logical_and(dX > expected_distance / leeway,
                                 dX < expected_distance * leeway))]
dx = dxs.mean()

YY = Y.reshape((1, Y.size))
dY = np.abs(YY - YY.T).reshape((1, Y.size ** 2))
dys = dY[np.where(np.logical_and(dY > expected_distance / leeway,
                                 dY < expected_distance * leeway))]
dy = dys.mean()

该代码计算 X 和 Y 的内部差异，并取在所需间距 10% 以内的那些的平均值。

对于第二部分，找到网格的偏移量，可以使用类似的方法：

Ndx = np.array([np.arange(grid_size[0])]) * dx
x_offsets = XX - Ndx.T
x_offset = np.median(x_offsets)

Ndy = np.array([np.arange(grid_size[1])]) * dy
y_offsets = YY - Ndy.T
y_offset = np.median(y_offsets)

从本质上讲，这样做是让每个位置X“投票”为NX = grid_size[0]左下点可能所在的位置，基于对X - n * dx点n = 0本身n = 1的投票，是dx对左点的投票等。这样，靠近真实原点的点将获得最多的选票，并且可以使用中位数找到偏移量。

我认为这种方法在所需原点周围足够对称，因此中位数可以用于大多数（如果不是全部）情况。然而，如果存在许多误报，导致中位数由于某种原因不起作用，则可以使用例如直方图方法找到“真实”原点。

python - 从可能不完整的候选列表构建二维网格

1 回答 1

Related

Reference