python - 从字典列表中获取最近的元素

Question

我的程序生成以下列表（摘录）：

my_list = [{'x': 1764, 'y': 18320, 'class': 'note', 'id': 'd1e2443'},
           {'x': 1764, 'y': 20030, 'class': 'note', 'id': 'd1e2591'},
           {'x': 1807, 'y': 12650, 'class': 'note', 'id': 'd1e1362'},
           {'x': 2243, 'y': 20120, 'class': 'note', 'id': 'd1e2609'},
           {'x': 2243, 'y': 22685, 'class': 'note', 'id': 'd1e2769'},
           {'x': 2257, 'y': 12560, 'class': 'note', 'id': 'd1e1380'},
           {'x': 2688, 'y': 20210, 'class': 'note', 'id': 'd1e2625'},
           {'x': 2707, 'y': 10040, 'class': 'note', 'id': 'd1e1194'},
           {'x': 2707, 'y': 12650, 'class': 'note', 'id': 'd1e1398'},
           {'x': 2707, 'y': 14720, 'class': 'note', 'id': 'd1e1571'},
           {'x': 2901, 'y': 18140, 'class': 'note', 'id': 'd1e2475'}]

它已经按“x”键的值排序。我正在尝试编写一个方法，该方法针对给定坐标返回此列表的两个元素的元组(xPos, yPos)：

左边最近的元素 ( x <= xPos)
最右边的元素 ( x > xPos)

距离就是欧几里得距离（“毕达哥拉斯”）。该函数的第二个参数是允许的最大距离：

def getNearest(noteList, posX, posY, maxDistance):
    [...]
    return leftElement, rightElement

我尝试使用 bisect 函数分别获取最近元素的插入点xPos以及 for xPos - maxDistance(case 'left') 和xPos + maxDistance(case 'right) 以缩小搜索区域。然后我计算了这个切片列表中每个剩余元素的距离

不知何故，这感觉非常不雅。有没有更好的方法来做到这一点？

编辑： 也许我的意图不是很清楚：我需要列表的两个元素。“2D 窗格”中最靠近左侧和右侧的元素。因此，我还需要考虑 y 坐标。

可能会发生（实际上几乎每次），就其 x 坐标而言最近的元素比具有接近 y 坐标的元素更远。

score 1 · Accepted Answer

这似乎是一个很好的解决方案，但从你描述的方式来看，我认为不需要做更多的事情。

bisect_left已经返回元素的索引，以满足您的第一个条件 ( x <= xPos - maxDistance)。从那里，我只需将元素一个一个地迭代到右边，直到你到达x > xPos + maxDistance.

考虑到 CPU 缓存，这可能会产生更好的性能，因为您正在迭代相邻位置而不是四处跳动

如果你开始处理大量的点或maxDistance，这个算法可能不够用。在这种情况下，您应该按照 wenzul 的建议考虑使用kd-tree 。

score 1 · Accepted Answer

关于欧几里得距离，您可以使用它min()来查找两侧最近的元素。posX

>>>import math
>>>def getNearest(noteList, posX, posY):
...    leftElement = min([i for i in my_list if i['x'] <= posX], key=lambda x:abs(math.sqrt((x['x']- posX)**2+(x['y']- posY)**2)))
...    rightElement = min([i for i in my_list if i['x'] > posX], key=lambda x:abs(math.sqrt((x['x']- posX)**2+(x['y']- posY)**2)))
...    return (leftElement, rightElement)


>>> getNearest(my_list, 2000, 2000)
({'y': 12650, 'x': 1807, 'class': 'note', 'id': 'd1e1362'}, {'y': 10040, 'x': 2707, 'class': 'note', 'id': 'd1e1194'})

>>> getNearest(my_list, 2000, 20000)
({'y': 20030, 'x': 1764, 'class': 'note', 'id': 'd1e2591'}, {'y': 20120, 'x': 2243, 'class': 'note', 'id': 'd1e2609'})

其中Key是 2D 窗格中每个元素与传递给函数的元素之间的欧几里得距离，即(posX, PosY)。

score 0 · Accepted Answer

我试图将我最初的想法与答案中的一些建议合并。这就是我想出的：

class translatedDictList(object):
    def __init__(self, dictList, key):
        self.dictList = dictList
        self.key = key

    def __getitem__(self, item):
        return self.dictList[item][self.key]

    def __len__(self):
        return self.dictList.__len__()

def getNearest(self, symbolList, pos, maxDistance):
    translatedList = translatedDictList(symbolList, 'x')

    splitIndex = bisect.bisect(translatedList, pos[0])
    minIndex = bisect.bisect(translatedList, pos[0] - maxDistance)
    maxIndex = bisect.bisect(translatedList, pos[0] + maxDistance)

    # Euclidean distance acutally not needed anymore!
    leftElement = min(symbolList[minIndex:splitIndex],
                      key=lambda n: abs(n['x'] - pos[0]) +
                                    abs(n['y'] - pos[1]))
    rightElement = min(symbolList[splitIndex:maxIndex],
                       key=lambda n: abs(n['x'] - pos[0]) +
                                     abs(n['y'] - pos[1]))

    return leftElement, rightElement

print(getNearest(self.symbolsSorted, (1200, 1500), 1000))

也许有一种更聪明的方式来翻译symbolList以便使用bisect()？

它应该是o(log*n)，据我所知，我什至不再需要计算欧几里德距离，因为我只是比较而不对实际距离感兴趣。

python - 从字典列表中获取最近的元素

3 回答 3

Related

Reference