当您想将这些点聚集在一起时,我假设您打算使用Chebyshev 距离。
在这种情况下,最直接的方法是使用Union Find 数据结构。
这是我使用的一个实现:
class UnionFind:
"""Union-find data structure. Items must be hashable."""
def __init__(self):
"""Create a new empty union-find structure."""
self.weights = {}
self.parents = {}
def __getitem__(self, obj):
"""X[item] will return the token object of the set which contains `item`"""
# check for previously unknown object
if obj not in self.parents:
self.parents[obj] = obj
self.weights[obj] = 1
return obj
# find path of objects leading to the root
path = [obj]
root = self.parents[obj]
while root != path[-1]:
path.append(root)
root = self.parents[root]
# compress the path and return
for ancestor in path:
self.parents[ancestor] = root
return root
def union(self, obj1, obj2):
"""Merges sets containing obj1 and obj2."""
roots = [self[obj1], self[obj2]]
heavier = max([(self.weights[r],r) for r in roots])[1]
for r in roots:
if r != heavier:
self.weights[heavier] += self.weights[r]
self.parents[r] = heavier
然后编写函数groupTPL
很容易:
def groupTPL(TPL, distance=1):
U = UnionFind()
for (i, x) in enumerate(TPL):
for j in range(i + 1, len(TPL)):
y = TPL[j]
if max(abs(x[0] - y[0]), abs(x[1] - y[1])) <= distance:
U.union(x, y)
disjSets = {}
for x in TPL:
s = disjSets.get(U[x], set())
s.add(x)
disjSets[U[x]] = s
return [list(x) for x in disjSets.values()]
在您的设备上运行它会产生:
>>> groupTPL([(1, 1), (2, 1), (3, 2), (7, 5), (2, 7), (6, 4), (2, 3), (2, 6), (3, 1)])
[
[(2, 7), (2, 6)],
[(6, 4), (7, 5)],
[(3, 2), (3, 1), (2, 3), (1, 1), (2, 1)]
]
然而,这个实现虽然简单,但仍然是O(n^2)
. 如果点的数量变得非常大,一个有效的实现将使用kd 树。