
I have an array of 0s and 1s. As the image shows, the 1s form contiguous clusters.

[Image: clusters]

The number of clusters is not known in advance.

Is there a way to build a list of the positions of all clusters, or, for each cluster, a list of the positions of all its members? For example:

cluster_list = continuous_cluster_finder(data_array)
cluster_list[0] = [(pixel1_x, pixel1_y), (pixel2_x, pixel2_y),...]

2 Answers


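The exact constraints of the problem are not clear from the description. Assuming a cluster is delimited by zeros on its left, right, top, and bottom (that is, cells are connected only horizontally or vertically), the following solves the problem...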

#!/usr/bin/env python

data = [ #top-left
         [0,0,1,1,0,0],
         [0,0,1,1,0,0],
         [1,1,0,0,1,1],
         [1,1,0,0,1,1],
         [0,0,1,1,0,0],
         [0,0,1,1,0,0],
         [1,1,0,0,1,1],
         [1,1,0,0,1,1],
       ]             # bottom-right

d = {} # point --> clid
dcl = {} # clid --> [point1,point2,...]

def process_point(t):
    global clid # cluster id
    val = data[t[0]][t[1]]
    above = (t[0]-1, t[1])
    abovevalid = 0 <= above[0] < maxX and 0 <= above[1] < maxY
    #below = (t[0]+1, t[1]) # We do not need that because we scan from top-left to bottom-right
    left = (t[0], t[1]-1)
    leftvalid = 0 <= left[0] < maxX and 0 <= left[1] < maxY
    #right = (t[0], t[1]+1) # We do not need that because we scan from top-left to bottom-right

    if not val: # for zero return
        return
    if left in d and above in d and d[above] != d[left]:
        # left and above on different clusters, merge them
        prevclid = d[left]
        dcl[d[above]].extend(dcl[prevclid]) # update dcl
        for l in dcl[d[left]]:
            d[l] = d[above] # update d
        del dcl[prevclid]
        dcl[d[above]].append(t)
        d[t] = d[above]
    elif above in d and abovevalid:
        dcl[d[above]].append(t)
        d[t] = d[above]
    elif left in d and leftvalid:
        dcl[d[left]].append(t)
        d[t] = d[left]
    else: # First saw this one 
        dcl[clid] = [t]
        d[t] = clid
        clid += 1

def print_output():
    for k in dcl: # Print each cluster id with its list of points
        print(k, dcl[k])

def main():
    global clid
    global maxX
    global maxY
    maxX = len(data)     # number of rows
    maxY = len(data[0])  # number of columns
    clid = 0
    # Scan from top-left to bottom-right
    for i in range(maxX):
        for j in range(maxY):
            process_point((i,j))
    print_output()

if __name__ == "__main__":
    main()

It prints...

0 [(0, 2), (0, 3), (1, 2), (1, 3)]
1 [(2, 0), (2, 1), (3, 0), (3, 1)]
2 [(2, 4), (2, 5), (3, 4), (3, 5)]
3 [(4, 2), (4, 3), (5, 2), (5, 3)]
4 [(6, 0), (6, 1), (7, 0), (7, 1)]
5 [(6, 4), (6, 5), (7, 4), (7, 5)]
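
The merge branch above relabels every point of the absorbed cluster each time two clusters meet. A common alternative for the same top-left to bottom-right scan is a union-find (disjoint-set) structure, which records only that two clusters are the same and resolves the final labels at the end. The following is a minimal sketch of that variant, not the code from this answer; `label_clusters` and `find` are hypothetical names, and it makes the same assumption that cells connect only horizontally or vertically.

```python
def label_clusters(grid):
    """Group 4-connected 1-cells; returns {root_cell: [(row, col), ...]}."""
    parent = {}  # union-find forest keyed by cell position

    def find(p):
        # Follow parent links to the root, halving the path as we go.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for r, row in enumerate(grid):
        for c, val in enumerate(row):
            if not val:
                continue
            parent[(r, c)] = (r, c)
            # Only the cells above and to the left have been scanned already.
            for nb in ((r - 1, c), (r, c - 1)):
                if nb in parent:
                    # Union: attach this cell's root to the neighbour's root.
                    parent[find((r, c))] = find(nb)

    # Collect the members of each cluster under its root cell.
    clusters = {}
    for p in parent:
        clusters.setdefault(find(p), []).append(p)
    return clusters
```

Calling `list(label_clusters(data).values())` gives one list of positions per cluster, which is the shape the question asks for.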
answered 2012-12-12T04:17:55.427

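You can look at the well-known "blob" finding algorithms, which are used in image processing to isolate regions of the same color. You can also roll your own: start with every pixel marked unvisited, find an island, and mark its pixels as visited as you go. All pixels that are connected (8-connectivity: the neighbors of the center pixel in a 3x3 grid) and visited together form one region; you need to find all such regions in the map.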

Blob finding is the term to search for.
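
The visit-and-mark idea described above can be sketched as a breadth-first flood fill. This is only an illustration under stated assumptions: the function name `continuous_cluster_finder` is borrowed from the question, the `eight_connected` parameter is hypothetical, and the input is assumed to be a rectangular list of lists (or any 2-D object indexable as `data_array[row][col]`).

```python
from collections import deque

def continuous_cluster_finder(data_array, eight_connected=True):
    """Return a list of clusters; each is a list of (row, col) positions of 1s."""
    rows = len(data_array)
    cols = len(data_array[0]) if rows else 0
    # Neighbour offsets: 4-connectivity, optionally extended with diagonals.
    offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if eight_connected:
        offsets += [(-1, -1), (-1, 1), (1, -1), (1, 1)]

    visited = set()
    cluster_list = []
    for r in range(rows):
        for c in range(cols):
            if not data_array[r][c] or (r, c) in visited:
                continue
            # Unvisited 1: start a new region and flood-fill outward from it.
            visited.add((r, c))
            queue = deque([(r, c)])
            cluster = []
            while queue:
                cr, cc = queue.popleft()
                cluster.append((cr, cc))
                for dr, dc in offsets:
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and data_array[nr][nc]
                            and (nr, nc) not in visited):
                        visited.add((nr, nc))
                        queue.append((nr, nc))
            cluster_list.append(cluster)
    return cluster_list
```

The choice of connectivity matters: on the sample grid from the other answer, `eight_connected=False` yields six clusters of four pixels each, while the default 8-connectivity joins them into a single cluster because the blocks touch at their corners.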

answered 2012-12-12T03:22:11.367