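Another way to speed up your line search algorithm is to precompute the start point of each line, and then apply the expensive logic to compute the line from each of those points.
My view of the logic is limited (you did not provide the full line identification logic), but the start points at least can be computed in fast, vectorized code.
The first step toward implementing this in vectorized code is to find out which points are on a line while the point directly above them is not: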
import numpy
# using the array that was provided in the question
a = """0 x1 0 0 y1 0 z1
0 0 x2 0 y2 0 z2
0 0 x3 0 0 y3 z3
0 0 x4 0 0 y4 z4
0 x5 0 0 0 y5 z5
0 0 0 0 y6 0 0"""
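# split() with no argument splits on any whitespace, including the newlines;
# non-numeric tokens (x1, y1, ...) are replaced by their flat index
array = numpy.array([int(v) if v.isdigit() else i for i, v in enumerate(a.split())]).reshape(6, 7)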
This results in an array that looks like:
>>> print(repr(array))
array([[ 0,  1,  0,  0,  4,  0,  6],
       [ 0,  0,  9,  0, 11,  0, 13],
       [ 0,  0, 16,  0,  0, 19, 20],
       [ 0,  0, 23,  0,  0, 26, 27],
       [ 0, 29,  0,  0,  0, 33, 34],
       [ 0,  0,  0,  0, 39,  0,  0]])
From here, we can do some numpy rolling, which shifts every row down by one (the last row wraps around to the top):
>>> print(repr(numpy.roll(array, 1, axis=0)))
array([[ 0,  0,  0,  0, 39,  0,  0],
       [ 0,  1,  0,  0,  4,  0,  6],
       [ 0,  0,  9,  0, 11,  0, 13],
       [ 0,  0, 16,  0,  0, 19, 20],
       [ 0,  0, 23,  0,  0, 26, 27],
       [ 0, 29,  0,  0,  0, 33, 34]])
Combining the two gives us the vertical start points of the lines:
>>> potential_start_points = (array != 0) & (numpy.roll(array, 1, axis=0) == 0)
>>> # include the top row points, as they are certainly start points
>>> potential_start_points[0, :] = (array != 0)[0, :]
>>> print(repr(potential_start_points))
array([[False,  True, False, False,  True, False,  True],
       [False, False,  True, False, False, False, False],
       [False, False, False, False, False,  True, False],
       [False, False, False, False, False, False, False],
       [False,  True, False, False, False, False, False],
       [False, False, False, False,  True, False, False]])
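The wrap-around in `numpy.roll` is the reason the top row has to be patched afterwards. As a minimal sketch of an equivalent formulation, each row can instead be compared with the row above it via slicing, so nothing wraps. The function name `start_points_by_slicing` is my own and not something from the question:

```python
import numpy

def start_points_by_slicing(array):
    """Mark non-zero cells whose direct upper neighbour is zero or missing."""
    nonzero = array != 0
    starts = nonzero.copy()       # top row: every non-zero cell is a start
    # Rows 1..n-1: a cell stays a start only if the cell above it is zero.
    starts[1:] &= ~nonzero[:-1]
    return starts

array = numpy.array([
    [0,  1,  0, 0,  4,  0,  6],
    [0,  0,  9, 0, 11,  0, 13],
    [0,  0, 16, 0,  0, 19, 20],
    [0,  0, 23, 0,  0, 26, 27],
    [0, 29,  0, 0,  0, 33, 34],
    [0,  0,  0, 0, 39,  0,  0],
])

print(start_points_by_slicing(array).sum())  # prints 7
```

This gives the same mask as the roll-based version; which one reads better is a matter of taste.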
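From here it would be possible to refine the vectorized logic to pick out diagonal lines and so on, but I would be tempted to iterate over each of the Trues and apply more complex, index-based logic.
xs, ys = numpy.where(potential_start_points)
for x, y in zip(xs, ys):
    # note: numpy.where returns row indices first, so x is the row and y the column
    # do more complex logic here ...
    pass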
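Since the question does not give the actual line identification rule, the following is only a sketch of what the per-start-point logic might look like. It assumes (my assumption, not the asker's) that a line continues into the row below at the first non-zero cell found straight down, then down-left, then down-right. The name `trace_line` is hypothetical.

```python
import numpy

array = numpy.array([
    [0,  1,  0, 0,  4,  0,  6],
    [0,  0,  9, 0, 11,  0, 13],
    [0,  0, 16, 0,  0, 19, 20],
    [0,  0, 23, 0,  0, 26, 27],
    [0, 29,  0, 0,  0, 33, 34],
    [0,  0,  0, 0, 39,  0,  0],
])

def trace_line(array, row, col):
    """Follow a line downwards from (row, col) and return its points.

    Assumed rule: the next point is the first non-zero cell in the row
    below, trying straight down, then down-left, then down-right.
    """
    n_rows, n_cols = array.shape
    points = [(row, col)]
    while row + 1 < n_rows:
        for next_col in (col, col - 1, col + 1):
            if 0 <= next_col < n_cols and array[row + 1, next_col] != 0:
                row, col = row + 1, next_col
                points.append((row, col))
                break
        else:
            break  # no non-zero neighbour below: the line ends here
    return points

# Same start-point mask as computed above
potential_start_points = (array != 0) & (numpy.roll(array, 1, axis=0) == 0)
potential_start_points[0, :] = (array != 0)[0, :]

rows, cols = numpy.where(potential_start_points)
for r, c in zip(rows, cols):
    print((int(r), int(c)), "->", trace_line(array, int(r), int(c)))
```

Start points that lie in the middle of a diagonal line, such as (1, 2), yield a tail of a line that was already found from the top row, so a real implementation would need to deduplicate or use a stricter start-point test.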
After all, in this case the problem has now been reduced from iterating over 6x7=42 numbers to iterating over just 7.
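If lines are allowed to step diagonally, the vectorized test can be tightened along the same lines. The sketch below assumes (again, my assumption) that a point only starts a line when none of the three cells above it (up-left, up, up-right) is non-zero. The name `diagonal_aware_start_points` is hypothetical.

```python
import numpy

def diagonal_aware_start_points(array):
    """Mark non-zero cells with no non-zero cell above-left, above or above-right."""
    nonzero = array != 0
    n_rows, n_cols = nonzero.shape
    # Pad one row on top and one column on each side with False, so that
    # cells on the border see "empty" neighbours instead of wrapping around.
    padded = numpy.pad(nonzero, ((1, 0), (1, 1)), mode="constant", constant_values=False)
    above_left = padded[:n_rows, 0:n_cols]
    above = padded[:n_rows, 1:n_cols + 1]
    above_right = padded[:n_rows, 2:n_cols + 2]
    return nonzero & ~(above_left | above | above_right)

array = numpy.array([
    [0,  1,  0, 0,  4,  0,  6],
    [0,  0,  9, 0, 11,  0, 13],
    [0,  0, 16, 0,  0, 19, 20],
    [0,  0, 23, 0,  0, 26, 27],
    [0, 29,  0, 0,  0, 33, 34],
    [0,  0,  0, 0, 39,  0,  0],
])

starts = diagonal_aware_start_points(array)
print(numpy.argwhere(starts).tolist())  # prints [[0, 1], [0, 4], [0, 6]]
```

On the example array this leaves only the three top-row points, one per line, at the cost of baking an assumption about how lines connect into the start-point test.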