python - Finding the indices of close elements in 2D arrays

Question

I have 2 NumPy ndarrays in which each row is a 3D point, and one array is much larger than the other.
For example:

array([[1., 2., 3.],
       [2.01, 5., 1.],
       [3., 3., 4.],
       [1., 4., 1.],
       [3., 6., 7.01]])

array([[3.02, 3.01, 4.0],
      [1.01, 1.99, 3.01],
      [2.98, 6.01, 7.01]])

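I also know that each point in the second array corresponds to a point in the first array.
I want to get the list of corresponding indices,
i.e. for this example it would be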

 array([2,0,4])

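because the first point in the second array is close to the third point in the first array, the second point in the second array is close to the first point in the first array, and so on.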

score 4 · Accepted Answer

You can use a KDTree.

import numpy as np
from scipy.spatial import KDTree

x = np.array([[1., 2., 3.],
       [2.01, 5., 1.],
       [3., 3., 4.],
       [1., 4., 1.],
       [3., 6., 7.01]])

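# Same points as the second array in the question, but in a different order,
# so the expected result here is [0, 2, 4] rather than [2, 0, 4].
y = np.array([[1.01, 1.99, 3.01],
       [3.02, 3.01, 4.0],
       [2.98, 6.01, 7.01]])

# query() returns (distances, indices); [1] keeps the index of the nearest row of x for each row of y
result = KDTree(x).query(y)[1]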

# In [16]: result                                                        
# Out[16]: array([0, 2, 4])
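
If some points in the second array might have no counterpart at all, the same query can flag them. Below is a minimal sketch, not part of the original answer: the cutoff of 0.1 and the extra point [10, 10, 10] are made up for illustration. With distance_upper_bound, a point that has no neighbour within the cutoff comes back with an infinite distance and an index equal to the number of points in the tree.

```python
import numpy as np
from scipy.spatial import KDTree

big = np.array([[1., 2., 3.],
                [2.01, 5., 1.],
                [3., 3., 4.],
                [1., 4., 1.],
                [3., 6., 7.01]])

# The three points from the question plus one hypothetical point that matches nothing.
small = np.array([[3.02, 3.01, 4.0],
                  [1.01, 1.99, 3.01],
                  [2.98, 6.01, 7.01],
                  [10., 10., 10.]])

tree = KDTree(big)

# Assumed cutoff: neighbours farther away than 0.1 are treated as "no match".
dist, idx = tree.query(small, distance_upper_bound=0.1)

# Unmatched points have dist == inf (and idx == tree.n), so filter on the distance.
matched = np.isfinite(dist)
print(idx[matched])   # indices into `big` for the points that found a match
print(matched)        # which rows of `small` found a match
```

Filtering on the distance rather than on the index keeps the unmatched rows from being used to index into the first array by mistake.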

Thanks to Divakar for pointing out that scipy also provides a C implementation of KDTree, called cKDTree. It is about 10 times faster in the following benchmark:

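import numpy as np
from scipy.spatial import KDTree, cKDTree

x = np.random.rand(100_000, 3)
y = np.random.rand(100, 3)

def benchmark(TreeClass):
    # Build the tree on x and return the nearest-neighbour index for each row of y
    return TreeClass(x).query(y)[1]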

In [23]: %timeit q.benchmark(KDTree)                                   
322 ms ± 7.76 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [24]: %timeit q.benchmark(cKDTree)                                  
36.5 ms ± 763 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
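
The timings above depend on the SciPy version. Since SciPy 1.6, KDTree and cKDTree are documented as functionally identical, so the gap may not reproduce on a recent install. Here is a small self-contained sketch for checking this outside IPython; it uses the standard timeit module instead of %timeit, and the fixed seed and the three repetitions are arbitrary choices.

```python
import timeit

import numpy as np
from scipy.spatial import KDTree, cKDTree

rng = np.random.default_rng(0)          # fixed seed so runs are comparable
x = rng.random((100_000, 3))
y = rng.random((100, 3))

def benchmark(tree_class):
    # Build the tree on x, then look up the nearest neighbour of each row of y.
    return tree_class(x).query(y)[1]

for tree_class in (KDTree, cKDTree):
    runs = 3                            # arbitrary, just enough to smooth out noise
    seconds = timeit.timeit(lambda: benchmark(tree_class), number=runs) / runs
    print(f"{tree_class.__name__}: {seconds * 1e3:.1f} ms per call")

# Both classes should agree on the answer, whatever their speed.
same = np.array_equal(benchmark(KDTree), benchmark(cKDTree))
print("same indices:", same)
```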

score 2 · Accepted Answer

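We can extend one of the arrays to 3D and then compare for closeness with a given tolerance (which seems to be <= 0.02 for the given sample) using np.isclose() or np.abs() < tolerance, and finally require ALL matches along the last axis and get the indices: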

In [88]: a
Out[88]: 
array([[1.  , 2.  , 3.  ],
       [2.01, 5.  , 1.  ],
       [3.  , 3.  , 4.  ],
       [1.  , 4.  , 1.  ],
       [3.  , 6.  , 7.01]])

In [89]: b
Out[89]: 
array([[3.02, 3.01, 4.  ],
       [1.01, 1.99, 3.01],
       [2.98, 6.01, 7.01]])

In [90]: r,c = np.nonzero(np.isclose(a[:,None],b, atol=0.02).all(2))

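In [91]: r[np.argsort(c)]  # r[c] only works when c is its own inverse permutation; argsort orders the rows of a by their row in b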
Out[91]: array([2, 0, 4])
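
The same idea can be wrapped in a function with the np.abs() variant. This is only a sketch: match_by_tolerance is a hypothetical name, and the tolerance of 0.05 is an assumption chosen to sit safely above the 0.02 differences in the sample (a plain `<= 0.02` test can fail on floating-point rounding, which np.isclose avoids through its relative tolerance). Putting b on the first axis makes each row of the mask correspond to one point of b, so no reordering is needed afterwards.

```python
import numpy as np

def match_by_tolerance(a, b, atol):
    """For each row of b, return the index of the single row of a within atol on every axis."""
    # Shape (len(b), len(a), 3): per-coordinate distance between every pair of points.
    diff = np.abs(b[:, None, :] - a[None, :, :])
    # Shape (len(b), len(a)): True where all three coordinates are within atol.
    mask = (diff <= atol).all(axis=-1)
    # Refuse to guess when a point of b has no match or more than one match.
    if not np.all(mask.sum(axis=1) == 1):
        raise ValueError("every row of b must match exactly one row of a")
    # Each row holds exactly one True, so argmax gives its column.
    return mask.argmax(axis=1)

a = np.array([[1., 2., 3.],
              [2.01, 5., 1.],
              [3., 3., 4.],
              [1., 4., 1.],
              [3., 6., 7.01]])

b = np.array([[3.02, 3.01, 4.0],
              [1.01, 1.99, 3.01],
              [2.98, 6.01, 7.01]])

print(match_by_tolerance(a, b, atol=0.05))
```

Note that the intermediate array has len(b) * len(a) * 3 elements, so for a very large first array the tree-based approach is likely to be the better fit.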
