python - Find all values closest to the targets, like numpy.searchsorted() but returning all identical values?

Question

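Is there a good way to find the indices of all values in a sorted array A that are closest to multiple targets? numpy.searchsorted() lets us efficiently find the indices closest to multiple targets, as described in the question on finding the nearest value and returning the index of the array in Python. However, if array A contains duplicate values, this method returns only one of the indices rather than all possible ones. Take this array, for example: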

A = array([    1. ,     2. ,     3. ,     3. ,     3.1,     4. ,    50. ,
          60. ,    70. ,    80. ,    90. ,   100.1,   110. ,   120. ,
         999. ,  1000. ])
targets=[3, 100]

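It returns idx = [2, 11], but I want it to return [[2, 3], 11]. All I can do is loop over idx to build boolean masks, such as [A==A[idx[0]], A==A[idx[1]], ...], but this can be very inefficient if the targets array is very large.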

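One option is to first use numpy.unique() to get the unique values of the array, then run searchsorted() on that unique array, which may save some time. I can then use the resulting indices to look up all identical values in A.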

Here is an example:

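import numpy as npy

def find_closest_multiTargets_inSortred(A, targets):
    # A must be sorted in ascending order and hold at least two elements
    A = npy.asarray(A)
    targets = npy.asarray(targets)
    idx = A.searchsorted(targets)
    # keep both idx-1 and idx inside the array
    idx = npy.clip(idx, 1, len(A) - 1)
    left = A[idx - 1]
    right = A[idx]
    # step back by one where the left neighbour is strictly closer
    idx -= targets - left < right - targets
    return idx

def find_closest_multiTargets_Allrepeats(A, targets):
    A = npy.asarray(A)  # with a plain list, A == value would be a scalar False
    ua = npy.unique(A)
    _uaIdxs = find_closest_multiTargets_inSortred(ua, targets)
    # one full scan of A per target
    return [npy.where(A == ua[_i]) for _i in _uaIdxs]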

>>> find_closest_multiTargets_Allrepeats(npy.array([5.1,5.5,4,1,2.3,5.1,6]),[2,5])
[(array([4]),), (array([0, 5]),)]

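I think that if len(ua) << len(A), this is much more efficient than finding the closest values directly on A. However, the npy.where step still loops over _uaIdxs, so it becomes very inefficient when that is large. If there were an alternative to unique() that returned, for each unique value in A, the list of its indices ([[indices with value ua[0]], [indices with value ua[1]], ...]), it would be more efficient: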

def find_closest_multiTargets_Allrepeats2(A,targets):
    ua,idxList=npy.unique2(A)
    _uaIdxs=find_closest_multiTargets_inSortred(ua, targets)
    return idxList[_uaIdxs]

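But I don't know whether anything does what unique2() is expected to do. Besides searchsorted, there may be completely different algorithms that achieve the same result more efficiently.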
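The hypothetical unique2() described above can be approximated with existing NumPy calls. The following is only a minimal sketch: the name unique_with_index_lists is made up for illustration, and it returns a Python list of index arrays, so the lookup has to be written as [idxList[i] for i in _uaIdxs] rather than idxList[_uaIdxs].

```python
import numpy as np

def unique_with_index_lists(A):
    """Hypothetical stand-in for unique2(): return the unique values of A
    and, for each unique value, the array of positions in A holding it."""
    A = np.asarray(A)
    # A stable argsort keeps equal values in their original relative order.
    order = np.argsort(A, kind="stable")
    # return_index gives the start of each run of equal values in sorted A.
    ua, first = np.unique(A[order], return_index=True)
    # Cutting `order` at the run starts yields one index array per unique value.
    idx_list = np.split(order, first[1:])
    return ua, idx_list

ua, idx_list = unique_with_index_lists([5.1, 5.5, 4, 1, 2.3, 5.1, 6])
```

Because the grouping is done with one sort and one split, no per-target scan of A is needed afterwards. It also works when A is not sorted.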

For simplicity, let's assume A is sorted. For an unsorted array A, we can always sort it first with argsort.
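Under that sortedness assumption, one possible approach avoids unique() entirely: equal values are contiguous in a sorted array, so two extra searchsorted calls with side="left" and side="right" bound each run of repeats. This is a sketch, not a tuned implementation; the function name closest_all_repeats is hypothetical, and A is assumed to be sorted ascending with at least two elements.

```python
import numpy as np

def closest_all_repeats(A, targets):
    """For each target, return the indices of every element of sorted A
    equal to the value closest to that target (hypothetical helper)."""
    A = np.asarray(A)
    targets = np.asarray(targets)
    # Nearest-neighbour lookup, same logic as in the question.
    idx = np.clip(A.searchsorted(targets), 1, len(A) - 1)
    left, right = A[idx - 1], A[idx]
    idx = idx - (targets - left < right - targets)
    nearest = A[idx]
    # Binary-search the first and one-past-last position of each nearest value.
    lo = A.searchsorted(nearest, side="left")
    hi = A.searchsorted(nearest, side="right")
    return [np.arange(l, h) for l, h in zip(lo, hi)]

A = np.array([1., 2., 3., 3., 3.1, 4., 50., 60., 70., 80., 90.,
              100.1, 110., 120., 999., 1000.])
result = closest_all_repeats(A, [3, 100])
```

For the array from the question this yields the index groups [2, 3] and [11]. For an unsorted array, the same idea applies to A[order] with order = np.argsort(A), mapping each slice back through order[l:h].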

Can anyone suggest a more efficient way to do this?

Thanks!

score 1 · Accepted Answer

You can do the following:

import numpy as np

a = np.array([1., 2., 3., 3., 3.1, 4., 50., 60., 70., 80., 90., 100.1, 110., 120., 999., 1000.])
t = np.array([3, 100])

Compute the pairwise distances:
```
d = np.abs(np.subtract.outer(a, t))
```
Find the closest values:
```
asort = np.argsort(d, axis=0)
```

Get the closest indices and the closest values:

ind = np.arange(a.shape[0])
print(ind[asort][0])
#array([ 2, 11], dtype=int64)

print(a[asort][0])
#array([   3. ,  100.1])

Note that if you use an index other than [0] in the last step, you get the i-th closest values instead: indexing with [i] yields the i-th closest values.
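The steps above return a single index per target. To also recover the repeated values the question asks about, the same distance matrix can be compared against its column-wise minimum. This is a sketch that extends the answer rather than part of it, and the names is_closest and all_closest are illustrative. Note that the matrix has len(a) * len(t) entries, so memory grows quickly with many targets.

```python
import numpy as np

a = np.array([1., 2., 3., 3., 3.1, 4., 50., 60., 70., 80., 90.,
              100.1, 110., 120., 999., 1000.])
t = np.array([3, 100])

# Pairwise distances, shape (len(a), len(t)).
d = np.abs(np.subtract.outer(a, t))
# True wherever an element attains the minimum distance for its target,
# so ties (repeated values) are all marked.
is_closest = d == d.min(axis=0)
# One index array per target (columns of the mask).
all_closest = [np.flatnonzero(col) for col in is_closest.T]
```

Unlike the searchsorted-based route, this does not require a to be sorted.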

score 0 · Accepted Answer

numpy.in1d(A, idx) will do what you want.
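One way to read this suggestion, sketched below under assumptions: membership has to be tested against the values A[idx] rather than the indices themselves, and the result is a single mask over A that does not say which target each hit belongs to. The sketch uses np.isin, the newer spelling of in1d for this purpose; idx is taken to be the searchsorted-based result [2, 11] from the question.

```python
import numpy as np

A = np.array([1., 2., 3., 3., 3.1, 4., 50., 60., 70., 80., 90.,
              100.1, 110., 120., 999., 1000.])
idx = np.array([2, 11])  # nearest index per target, as computed in the question

# True for every element of A whose value equals one of the nearest values.
mask = np.isin(A, A[idx])
hits = np.flatnonzero(mask)
```

Here hits contains the positions 2, 3 and 11 merged into one array, so a per-target grouping would still need a further step.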

answered 2014-07-01T20:57:31.397
