python - How to remove specific elements from a numpy array

Question

How can I remove specific elements from a numpy array? Say I have

import numpy as np

a = np.array([1,2,3,4,5,6,7,8,9])

I then want to remove 3, 4, 7 from a. All I know is the indices of the values (index=[2,3,6]).

score 383 · Accepted Answer

Use numpy.delete(), which returns a new array with the sub-arrays along an axis deleted:

numpy.delete(a, index)

For your specific question:

import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
index = [2, 3, 6]

new_a = np.delete(a, index)

print(new_a)  # prints [1 2 5 6 8 9]

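Note that numpy.delete() returns a new array rather than modifying a in place. A numpy array's size is fixed once it is created, so, much like with strings in Python, every "removal" creates a new object. To quote the delete() docs:

"A copy of arr with the elements specified by obj removed. Note that delete does not occur in-place..."

If the code I post has output, it is the result of running the code.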
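To make the "new array" point concrete, here is a minimal sketch. The 2-D array m and the variable names are assumptions added for illustration and are not part of the question; the axis argument itself belongs to np.delete.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
new_a = np.delete(a, [2, 3, 6])

# np.delete leaves its input untouched and returns a separate array.
print(a.size, new_a.size)  # 9 6

# Hypothetical 2-D example: axis selects what gets removed.
m = np.arange(12).reshape(3, 4)
no_row_1 = np.delete(m, 1, axis=0)  # drops the second row, shape (2, 4)
no_col_0 = np.delete(m, 0, axis=1)  # drops the first column, shape (3, 3)
flat = np.delete(m, [0, 1])         # no axis: m is flattened first, shape (10,)
```

To get the effect of an in-place delete, rebind the name: a = np.delete(a, index).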

score 95 · Accepted Answer

There is a numpy built-in function that can help with this. Note that it removes by value, not by index.

>>> import numpy as np
>>> a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> b = np.array([3,4,7])
>>> c = np.setdiff1d(a,b)
>>> c
array([1, 2, 5, 6, 8, 9])
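One caveat worth checking before relying on this: np.setdiff1d returns the sorted, unique values, so it can reorder the data and drop duplicates. The sketch below uses a made-up input (an assumption, not from the question) to compare it with a boolean mask that preserves order.

```python
import numpy as np

# Hypothetical input with duplicates and no particular order.
a = np.array([5, 1, 5, 3, 2])
b = np.array([3])

by_setdiff = np.setdiff1d(a, b)  # sorted unique values of a that are not in b
by_mask = a[~np.isin(a, b)]      # original order and duplicates preserved

print(by_setdiff.tolist())  # [1, 2, 5]
print(by_mask.tolist())     # [5, 1, 5, 2]
```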

score 44 · Accepted Answer

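Numpy arrays have a fixed size, which means you technically cannot delete an item from one. However, you can construct a new array without the values you don't want, like this: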

b = np.delete(a, [2,3,6])

score 30 · Accepted Answer

Delete by value:

modified_array = np.delete(original_array, np.where(original_array == value_to_delete))
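A runnable version of the one-liner, as a sketch with made-up sample data. It also shows an equivalent boolean mask, which is a different technique from np.delete but gives the same result here.

```python
import numpy as np

original_array = np.array([1, 3, 2, 3, 4])  # hypothetical sample data
value_to_delete = 3

# np.where with a single argument returns a tuple of index arrays; [0] picks the 1-D one.
positions = np.where(original_array == value_to_delete)[0]
via_delete = np.delete(original_array, positions)

# Boolean-mask equivalent: keep everything that differs from the value.
via_mask = original_array[original_array != value_to_delete]

print(via_delete.tolist())  # [1, 2, 4]
print(via_mask.tolist())    # [1, 2, 4]
```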

score 9 · Accepted Answer

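Using np.delete is the fastest way to do it if we know the indices of the elements we want to remove. However, for completeness, let me add another way of "removing" array elements: a boolean mask built with the help of np.isin. This method lets us remove elements either by specifying them directly or by their indices: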

import numpy as np
a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])

Remove by indices:

indices_to_remove = [2, 3, 6]
a = a[~np.isin(np.arange(a.size), indices_to_remove)]

Remove by elements (don't forget to recreate the original a first, since it was overwritten by the previous line):

elements_to_remove = a[indices_to_remove]  # [3, 4, 7]
a = a[~np.isin(a, elements_to_remove)]
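Because both snippets overwrite a, it can be easier to wrap the idea in small functions that return a copy. The helper names below are hypothetical, and the index version uses a np.ones mask rather than np.isin over np.arange; it is a sketch of the same masking idea, not the answer's exact code.

```python
import numpy as np

def remove_by_index(arr, indices):
    """Return a copy of 1-D arr without the given positions (hypothetical helper)."""
    keep = np.ones(arr.size, dtype=bool)
    keep[indices] = False
    return arr[keep]

def remove_by_value(arr, values):
    """Return a copy of 1-D arr without any element found in values (hypothetical helper)."""
    return arr[~np.isin(arr, values)]

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
print(remove_by_index(a, [2, 3, 6]).tolist())  # [1, 2, 5, 6, 8, 9]
print(remove_by_value(a, [3, 4, 7]).tolist())  # [1, 2, 5, 6, 8, 9]
```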

score 5 · Accepted Answer

I'm not a numpy person, but I took a shot at it:

>>> import numpy as np
>>> import itertools
>>> 
>>> a = np.array([1,2,3,4,5,6,7,8,9])
>>> index=[2,3,6]
>>> a = np.array(list(itertools.compress(a, [i not in index for i in range(len(a))])))
>>> a
array([1, 2, 5, 6, 8, 9])

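According to my tests, this outperforms numpy.delete(). I don't know why that would be the case; maybe it is due to the small size of the initial array?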

python -m timeit -s "import numpy as np" -s "import itertools" -s "a = np.array([1,2,3,4,5,6,7,8,9])" -s "index=[2,3,6]" "a = np.array(list(itertools.compress(a, [i not in index for i in range(len(a))])))"
100000 loops, best of 3: 12.9 usec per loop

python -m timeit -s "import numpy as np" -s "a = np.array([1,2,3,4,5,6,7,8,9])" -s "index=[2,3,6]" "np.delete(a, index)"
10000 loops, best of 3: 108 usec per loop

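That's a pretty significant difference (in the opposite direction to what I expected). Does anyone have any idea why this would be the case?

Even more weirdly, passing numpy.delete() a list performs worse than looping through the list and giving it single indices.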

python -m timeit -s "import numpy as np" -s "a = np.array([1,2,3,4,5,6,7,8,9])" -s "index=[2,3,6]" "for i in index:" "    np.delete(a, i)"
10000 loops, best of 3: 33.8 usec per loop

Edit: It does appear to be related to the size of the array. With large arrays, numpy.delete() is significantly faster.

python -m timeit -s "import numpy as np" -s "import itertools" -s "a = np.array(list(range(10000)))" -s "index=[i for i in range(10000) if i % 2 == 0]" "a = np.array(list(itertools.compress(a, [i not in index for i in range(len(a))])))"
10 loops, best of 3: 200 msec per loop

python -m timeit -s "import numpy as np" -s "a = np.array(list(range(10000)))" -s "index=[i for i in range(10000) if i % 2 == 0]" "np.delete(a, index)"
1000 loops, best of 3: 1.68 msec per loop

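Obviously, this is all pretty irrelevant, as you should always go for clarity and avoid reinventing the wheel, but I found it a little interesting, so I thought I'd leave it here.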
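The comparison can also be rerun from a script. This is a rough sketch using the standard timeit module; the function names, array sizes and repeat count are arbitrary assumptions, and the timings will vary by machine and NumPy version.

```python
import itertools
import timeit

import numpy as np

def with_compress(a, index):
    # Same approach as above: build a keep/drop selector in pure Python.
    return np.array(list(itertools.compress(a, [i not in index for i in range(len(a))])))

def with_delete(a, index):
    return np.delete(a, index)

for n in (9, 2000):  # sizes chosen for illustration only
    a = np.arange(n)
    index = list(range(0, n, 3))
    # Both approaches must agree before their timings are worth comparing.
    assert np.array_equal(with_compress(a, index), with_delete(a, index))
    t_compress = timeit.timeit(lambda: with_compress(a, index), number=20)
    t_delete = timeit.timeit(lambda: with_delete(a, index), number=20)
    print(f"n={n}: compress {t_compress:.5f}s, delete {t_delete:.5f}s for 20 runs")
```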

score 2 · Accepted Answer

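If you don't have the indices of the elements you want to remove, you can use the in1d function provided by numpy.

The function returns True if an element of a 1-D array is also present in a second array. To remove elements, you just have to negate the values returned by this function.

Note that this method keeps the order of the original array.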

In [1]: import numpy as np

        a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
        rm = np.array([3, 4, 7])
        # np.in1d returns True where an element of `a` is in `rm`
        idx = np.in1d(a, rm)
        idx

Out[1]: array([False, False,  True,  True, False, False,  True, False, False])

In [2]: # Since we want the opposite of what `in1d` gives us, 
        # you just have to negate the returned value
        a[~idx]

Out[2]: array([1, 2, 5, 6, 8, 9])
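As far as I know, np.in1d is deprecated as of NumPy 2.0 in favor of np.isin, so on recent versions the same idea is probably better written as below. This is a sketch; the invert flag is an np.isin parameter that returns the negated mask directly.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
rm = np.array([3, 4, 7])

# invert=True returns the negated membership mask directly.
keep = np.isin(a, rm, invert=True)
print(a[keep].tolist())  # [1, 2, 5, 6, 8, 9]
```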

score 1 · Accepted Answer

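If you don't know the indices, you can use logical_and to select elements by value. The example below keeps only the values strictly between low and high:

import numpy as np
x = 10*np.random.randn(1,100)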
low = 5
high = 27
x[0,np.logical_and(x[0,:]>low,x[0,:]<high)]
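Since the question is about removing elements, it may help to see both directions of the mask. The sketch below uses a small fixed 1-D array (an assumption, chosen so the result is reproducible) instead of random data.

```python
import numpy as np

x = np.array([1.0, 6.0, 12.5, 27.0, 30.0, 5.0])  # hypothetical data
low = 5
high = 27

inside = np.logical_and(x > low, x < high)  # True for 6.0 and 12.5

kept_inside = x[inside]      # select values strictly between low and high
removed_inside = x[~inside]  # remove those values instead

print(kept_inside.tolist())     # [6.0, 12.5]
print(removed_inside.tolist())  # [1.0, 27.0, 30.0, 5.0]
```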

score 1 · Accepted Answer

Remove specific indices (I removed 16 and 21 from the matrix):

import numpy as np
mat = np.arange(12,26)
a = [4,9]
del_map = np.delete(mat, a)
del_map.reshape(3,4)

Output:

array([[12, 13, 14, 15],
       [17, 18, 19, 20],
       [22, 23, 24, 25]])

score 1 · Accepted Answer

A list comprehension could be an interesting approach as well.

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
index = np.array([2, 3, 6]) #index is changed to an array.  
out = [val for i, val in enumerate(a) if all(i != index)]
# out is a plain Python list holding the values 1, 2, 5, 6, 8, 9
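A variation on the same idea, as a sketch: keeping the indices in a Python set makes each membership test cheap, and wrapping the comprehension in np.array returns an array rather than a list. Using a set here is my substitution, not part of the answer above.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
index = {2, 3, 6}  # a set gives constant-time membership tests

# Wrap the comprehension in np.array to get an array back instead of a list.
out = np.array([val for i, val in enumerate(a) if i not in index])
print(out.tolist())  # [1, 2, 5, 6, 8, 9]
```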

score 0 · Accepted Answer

You can also use sets:

import numpy

a = numpy.array([10, 20, 30, 40, 50, 60, 70, 80, 90])
the_index_list = [2, 3, 6]

the_big_set = set(numpy.arange(len(a)))
the_small_set = set(the_index_list)
# Sets are unordered, so sort the remaining indices to keep the original order.
the_delta_row_list = sorted(the_big_set - the_small_set)

a = a[the_delta_row_list]
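The same set difference can be done on the index arrays with np.setdiff1d, which returns its result sorted. This is a sketch of an alternative, not the answer's own code; the name keep is an assumption.

```python
import numpy as np

a = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90])
the_index_list = [2, 3, 6]

# np.setdiff1d returns the sorted indices that are not in the_index_list.
keep = np.setdiff1d(np.arange(a.size), the_index_list)
print(a[keep].tolist())  # [10, 20, 50, 60, 80, 90]
```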

score -1 · Accepted Answer

Filter out the parts you don't need:

import numpy as np
a = np.array([1,2,3,4,5,6,7,8,9])
a = a[(a!=3)&(a!=4)&(a!=7)]

If you have a list of indices to be removed:

to_be_removed_inds = [2,3,6]
a = np.array([1,2,3,4,5,6,7,8,9])
a = a[[x for x in range(len(a)) if x not in to_be_removed_inds]]
