python - List of all arrays not contained in another list of arrays

Question

I have a list of 2D points, each represented as a two-element list/array. For example:

points = 
      [[ 10.       ,  10.       ],
       [ 11.       ,  10.       ],
       [ 10.5      ,   9.1339746],
       [ 10.5      ,  10.       ],
       [ 10.75     ,   9.5669873],
       [ 10.25     ,   9.5669873],
       [  2.       ,   2.       ],
       [  3.       ,   2.       ],
       [  2.5      ,   1.1339746],
       [  2.5      ,   2.       ],
       [  2.75     ,   1.5669873],
       [  2.25     ,   1.5669873]]

I now want a list that leaves out certain elements of the first list:

exclude = [[2., 2.], [3., 2.], [2.5, 2.]]

Unfortunately,

new_list = [p for p in points if p not in exclude]

yields

[[ 10.       ,  10.       ],
 [ 11.       ,  10.       ],
 [ 10.5      ,   9.1339746],
 [ 10.5      ,  10.       ],
 [ 10.75     ,   9.5669873],
 [ 10.25     ,   9.5669873],
 [  2.75     ,   1.5669873],
 [  2.25     ,   1.5669873]]

instead of

[[ 10.       ,  10.       ],
 [ 11.       ,  10.       ],
 [ 10.5      ,   9.1339746],
 [ 10.5      ,  10.       ],
 [ 10.75     ,   9.5669873],
 [ 10.25     ,   9.5669873],
 [  2.5      ,   1.1339746],
 [  2.75     ,   1.5669873],
 [  2.25     ,   1.5669873]]

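It seems Python removes every point that has at least one coordinate in common with an excluded point (rather than only those with all coordinates in common :/ ).

Is there a good/short/elegant way to exclude only the points that match an entry of exclude in full?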
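The observed result is consistent with how NumPy handles the `in` operator, assuming `exclude` was itself a NumPy array (the question does not say so explicitly). For an ndarray, `p in exclude` behaves like `(exclude == p).any()`, so a single matching coordinate counts as a hit. A minimal sketch of the difference, using one point from the question:

```python
import numpy as np

# Assumption: exclude is an ndarray here, which the question does not state
exclude = np.array([[2., 2.], [3., 2.], [2.5, 2.]])
p = np.array([2.5, 1.1339746])  # shares only its first coordinate with [2.5, 2.]

# ndarray membership behaves like (exclude == p).any(): one matching coordinate is enough
element_hit = p in exclude

# Requiring every coordinate of some row to match gives the intended row-wise test
row_hit = bool((exclude == p).all(axis=1).any())

print(element_hit, row_hit)  # True False
```

This would explain why `[2.5, 1.1339746]` disappears from the result even though it is not in `exclude`.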

score 2 · Accepted Answer

Note: since this question is tagged numpy, I'll assume points is a NumPy array. If that is the case, you can build a boolean mask (array) with np.logical_and and np.logical_or:

import numpy as np

points = np.array(
      [[ 10.       ,  10.       ],
       [ 11.       ,  10.       ],
       [ 10.5      ,   9.1339746],
       [ 10.5      ,  10.       ],
       [ 10.75     ,   9.5669873],
       [ 10.25     ,   9.5669873],
       [  2.       ,   2.       ],
       [  3.       ,   2.       ],
       [  2.5      ,   1.1339746],
       [  2.5      ,   2.       ],
       [  2.75     ,   1.5669873],
       [  2.25     ,   1.5669873]])

exclude = [[2., 2.], [3., 2.], [2.5, 2.]]

mask = np.logical_or.reduce(
    [np.logical_and.reduce(
        [points[:,idx] == ex[idx] for idx in range(len(ex))]) for ex in exclude])

new_points = points[~mask]
print(new_points)

prints

[[ 10.         10.       ]
 [ 11.         10.       ]
 [ 10.5         9.1339746]
 [ 10.5        10.       ]
 [ 10.75        9.5669873]
 [ 10.25        9.5669873]
 [  2.5         1.1339746]
 [  2.75        1.5669873]
 [  2.25        1.5669873]]
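The same mask can also be written with broadcasting instead of nested list comprehensions. The sketch below is an alternative formulation, not the answer's own code, and uses a shortened copy of the data for brevity. It compares every point with every excluded row at once, so it allocates a temporary boolean array of shape `(len(points), len(exclude), 2)`.

```python
import numpy as np

points = np.array([[10., 10.], [2., 2.], [2.5, 1.1339746], [2.5, 2.]])
exclude = np.array([[2., 2.], [3., 2.], [2.5, 2.]])

# Shape (n_points, n_exclude, 2): compare every point with every excluded row
eq = points[:, None, :] == exclude[None, :, :]

# A point is excluded if all of its coordinates match at least one excluded row
mask = eq.all(axis=2).any(axis=1)

new_points = points[~mask]
print(new_points)
```

Both versions rely on exact floating-point equality. If the coordinates come from computations rather than literals, replacing `==` with `np.isclose` may be the safer choice.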

score 0 · Accepted Answer

You can also view the 2D array as a 1D array and then use np.in1d.

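# Using @unutbu's arrays.
import numpy as np

exclude = np.array(exclude)  # view_1d needs an ndarray, not a list

def view_1d(arr):
    # View each row as one opaque void element so rows compare as a whole
    return arr.view(np.dtype((np.void, arr.dtype.itemsize * arr.shape[1])))

points_1d = view_1d(points)
exclude_1d = view_1d(exclude)

print(points[~np.in1d(points_1d, exclude_1d)])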

[[ 10.         10.       ]
 [ 11.         10.       ]
 [ 10.5         9.1339746]
 [ 10.5        10.       ]
 [ 10.75        9.5669873]
 [ 10.25        9.5669873]
 [  2.5         1.1339746]
 [  2.75        1.5669873]
 [  2.25        1.5669873]]

Just to double-check that the trick works, plus some rough timings:

import time

# Array dimensions must be integers, so use 10**6 rather than 1E6
points = np.random.rand(10**6, 2)
points = np.around(points, 1)

exclude = np.random.rand(10**2, 2)
exclude = np.around(exclude, 1)


t = time.time()
mask1 = ~(np.in1d(view_1d(points), view_1d(exclude)))

print(time.time() - t)
# 0.469238996506

t = time.time()
mask2 = ~np.logical_or.reduce(
    [np.logical_and.reduce(
        [points[:, idx] == ex[idx] for idx in range(len(ex))]) for ex in exclude])

print(time.time() - t)
# 7.13628792763

# Just to check this is doing what I think it's doing.
print(np.all(mask1 == mask2))
# True

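The timings cover only the mask generation. Both methods appear to scale similarly; I just used large arrays to (hopefully) make up for not using timeit.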
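For reference, a measurement with `timeit` could look like the sketch below. It times only the `logical_or`/`logical_and` mask, on smaller arrays than those used above so that it finishes quickly. The sizes, the seed, and the helper name `reduce_mask` are arbitrary choices made for illustration.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)  # seeded so runs are repeatable
points = np.around(rng.random((10**5, 2)), 1)
exclude = np.around(rng.random((10**2, 2)), 1)

def reduce_mask():
    # Hypothetical helper wrapping the logical_or/logical_and mask construction
    return ~np.logical_or.reduce(
        [np.logical_and.reduce(
            [points[:, idx] == ex[idx] for idx in range(len(ex))]) for ex in exclude])

# 3 repeats of 3 calls each; the minimum is usually the least noisy figure
best = min(timeit.repeat(reduce_mask, number=3, repeat=3)) / 3
print(f"{best:.4f} s per call")
```

Taking the minimum over several repeats reduces the influence of other processes on the reported time.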

score -1 · Accepted Answer

The code you posted is not syntactically valid.
Once corrected, it produces the result you want:

[[10.0, 10.0],
 [11.0, 10.0],
 [10.5, 9.1339746],
 [10.5, 10.0],
 [10.75, 9.5669873],
 [10.25, 9.5669873],
 [2.5, 1.1339746],
 [2.75, 1.5669873],
 [2.25, 1.5669873]]

See: http://ideone.com/7LOpa6

points =  [[ 10.       ,  10.       ],
       [ 11.       ,  10.       ],
       [ 10.5      ,   9.1339746],
       [ 10.5      ,  10.       ],
       [ 10.75     ,   9.5669873],
       [ 10.25     ,   9.5669873],
       [  2.       ,   2.       ],
       [  3.       ,   2.       ],
       [  2.5      ,   1.1339746],
       [  2.5      ,   2.       ],
       [  2.75     ,   1.5669873],
       [  2.25     ,   1.5669873]]
exclude = [[2., 2.], [3., 2.], [2.5, 2.]]

print([p for p in points if p not in exclude])
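With plain Python lists, `p not in exclude` compares whole sublists with `==`, which is why the comprehension above gives the expected result. If `exclude` grows large, one possible variant (a sketch, not part of the answer) converts it to a set of tuples so that each lookup no longer scans the whole list. The data below is a shortened copy of the original.

```python
points = [[10.0, 10.0], [2.0, 2.0], [2.5, 1.1339746], [2.5, 2.0]]
exclude = [[2., 2.], [3., 2.], [2.5, 2.]]

# Tuples are hashable, so membership tests against a set take constant time on average
excluded = {tuple(e) for e in exclude}

new_list = [p for p in points if tuple(p) not in excluded]
print(new_list)  # [[10.0, 10.0], [2.5, 1.1339746]]
```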
