python - Testing for membership in a 2D numpy array

Question

I have two 2D arrays of the same size

a = array([[1,2],[3,4],[5,6]])
b = array([[1,2],[3,4],[7,8]])

I want to know which rows of b are in a.

So the output should be:

array([ True,  True, False], dtype=bool)

Without doing:

array([any(i == a) for i in b])

because a and b are huge.

There is a function that does this, but only for 1D arrays: in1d
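To illustrate why a 1D membership test is not enough on its own, here is a small sketch. It uses np.isin, the shape-preserving counterpart of in1d (my choice for the illustration, not something the question uses), and a variant of b whose first row is reversed, which is a made-up example value.

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])
# Hypothetical variant of b: first row reversed, so it is NOT a row of a.
b = np.array([[2, 1], [3, 4], [7, 8]])

# 1D membership works as expected.
flat_hits = np.isin(np.array([1, 2, 7]), np.array([1, 2, 3]))

# On 2D input the test is per element, so the row structure is lost:
# [2, 1] is flagged because 2 and 1 both occur somewhere in a.
elementwise = np.isin(b, a).all(axis=1)

# Row-wise ground truth, using a slow Python loop over the rows of b.
rowwise = np.array([any((row == a).all(axis=1)) for row in b])

print(flat_hits, elementwise, rowwise)
```

The element-wise result marks the reversed row as present, while the row-wise loop does not, which is the gap the answers try to close without a Python loop.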

score 13 · Accepted Answer

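What we'd really like to do is use np.in1d... except that np.in1d only works with 1-dimensional arrays. Our arrays are multi-dimensional. However, we can view each array as a 1-dimensional array of strings: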

arr.view(np.dtype((np.void, arr.dtype.itemsize * arr.shape[-1])))

For example,

In [15]: arr = np.array([[1, 2], [2, 3], [1, 3]])

In [16]: arr = arr.view(np.dtype((np.void, arr.dtype.itemsize * arr.shape[-1])))

In [30]: arr.dtype
Out[30]: dtype('V16')

In [31]: arr.shape
Out[31]: (3, 1)

In [37]: arr
Out[37]: 
array([[b'\x01\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00'],
       [b'\x02\x00\x00\x00\x00\x00\x00\x00\x03\x00\x00\x00\x00\x00\x00\x00'],
       [b'\x01\x00\x00\x00\x00\x00\x00\x00\x03\x00\x00\x00\x00\x00\x00\x00']],
      dtype='|V16')
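A minimal sketch of the same view, checking the properties the rest of the answer relies on. The int64 dtype and the repeated row are assumptions made for the example so that the byte width is fixed and there is a pair of equal rows to compare.

```python
import numpy as np

# int64 is set explicitly so the byte width is the same on every platform.
arr = np.array([[1, 2], [2, 3], [1, 2]], dtype=np.int64)

# One void item spans a full row: itemsize * number of columns bytes.
row_bytes = arr.dtype.itemsize * arr.shape[-1]
as_void = arr.view(np.dtype((np.void, row_bytes)))

# Rows 0 and 2 hold the same values, so their raw bytes are identical.
same = as_void[0, 0].tobytes() == as_void[2, 0].tobytes()
# Rows 0 and 1 differ, so their raw bytes differ too.
different = as_void[0, 0].tobytes() != as_void[1, 0].tobytes()

print(as_void.shape, as_void.dtype.itemsize, same, different)
```

Because the view shares memory with arr, no row data is copied; only the interpretation of the bytes changes.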

This makes each row of arr a string. Now it's just a matter of hooking that up to np.in1d:

import numpy as np

def asvoid(arr):
    """
    Based on http://stackoverflow.com/a/16973510/190597 (Jaime, 2013-06)
    View the array as dtype np.void (bytes). The items along the last axis are
    viewed as one value. This allows comparisons to be performed on the entire row.
    """
    arr = np.ascontiguousarray(arr)
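    if np.issubdtype(arr.dtype, np.floating):
        # Care needs to be taken here since
        # np.array([-0.]).view(np.void) != np.array([0.]).view(np.void)
        # Adding 0. converts -0. to 0. A new array is built rather than using
        # `arr += 0.`, which could modify the caller's array in place.
        arr = arr + 0.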
    return arr.view(np.dtype((np.void, arr.dtype.itemsize * arr.shape[-1])))


def inNd(a, b, assume_unique=False):
    a = asvoid(a)
    b = asvoid(b)
    return np.in1d(a, b, assume_unique)


tests = [
    (np.array([[1, 2], [2, 3], [1, 3]]),
     np.array([[2, 2], [3, 3], [4, 4]]),
     np.array([False, False, False])),
    (np.array([[1, 2], [2, 2], [1, 3]]),
     np.array([[2, 2], [3, 3], [4, 4]]),
     np.array([True, False, False])),
    (np.array([[1, 2], [3, 4], [5, 6]]),
     np.array([[1, 2], [3, 4], [7, 8]]),
     np.array([True, True, False])),
    (np.array([[1, 2], [5, 6], [3, 4]]),
     np.array([[1, 2], [5, 6], [7, 8]]),
     np.array([True, True, False])),
    (np.array([[-0.5, 2.5, -2, 100, 2], [5, 6, 7, 8, 9], [3, 4, 5, 6, 7]]),
     np.array([[1.0, 2, 3, 4, 5], [5, 6, 7, 8, 9], [-0.5, 2.5, -2, 100, 2]]),
     np.array([False, True, True]))
]

for a, b, answer in tests:
    result = inNd(b, a)
    try:
        assert np.all(answer == result)
    except AssertionError:
        print('''\
a:
{a}
b:
{b}

answer: {answer}
result: {result}'''.format(**locals()))
        raise
else:
    print('Success!')

yields

Success!
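The comment in asvoid about -0. can be checked directly. This is a small sketch of the issue, not part of the answer's code: the two zeros compare equal as floats but differ in their sign bit, so a byte-level comparison treats them as different values until 0. is added.

```python
import numpy as np

neg_zero = np.array([-0.0])
pos_zero = np.array([0.0])

# Numerically equal ...
equal_as_floats = bool((neg_zero == pos_zero).all())
# ... but the sign bit differs, so a byte-level comparison sees two values.
equal_as_bytes = neg_zero.tobytes() == pos_zero.tobytes()

# Adding 0. yields +0.0, after which the bytes match.
normalized = neg_zero + 0.0
equal_after_fix = normalized.tobytes() == pos_zero.tobytes()

print(equal_as_floats, equal_as_bytes, equal_after_fix)
```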

score 4 · Accepted Answer

In [1]: import numpy as np

In [2]: a = np.array([[1,2],[3,4]])

In [3]: b = np.array([[3,4],[1,2]])

In [5]: a = a[a[:,1].argsort(kind='mergesort')]

In [6]: a = a[a[:,0].argsort(kind='mergesort')]

In [7]: b = b[b[:,1].argsort(kind='mergesort')]

In [8]: b = b[b[:,0].argsort(kind='mergesort')]

In [9]: bInA1 = b[:,0] == a[:,0]

In [10]: bInA2 = b[:,1] == a[:,1]

In [11]: bInA = bInA1*bInA2

In [12]: bInA
Out[12]: array([ True,  True], dtype=bool)

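That should do it... though I'm not sure whether this is still efficient. You need to use mergesort, because the other sorting methods are not stable.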
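As a side note, the two stable argsort passes can probably be replaced by a single np.lexsort call. The sketch below, with a made-up three-row array, checks that both give the same row order; it only covers the sorting step, not the comparison that follows.

```python
import numpy as np

a = np.array([[3, 4], [1, 2], [1, 0]])

# Two stable sorts: secondary key (column 1) first, then primary key (column 0).
two_pass = a[a[:, 1].argsort(kind='mergesort')]
two_pass = two_pass[two_pass[:, 0].argsort(kind='mergesort')]

# np.lexsort treats the LAST key in the tuple as the primary one.
one_pass = a[np.lexsort((a[:, 1], a[:, 0]))]

print(one_pass)
```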

Edit:

If you have more than 2 columns and the rows are already sorted, you can do

In [24]: bInA = np.array([True,]*a.shape[0])

In [25]: bInA
Out[25]: array([ True,  True], dtype=bool)

In [26]: for k in range(a.shape[1]):
   ....:     bInAk = b[:,k] == a[:,k]
   ....:     bInA = bInAk*bInA
   ....:     

In [27]: bInA
Out[27]: array([ True,  True], dtype=bool)

There is still room for a speed-up: in each iteration you don't have to check the entire column, only the entries where the current bInA is True.
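One way that speed-up might look is sketched below. The arrays are made-up example data, and, like the loop above, the code assumes a and b have the same shape and compares row i of b only with row i of a.

```python
import numpy as np

# Example data (made up): same shape, rows already sorted.
a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
b = np.array([[1, 2, 3], [4, 0, 6], [7, 8, 0]])

bInA = np.ones(a.shape[0], dtype=bool)
for k in range(a.shape[1]):
    # Only rows that have matched on every column so far need checking.
    candidates = np.flatnonzero(bInA)
    if candidates.size == 0:
        break  # nothing left that could still match
    bInA[candidates] = b[candidates, k] == a[candidates, k]

print(bInA)
```

Whether this is faster in practice depends on how quickly rows drop out; the fancy indexing has its own cost when most rows keep matching.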

score 2 · Accepted Answer

If you have something like a=np.array([[1,2],[3,4],[5,6]]) and b=np.array([[5,6],[1,2],[7,6]]), you can convert them into complex 1D arrays:

c=a[:,0]+a[:,1]*1j
d=b[:,0]+b[:,1]*1j

The whole thing in my interpreter looks like this:

>>> c=a[:,0]+a[:,1]*1j
>>> c
array([ 1.+2.j,  3.+4.j,  5.+6.j])
>>> d=b[:,0]+b[:,1]*1j
>>> d
array([ 5.+6.j,  1.+2.j,  7.+6.j])

Now that you only have 1D arrays, you can simply call np.in1d(c,d), and Python will give you:

>>> np.in1d(c,d)
array([ True, False,  True], dtype=bool)

With this you don't need any loops, at least for this kind of data (two columns of numbers).
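Wrapped into a function, the idea could look like the sketch below. The helper name rows_in_via_complex is hypothetical, and np.isin is used in place of np.in1d as an assumption about which function is available in current NumPy. Since the columns end up as floating-point parts of a complex number, integers larger than 2**53 would lose precision, so this is only suitable for two columns of moderately sized values.

```python
import numpy as np

def rows_in_via_complex(x, y):
    """Hypothetical helper: for two-column arrays, mark rows of x that occur in y."""
    # Pack (col0, col1) into one complex number: col0 + col1*1j.
    xc = x[:, 0] + x[:, 1] * 1j
    yc = y[:, 0] + y[:, 1] * 1j
    return np.isin(xc, yc)

a = np.array([[1, 2], [3, 4], [5, 6]])
b = np.array([[5, 6], [1, 2], [7, 6]])
mask = rows_in_via_complex(a, b)
print(mask)
```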

score 0 · Accepted Answer

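The numpy module can actually broadcast over your arrays and tell you which parts are the same as the others, returning True where they are and False where they aren't: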

import numpy as np
a = np.array(([1,2],[3,4],[5,6])) #converting to a numpy array
b = np.array(([1,2],[3,4],[7,8])) #converting to a numpy array
new_array = a == b #creating a new boolean array from comparing a and b

Now new_array looks like this:

[[ True  True]
 [ True  True]
 [False False]]

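But that's not what you want. So you can transpose the array (swap x and y) and then combine its two rows with the & operator. This creates a 1D array that is True only where both columns of a row are True: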

new_array = new_array.T #transposing
result = new_array[0] & new_array[1] #comparing rows

When you print result, you now get what you are looking for:

[ True  True False]
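Note that a == b only compares rows at the same position. A possible extension, sketched here with a hypothetical helper name and made-up data, compares every row of b with every row of a by adding an axis before broadcasting. It builds an intermediate of shape (len(b), len(a), ncols), so it is probably only practical for small arrays, not the large ones mentioned in the question.

```python
import numpy as np

def rows_in_bruteforce(b, a):
    """Hypothetical helper: True where a row of b equals some row of a."""
    # Shape (len(b), len(a), ncols): every row of b against every row of a.
    pairwise = b[:, None, :] == a[None, :, :]
    # A pair matches if all columns agree; a row of b is "in a" if any pair matches.
    return pairwise.all(axis=-1).any(axis=-1)

a = np.array([[1, 2], [3, 4], [5, 6]])
b = np.array([[3, 4], [7, 8], [1, 2]])
mask = rows_in_bruteforce(b, a)
print(mask)
```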
