python - How to find the unique vectors of a 2D array along a particular axis in a vectorized way?

Question

I have an array of shape (n, t) that I would like to treat as a series of n-vectors.

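I would like to know the unique n-vector values that exist along the t-dimension, as well as the associated t-indices for each unique vector. I am happy to use any reasonable definition of equality (for example, numpy.unique will accept floats).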

This is easy with a Python loop over t, but I am hoping for a vectorized approach.
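For reference, the loop-based baseline could look roughly like the sketch below. The function name `unique_columns_loop` and the sample array are made up for illustration; equality here is exact element-wise equality of the columns.

```python
import numpy as np

def unique_columns_loop(a):
    """Map each distinct column of `a` (shape (n, t)) to the t-indices where it occurs."""
    groups = {}
    for j in range(a.shape[1]):
        key = tuple(a[:, j].tolist())        # hashable copy of the j-th n-vector
        groups.setdefault(key, []).append(j)
    return groups

# Hypothetical sample data: n = 3, t = 4.
a = np.array([[1, 2, 1, 3],
              [2, 0, 2, 2],
              [3, 0, 3, 2]])
groups = unique_columns_loop(a)
for vec, t_indices in groups.items():
    print(vec, t_indices)
```

This does one Python-level iteration per t-index, which is the cost a vectorized approach is meant to avoid.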

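In some special cases it can be done by collapsing the n-vectors into scalars (and using numpy.unique on the 1d result). For example, if you have booleans, you can use a vectorized dot with the (2**k) vector to convert the boolean vectors to integers. However, I am looking for a fairly general solution.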
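A minimal sketch of that boolean special case, with made-up sample data: each column is packed into one integer by a dot product with powers of two, and numpy.unique is then applied to the 1d result. It assumes n is small enough (fewer than 63 rows) for the codes to fit in a 64-bit integer.

```python
import numpy as np

# Hypothetical boolean data of shape (n, t) = (3, 4).
b = np.array([[True,  False, True,  True],
              [False, True,  False, False],
              [True,  True,  True,  False]])

# Powers of two, one weight per row: [1, 2, 4].
weights = 2 ** np.arange(b.shape[0], dtype=np.int64)

# One integer code per column; equal columns get equal codes.
codes = weights.dot(b.astype(np.int64))

u, idx, inv = np.unique(codes, return_index=True, return_inverse=True)
print(codes)   # packed columns
print(u)       # distinct codes
print(idx)     # first t-index of each distinct code
print(inv)     # for each t-index, the position of its code in u
```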

score 5 · Accepted Answer

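If the shape of your array is (t, n), so that the data for each n-vector is contiguous in memory, you can create a view of the two-dimensional array as a one-dimensional structured array and then use numpy.unique on that view.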

If you can change the storage convention of your array, or if you don't mind making a copy of the transposed array, this could work for you.
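For an array stored as (n, t), the transposed-copy route might be sketched as follows. The sample array and the field names `f0`, `f1`, ... are assumptions for illustration; the dtype is built in a loop so that it works for any n.

```python
import numpy as np

# Hypothetical data of shape (n, t) = (3, 5).
a = np.array([[1, 2, 1, 3, 2],
              [2, 0, 2, 2, 0],
              [3, 0, 3, 2, 0]])

# Copy the transpose so that each n-vector is contiguous in memory.
rows = np.ascontiguousarray(a.T)          # shape (t, n)
n = rows.shape[1]

# One structured element per n-vector, with n fields of the original dtype.
dt = np.dtype([('f%d' % i, rows.dtype) for i in range(n)])
y = rows.view(dt).ravel()

u, idx, inv = np.unique(y, return_index=True, return_inverse=True)

# Convert the structured result back to a plain (k, n) array.
unique_vectors = u.view(rows.dtype).reshape(-1, n)
print(unique_vectors)
print(idx)   # first t-index at which each unique vector appears
```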

Here's an example:

import numpy as np

# Demo data.
x = np.array([[1,2,3],
              [2,0,0],
              [1,2,3],
              [3,2,2],
              [2,0,0],
              [2,1,2],
              [3,2,1],
              [2,0,0]])

# View each row as a structure, with field names 'a', 'b' and 'c'.
dt = np.dtype([('a', x.dtype), ('b', x.dtype), ('c', x.dtype)])
y = x.view(dtype=dt).squeeze()

# Now np.unique can be used.  See the `unique` docstring for
# a description of the options.  You might not need `idx` or `inv`.
u, idx, inv = np.unique(y, return_index=True, return_inverse=True)

print("Unique vectors")
print(u)
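The question also asks for the t-indices associated with each unique vector. One possible way to get them, sketched below on the same demo data, is to group positions by the inverse index. This sketch swaps in the `axis` argument of np.unique (available since NumPy 1.13) in place of the structured view; the grouping step with `argsort`, `bincount` and `split` is an illustration, not part of the original answer.

```python
import numpy as np

# Same demo data as above: shape (t, n) = (8, 3).
x = np.array([[1, 2, 3],
              [2, 0, 0],
              [1, 2, 3],
              [3, 2, 2],
              [2, 0, 0],
              [2, 1, 2],
              [3, 2, 1],
              [2, 0, 0]])

# Unique rows directly, without building a structured dtype.
u, idx, inv = np.unique(x, axis=0, return_index=True, return_inverse=True)
inv = inv.ravel()   # guard against NumPy versions that return a 2-D inverse

# Group the t-indices by the unique vector they map to.
order = np.argsort(inv, kind='stable')    # t-indices sorted by unique-vector id
counts = np.bincount(inv)                 # occurrences of each unique vector
groups = np.split(order, np.cumsum(counts)[:-1])

for vec, t_indices in zip(u, groups):
    print(vec, t_indices)
```

The stable sort keeps the t-indices in increasing order within each group.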
