python - Type casting error with numpy.take

Question

I have a look-up table (LUT) that stores 65536 uint8 values:

lut = np.random.randint(256, size=(65536,)).astype('uint8')

I want to use this LUT to convert the values in an array of uint16s:

arr = np.random.randint(65536, size=(1000, 1000)).astype('uint16')

I want to do the conversion in place, because this last array can get pretty big. When I try it, the following happens:

>>> np.take(lut, arr, out=arr)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\site-packages\numpy\core\fromnumeric.py", line 103, in take
    return take(indices, axis, out, mode)
TypeError: array cannot be safely cast to required type

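And I don't understand what is going on. I know that, without an out argument, the returned array has the same dtype as lut, so uint8. But why can't a uint8 be cast to a uint16? If you ask numpy: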

>>> np.can_cast('uint8', 'uint16')
True

Obviously, the following works:

>>> lut = lut.astype('uint16')
>>> np.take(lut, arr, out=arr)
array([[173, 251, 218, ..., 110,  98, 235],
       [200, 231,  91, ..., 158, 100,  88],
       [ 13, 227, 223, ...,  94,  56,  36],
       ..., 
       [ 28, 198,  80, ...,  60,  87, 118],
       [156,  46, 118, ..., 212, 198, 218],
       [203,  97, 245, ...,   3, 191, 173]], dtype=uint16)

But this also works:

>>> lut = lut.astype('int32')
>>> np.take(lut, arr, out=arr)
array([[ 78, 249, 148, ...,  77,  12, 167],
       [138,   5, 206, ...,  31,  43, 244],
       [ 29, 134, 131, ..., 100, 107,   1],
       ..., 
       [109, 166,  14, ...,  64,  95, 102],
       [152, 169, 102, ..., 240, 166, 148],
       [ 47,  14, 129, ..., 237,  11,  78]], dtype=uint16)

This really makes no sense, since now int32s are being cast to uint16s, which is definitely not a safe thing to do:

>>> np.can_cast('int32', 'uint16')
False

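My code works if I set lut's dtype to any of uint16, uint32, uint64, int32 or int64, but fails for uint8, int8 and int16.

Am I missing something, or is this simply broken in numpy?

Workarounds are also welcome... Since the LUT is not that big, I guess it is not that bad to have its type match the array's, even if that takes twice the space, but it just doesn't feel right to do that...

Is there a way to tell numpy not to worry about casting safety?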

score 2 · Accepted Answer

Interesting problem. numpy.take(lut, ...) gets transformed into lut.take(...), whose source you can look at here:

https://github.com/numpy/numpy/blob/master/numpy/core/src/multiarray/item_selection.c#L28

I believe the exception is thrown at line 105:

obj = (PyArrayObject *)PyArray_FromArray(out, dtype, flags);
if (obj == NULL) {
    goto fail;
}

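In your case out is arr, but dtype is the one of lut, i.e. uint8. So it tries to cast arr to uint8, which fails. I have to say that I'm not sure why it needs to do that, I'm just pointing out that it does... For some reason take seems to assume that you want the output array to have the same dtype as lut.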
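
The direction of that cast can be checked with np.can_cast. This is a minimal sketch, assuming the check is a "safe" cast from the dtype of out (uint16 here) to the dtype of lut, as described above; the list of candidate dtypes is simply the one mentioned in the question:

```python
import numpy as np

arr_dtype = np.dtype('uint16')  # dtype of the array passed as `out`

# For each candidate LUT dtype, ask whether a uint16 array could be
# safely cast to it (the direction the check appears to run in).
lut_dtypes = ['uint8', 'int8', 'int16', 'uint16',
              'int32', 'uint32', 'int64', 'uint64']
castable = {name: bool(np.can_cast(arr_dtype, np.dtype(name)))
            for name in lut_dtypes}

for name, ok in castable.items():
    print(f"uint16 -> {name}: {ok}")
```

The dtypes reported as castable are the same ones for which the call in the question works, which is consistent with the cast going from arr to lut's dtype rather than the other way round.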

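By the way, in many cases the call to PyArray_FromArray will actually create a new array, and the replacement will not be in place. This is the case, for example, if you call take with mode='raise' (the default, and what happens in your examples), or whenever lut.dtype != arr.dtype. Well, at least it should, and I can't explain why, when you cast lut to int32, the output array remains uint16! It's a mystery to me. Maybe it has something to do with the NPY_ARRAY_UPDATEIFCOPY flag (see also here).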
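
Seen from the caller's side, the out=arr form does leave the looked-up values in arr once the dtypes match, whether or not a temporary is created internally. A small sketch using the arrays from the question (the smaller shape and the names lut16 and expected are only for illustration):

```python
import numpy as np

lut = np.random.randint(256, size=(65536,)).astype('uint8')
arr = np.random.randint(65536, size=(100, 100)).astype('uint16')

# Reference result, computed out of place with plain fancy indexing.
expected = lut[arr].astype(arr.dtype)

lut16 = lut.astype(arr.dtype)   # LUT dtype now matches the output array
np.take(lut16, arr, out=arr)    # default mode='raise'; take may buffer internally

print(arr.dtype, np.array_equal(arr, expected))
```

This says nothing about peak memory use: if take buffers the output, a second array of the same size exists for the duration of the call.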

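Bottom line:

The behavior of numpy is indeed difficult to understand... Maybe someone else will provide some insight into why it does what it does.
I would not try to process arr in place. In most cases it seems that a new array is created under the hood anyway. I would simply go with arr = lut.take(arr), which, by the way, will eventually free the memory previously used by arr.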
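
A minimal sketch of that suggestion, using the arrays from the question (n_bytes_before is just an illustrative name):

```python
import numpy as np

lut = np.random.randint(256, size=(65536,)).astype('uint8')
arr = np.random.randint(65536, size=(1000, 1000)).astype('uint16')

n_bytes_before = arr.nbytes

# Rebind the name: the old uint16 array can be released once nothing
# else refers to it, and the result takes lut's dtype (uint8).
arr = lut.take(arr)

print(arr.dtype, arr.nbytes, n_bytes_before)
```

Since the result is uint8, it occupies half the bytes of the original uint16 array, so only the moment when both arrays are alive costs extra memory.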
