python - Counterintuitive truncation of list elements when converting to an array in numpy?

Question

I've noticed a counterintuitive behavior in numpy. I have a list of lists that I want to convert to an array:

>>> a = [['abc', 117858348, 117858388, 'def']]

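When I convert it to an array, it converts the elements to strings (which is fine), but unexpectedly drops the last digit of the two middle elements: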

>>> array(a)
array([['abc', '11785834', '11785838', 'def']], 
      dtype='|S8')

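What is the reason for this? Is there a way to avoid this behavior? The reason converting a list to an array is convenient is fast indexing of selected elements. For example, if you have a list of indices x into an array a, you can write a[x] to retrieve them. If a is a list of lists, you can't, and instead have to do something like [a[i] for i in x].

Thanks.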
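To make the indexing motivation concrete, here is a minimal sketch. The names `values` and `x` and the sample numbers are illustrative assumptions, not part of the original question.

```python
import numpy as np

# Illustrative data; the values and the index list are assumptions for this sketch.
values = [10, 20, 30, 40]
x = [0, 2]

# With a plain list, each index has to be looked up explicitly.
picked_from_list = [values[i] for i in x]

# With an array, a list of indices can be passed directly (fancy indexing).
arr = np.array(values)
picked_from_array = arr[x]
```

Both forms select the same elements; the array form simply avoids the explicit loop.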

score 4 · Accepted Answer

This is interesting. Running your example gave me this:

>>> numpy.asarray([['abc', 117858348, 117858388, 'def']])
array([['abc', '117', '117', 'def']], 
      dtype='|S3')

I was curious how the conversion works:

>>> help(numpy.asarray)
asarray(a, dtype=None, order=None)
Convert the input to an array.

Parameters
----------
a : array_like
    Input data, in any form that can be converted to an array.  This
    includes lists, lists of tuples, tuples, tuples of tuples, tuples
    of lists and ndarrays.
dtype : data-type, optional
    By default, the data-type is inferred from the input data.

It looks like the underlying type is "inferred from the input data". I wondered what that meant, so I did:

>>> import inspect
>>> print inspect.getsource(numpy.asarray)

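This shows return array(a, dtype, copy=False, order=order), but numpy.array is a built-in, so its source is not available this way. The documentation at http://docs.scipy.org/doc/numpy/reference/generated/numpy.array.html says:

dtype : data-type, optional
The desired data-type for the array. If not given, then the type will be determined as the minimum type required to hold the objects in the sequence. This argument can only be used to "upcast" the array. For downcasting, use the .astype(t) method.

So it looks like it upcasts whenever possible. In my case it upcast to a string of length 3, since that is the longest string in the sequence; if I introduce a longer string, it upcasts to that length instead. It seems that it does not properly take into account the length of the numbers once they are converted to strings. This might be a bug, I don't know...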

You can simply specify a sufficiently long string dtype:

>>> numpy.asarray([['abc', 117858348, 117858388, 'defs']], dtype = 'S20')
array([['abc', '117858348', '117858388', 'defs']], 
  dtype='|S20')

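20 characters seems more than enough, although it may consume more memory. You could simply set it to the largest length you expect.

As far as I understand, numpy stores values as a homogeneous type, which is why every element has to fit a single type determined when the array is created.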
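Rather than guessing a width such as 20, one option is to convert every element to a string before building the array, so that the width is derived from the actual text. This is a sketch under assumptions: it uses the unicode 'U' dtype of Python 3 rather than the 'S' dtype shown above, and the names `as_text`, `width`, and `arr_fixed` are hypothetical.

```python
import numpy as np

a = [['abc', 117858348, 117858388, 'def']]

# Convert every element to str first, so only strings reach NumPy.
as_text = [[str(v) for v in row] for row in a]
arr = np.array(as_text)

# Alternative: compute the longest string explicitly and pass it as the dtype.
# 'U<n>' is a unicode string dtype of n characters (an assumption for Python 3).
width = max(len(s) for row in as_text for s in row)
arr_fixed = np.array(as_text, dtype='U%d' % width)
```

Either way the numbers keep all of their digits, and the array uses no more space per element than the longest string requires.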
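The homogeneous-type point can be illustrated with numbers alone. The sketch below uses made-up values and shows one dtype being chosen for the whole array, plus the .astype method mentioned in the documentation for converting back down.

```python
import numpy as np

# All integers: an integer dtype is inferred.
ints = np.array([1, 2, 3])

# One float among the integers: the whole array is upcast to float.
mixed = np.array([1, 2.5, 3])

# Downcasting has to be requested explicitly; the fractional part is dropped.
narrowed = mixed.astype(int)
```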

>>> numpy.__version__
'1.6.1'

$ python --version
Python 2.6.1

$ uname -a
Darwin 10.8.0 Darwin Kernel Version 10.8.0: Tue Jun  7 16:33:36 PDT 2011; root:xnu-1504.15.3~1/RELEASE_I386 i386

I hope this helps.

score 4 · Accepted Answer

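If you use an object array, there won't be any truncation. This also lets you mix different types and still get all the indexing goodness.

from numpy import array
a = [['abc', 117858348, 117858388, 'def']]
a = array(a, dtype=object)
type(a[0, 0])
# <type 'str'>
type(a[0, 1])
# <type 'int'>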
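As a short follow-up sketch, an object array still supports indexing with a list of positions, and the integers stay integers. The names `cols`, `picked`, and `total` are hypothetical.

```python
import numpy as np

a = np.array([['abc', 117858348, 117858388, 'def']], dtype=object)

# Select the two numeric columns of the first row with a list of indices.
cols = [1, 2]
picked = a[0, cols]

# The elements are still Python ints, so arithmetic works without parsing.
total = picked.sum()
```

The trade-off is that an object array stores references to Python objects, so it generally loses the speed and memory benefits of a native numeric or string dtype.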
