python - dtype usage when converting a list to a numpy array

Question

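I am confused about dtype when creating numpy arrays. I am creating them from a list of floats. First, let me note that this is not a printing issue, because I have already done this: np.set_printoptions(precision=18).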

Here is part of my list:

In [37]: boundary
Out[37]: 
[['3366307.654296875', '5814192.595703125'],
 ['3366372.2244873046875', '5814350.752685546875'],
 ['3366593.37969970703125', '5814844.73492431640625'],
 ['3367585.4779052734375', '5814429.293701171875'],
 ['3367680.55389404296875', '5814346.618896484375'],
 ....
 [ 3366307.654296875     ,  5814192.595703125     ]]

Then I convert it to a numpy array:

In [43]: boundary2=np.asarray(boundary, dtype=float)   
In [44]: boundary2
Out[44]: 
array([[ 3366307.654296875     ,  5814192.595703125     ],
       [ 3366372.2244873046875 ,  5814350.752685546875  ],
       [ 3366593.37969970703125,  5814844.73492431640625],
        ....
       [ 3366307.654296875     ,  5814192.595703125     ]])
# the full number of significant digits is preserved. 
# this also works with:
In [45]: boundary2=np.array(boundary, dtype=float)

In [46]: boundary2
Out[46]: 
array([[ 3366307.654296875     ,  5814192.595703125     ],
     [ 3366372.2244873046875 ,  5814350.752685546875  ],
     [ 3366593.37969970703125,  5814844.73492431640625],
     ...
     [ 3366307.654296875     ,  5814192.595703125     ]])

# This also works with dtype=np.float
In [56]: boundary3=np.array(boundary, dtype=np.float)
In [57]: boundary3
Out[57]: 
array([[ 3366307.654296875     ,  5814192.595703125     ],
       [ 3366372.2244873046875 ,  5814350.752685546875  ],
       [ 3366593.37969970703125,  5814844.73492431640625],
       ....
       [ 3366307.654296875     ,  5814192.595703125     ]])

Here is what confuses me: if I use dtype=np.float32, it seems to lose significant digits:

In [58]: boundary4=np.array(boundary, dtype=np.float32)   
In [59]: boundary4
Out[59]: 
array([[ 3366307.75,  5814192.5 ],
       [ 3366372.25,  5814351.  ],
       [ 3366593.5 ,  5814844.5 ],
       [ 3367585.5 ,  5814429.5 ],
       ...
       [ 3366307.75,  5814192.5 ]], dtype=float32)

I say "seems" because the arrays are apparently the same. I can't look at the stored data directly, but a check with np.allclose returns True:

In [65]: np.allclose(boundary2, boundary4)
Out[65]: True

So, if you have read this far, I hope you understand why I am confused, and maybe someone can answer the following two questions:

Why does dtype=float32 "hide" my data?
Should I worry about it, or can I safely keep using dtype=float?

score 4 · Accepted Answer

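All floating point types have limited precision. The number of significant digits they can store depends on the number of bits in the floating point type. If you provide float, numpy.float or numpy.float64 as dtype, 64 bits are used ("double precision"), giving about 16 significant decimal digits. For numpy.float32, 32 bits are used ("single precision"), giving about 7 significant decimal digits. So nothing is "hidden"; you are just seeing the effects of limited floating point precision. numpy.allclose() returns True because all values are as close as the limits of the floating point type you chose allow, and those differences are well inside its default tolerances.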
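To make the precision limits above concrete, here is a minimal sketch using two rows copied from the question. It compares the two dtypes with np.finfo and np.spacing, then shows how np.allclose reacts to a stricter tolerance. The variable names (a64, a32, step32, rel_err) and the stricter rtol value are illustrative choices, not part of the original post. Note that numpy.float was only an alias for the built-in float and was removed in NumPy 1.24, so the sketch uses np.float64.

```python
import numpy as np

# Two rows copied from the question (stored as strings there).
boundary = [['3366307.654296875', '5814192.595703125'],
            ['3366372.2244873046875', '5814350.752685546875']]

a64 = np.array(boundary, dtype=np.float64)  # equivalent to dtype=float
a32 = np.array(boundary, dtype=np.float32)

# Machine epsilon: relative gap between 1.0 and the next representable value.
eps64 = np.finfo(np.float64).eps
eps32 = np.finfo(np.float32).eps

# Gap between adjacent float32 values near 3.3e6. It is 0.25, so most of
# the fractional digits cannot be stored in single precision.
step32 = np.spacing(np.float32(3366307.75))

# Largest relative error introduced by storing the values as float32.
rel_err = np.max(np.abs(a32.astype(np.float64) - a64) / np.abs(a64))

print("eps64:", eps64, "eps32:", eps32)
print("float32 spacing near 3.3e6:", step32)
print("max relative error:", rel_err)

# The default tolerances (rtol=1e-05, atol=1e-08) are far looser than
# float32 rounding error, so the arrays compare as "close".
print(np.allclose(a64, a32))

# A stricter, illustrative tolerance exposes the difference.
print(np.allclose(a64, a32, rtol=1e-9, atol=0.0))
```

Whether float32 is acceptable depends on how much absolute error the coordinates can tolerate; at this magnitude single precision resolves only to about 0.25 to 0.5 units.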
