python - Fastest way to read every n-th row with numpy's genfromtxt

Question

I read my data with numpy's genfromtxt:

import numpy as np
measurement = np.genfromtxt('measurementProfile2.txt', delimiter=None, dtype=None, skip_header=4, skip_footer=2, usecols=(3,0,2))
rows, columns = np.shape(measurement)
x=np.zeros((rows, 1), dtype=measurement.dtype)
x[:]=394
measurement = np.hstack((measurement, x))
np.savetxt('measurementProfileFormatted.txt',measurement)

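This works fine. But I only want every 5th or 6th (so every n-th) row in the final output file. According to numpy.genfromtxt.html, there is no parameter that does this. I don't want to iterate over the array. Is there a recommended way to deal with this?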

score 4 · Accepted Answer

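To avoid reading the whole array, you can combine np.genfromtxt with itertools.islice to skip rows. This is marginally faster than reading the whole array and then slicing (at least for the smaller arrays I tried).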

For example, given a file.txt with one integer per line:

>>> import itertools
>>> import numpy as np
>>> with open('file.txt') as f_in:
...     x = np.genfromtxt(itertools.islice(f_in, 0, None, 3), dtype=int)

This returns an array x holding the elements at indices 0, 3 and 6 of the file above:

array([12, 17, 62])
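The same idea can be wrapped in a small helper that also skips header lines, as the question does with skip_header. This is only a sketch: read_every_nth, the sample file and its contents are made up for illustration, and it does not handle skip_footer, because dropping footer lines would require knowing the line count in advance.

```python
import itertools
import os
import tempfile

import numpy as np


def read_every_nth(path, n, skip_header=0, **kwargs):
    """Parse only every n-th line of `path`, starting after `skip_header` lines."""
    with open(path) as f_in:
        # islice lazily yields lines skip_header, skip_header + n, ...,
        # so genfromtxt never has to parse the lines in between.
        selected = itertools.islice(f_in, skip_header, None, n)
        return np.genfromtxt(selected, **kwargs)


# Hypothetical sample data: one header line, then ten rows of two integers.
tmp_dir = tempfile.mkdtemp()
sample_path = os.path.join(tmp_dir, "sample.txt")
with open(sample_path, "w") as f_out:
    f_out.write("a b\n")
    for i in range(10):
        f_out.write(f"{i} {i * 10}\n")

# Keeps data rows 0, 3, 6 and 9.
every_third = read_every_nth(sample_path, 3, skip_header=1, dtype=int)
print(every_third)
```

The file is still read line by line from disk; what is saved is the parsing of the skipped lines and the memory for the full array.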

score 0 · Answer

If you only want specific rows in the final output file, why not save just those rows instead of the whole "measurement" matrix:

output_rows = [5,7,11]
np.savetxt('measurementProfileFormatted.txt',measurement[output_rows,:])
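Put together with the constant column from the question, this might look like the sketch below. The measurement array here is stand-in data rather than the asker's file, and the output goes to a temporary path so the example can run on its own.

```python
import os
import tempfile

import numpy as np

# Stand-in for the array returned by genfromtxt: 12 rows, 3 columns.
measurement = np.arange(36, dtype=float).reshape(12, 3)

# Pick the rows to keep before doing any further work on them.
output_rows = [5, 7, 11]
selected = measurement[output_rows, :]

# Append the constant column (394, as in the question) to the selected rows only.
const_col = np.full((selected.shape[0], 1), 394, dtype=selected.dtype)
formatted = np.hstack((selected, const_col))

# Save and read back to check what ended up in the file.
out_path = os.path.join(tempfile.mkdtemp(), "measurementProfileFormatted.txt")
np.savetxt(out_path, formatted)
reloaded = np.loadtxt(out_path)
print(reloaded)
```

Selecting first keeps the hstack and the write proportional to the number of rows kept, not to the size of the input.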

score 0 · Answer

You have to read the whole file anyway. To select every n-th element, do something like:

>>> a = np.arange(50)
>>> a[::5]
array([ 0,  5, 10, 15, 20, 25, 30, 35, 40, 45])
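The same slice syntax works on the rows of a 2-D array such as the one genfromtxt returns. A small sketch on made-up data, including a start offset for when counting should begin at the 5th row:

```python
import numpy as np

# Stand-in for a loaded measurement array: 20 rows, 2 columns.
measurement = np.arange(40).reshape(20, 2)

# Every 5th row starting from the first one (rows 0, 5, 10, 15).
every_fifth = measurement[::5]

# Every 5th row starting from the 5th one (rows 4, 9, 14, 19).
from_fifth = measurement[4::5]

print(every_fifth)
print(from_fifth)

# Basic slicing returns a view, so no row data is copied here.
print(np.shares_memory(every_fifth, measurement))
```

Because the slice is a view, pass it to np.savetxt directly or call .copy() if the original array should be released.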
