python - How do I work with the data after numpy.loadtxt?

Question

I have raw data like the following. The first row of the text file holds the x labels and the first column holds the y labels. Say the file is named '131014-data-xy-conv-1.txt'.

Y/X (mm),   0,  10, 20, 30, 40
686.6,  -5.02,  -0.417, 0,  100.627,    0
694.08, -5.02,  -4.529, -17.731,    -5.309, -3.535
701.56, 1.869,  -4.529, -17.731,    -5.309, -3.535
709.04, 1.869,  -4.689, -17.667,    -5.704, -3.482
716.52, 4.572,  -4.689, -17.186,    -5.704, -2.51 
724,    4.572,  -4.486, -17.186,    -5.138, -2.51
731.48, 6.323,  -4.486, -16.396,    -5.138, -1.933
738.96, 6.323,  -4.977, -16.396,    -5.319, -1.933
746.44, 7.007,  -4.251, -16.577,    -5.319, -1.688
753.92, 7.007,  -4.251, -16.577,    -5.618, -1.688
761.4,  7.338,  -3.514, -16.78, -5.618, -1.207
768.88, 7.338,  -3.514, -16.78, -4.657, -1.207
776.36, 7.263,  -3.877, -15.99, -4.657, -0.822

(Q1) As you can see, the raw data has the x labels in the first row and the y labels in the first column. If I use the numpy.loadtxt function, how can I split off "xs" and "ys"?

rawdata = numpy.loadtxt('131014-data-xy-conv-1.txt')
xs, ys, data = func(rawdata)

Do I have to implement extra logic myself, or is there a function for this?

score 5 · Accepted Answer

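In fact, np.loadtxt has no good way to handle the first row separately, so you have to do something clever. I'll give two approaches: the first is shorter, but the second is more straightforward.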

1) You can do this "hack" by reading the first row as header names:

y_and_data = np.genfromtxt('131014-data-xy-conv-1.txt', names=True, delimiter=',')
x = np.array(y_and_data.dtype.names[1:], int)
y = y_and_data['YX_mm']
data = y_and_data.view(np.float64).reshape(-1, len(y_and_data.dtype))[:,1:]  # np.float was removed in NumPy 1.24
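
For what it's worth, the view-and-reshape step can also be written with numpy.lib.recfunctions.structured_to_unstructured, which says more directly what is going on. The following is only a sketch: the in-memory sample (two rows copied from the question) and the variable names are assumptions for illustration, not part of the original answer.

```python
import io
import numpy as np
from numpy.lib import recfunctions as rfn

# Hypothetical in-memory stand-in for the first rows of the data file.
sample = io.StringIO(
    "Y/X (mm),   0,  10, 20, 30, 40\n"
    "686.6,  -5.02,  -0.417, 0,  100.627,    0\n"
    "694.08, -5.02,  -4.529, -17.731,    -5.309, -3.535\n"
)

# names=True turns the header row into field names of a structured array.
table = np.genfromtxt(sample, names=True, delimiter=',')
names = table.dtype.names

y = table[names[0]]                                 # first field holds the y labels
x = np.array([int(name) for name in names[1:]])     # remaining field names are the x labels
data = rfn.structured_to_unstructured(table)[:, 1:] # plain 2-D float array, y column dropped
```

Looking the first field up through dtype.names[0] avoids having to guess how genfromtxt sanitizes the 'Y/X (mm)' header.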

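2) But I recommend reading the first row separately, saving it, and then reading the rest with loadtxt (or with genfromtxt, which is what I use and recommend):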

with open('131014-data-xy-conv-1.txt', 'r') as f:
    x = np.array(f.readline().split(',')[1:], int)
    y_and_data = np.genfromtxt(f, delimiter=',')
y = y_and_data[:,0]
data = y_and_data[:,1:]

How it works: first, open the file and call it f:

with open('131014-data-xy-conv-1.txt', 'r') as f:

    firstline = f.readline()           # read off the first line
    firstvalues = firstline.split(',') # split it on the comma
    xvalues = firstvalues[1:]          # and keep all but the first element
    x = np.array(xvalues, int)         # make it an array of integers (or float if you prefer)

Now that the first row has been read from f using f.readline, the rest can be read with genfromtxt:

    y_and_data = np.genfromtxt(f, delimiter=',')

Now, as the other answers show, split up the rest:

y = y_and_data[:,0]       # the first column is the y-values
data = y_and_data[:,1:]   # the remaining columns are the data

Here is the output:

In [58]: with open('131014-data-xy-conv-1.txt', 'r') as f:
   ....:     x = np.array(f.readline().split(',')[1:], int)
   ....:     y_and_data = np.genfromtxt(f, delimiter=',')
   ....: y = y_and_data[:,0]
   ....: data = y_and_data[:,1:]
   ....: 

In [59]: x
Out[59]: array([ 0, 10, 20, 30, 40])

In [60]: y
Out[60]: 
array([ 686.6 ,  694.08,  701.56,  709.04,  716.52,  724.  ,  731.48,
        738.96,  746.44,  753.92,  761.4 ,  768.88,  776.36])

In [61]: data
Out[61]: 
array([[  -5.02 ,   -0.417,    0.   ,  100.627,    0.   ],
       [  -5.02 ,   -4.529,  -17.731,   -5.309,   -3.535],
       [   1.869,   -4.529,  -17.731,   -5.309,   -3.535],
       [   1.869,   -4.689,  -17.667,   -5.704,   -3.482],
       [   4.572,   -4.689,  -17.186,   -5.704,   -2.51 ],
       [   4.572,   -4.486,  -17.186,   -5.138,   -2.51 ],
       [   6.323,   -4.486,  -16.396,   -5.138,   -1.933],
       [   6.323,   -4.977,  -16.396,   -5.319,   -1.933],
       [   7.007,   -4.251,  -16.577,   -5.319,   -1.688],
       [   7.007,   -4.251,  -16.577,   -5.618,   -1.688],
       [   7.338,   -3.514,  -16.78 ,   -5.618,   -1.207],
       [   7.338,   -3.514,  -16.78 ,   -4.657,   -1.207],
       [   7.263,   -3.877,  -15.99 ,   -4.657,   -0.822]])
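
If this comes up often, the second approach can be wrapped in a small helper. This is a minimal sketch under stated assumptions: the function name split_labeled_grid is made up for illustration, the header labels are parsed as floats, and the in-memory sample stands in for the real file.

```python
import io
import numpy as np

def split_labeled_grid(f, delimiter=','):
    """Hypothetical helper: split an open text file into (x, y, data).

    Assumes the first row holds the x labels and the first column
    holds the y labels, as in the question.
    """
    header = f.readline().split(delimiter)
    x = np.array([float(field) for field in header[1:]])  # skip the corner label
    body = np.atleast_2d(np.genfromtxt(f, delimiter=delimiter))
    if body.shape[1] != x.size + 1:
        raise ValueError("header and data rows have different column counts")
    return x, body[:, 0], body[:, 1:]

# Usage with an in-memory sample; with a real file, pass the object from open().
sample = io.StringIO(
    "Y/X (mm),   0,  10, 20, 30, 40\n"
    "686.6,  -5.02,  -0.417, 0,  100.627,    0\n"
    "694.08, -5.02,  -4.529, -17.731,    -5.309, -3.535\n"
)
x, y, data = split_labeled_grid(sample)
```

The column-count check is an addition of this sketch; it catches a header row that does not line up with the data.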

score 1 · Accepted Answer

If you just want xs, ys, and data in separate arrays, you can do this:

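import numpy as np

fname = '131014-data-xy-conv-1.txt'
with open(fname) as f:  # read only the header row for the x labels
    xs = np.array(f.readline().split(',')[1:], int)
rawdata = np.loadtxt(fname, delimiter=',', skiprows=1)  # the file is comma-separated
ys = rawdata[:, 0]
data = rawdata[:, 1:]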

Note the skiprows keyword, which makes loadtxt skip the first row of the file.
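
To see what skiprows does without touching the real file, here is a small sketch on an in-memory sample (two rows copied from the question; the sample_text name is just for illustration). Because the file is comma-separated, delimiter=',' is passed as well; loadtxt splits on whitespace by default.

```python
import io
import numpy as np

# Hypothetical in-memory stand-in for the first rows of the data file.
sample_text = (
    "Y/X (mm),   0,  10, 20, 30, 40\n"
    "686.6,  -5.02,  -0.417, 0,  100.627,    0\n"
    "694.08, -5.02,  -4.529, -17.731,    -5.309, -3.535\n"
)

# skiprows=1 drops the header row, so only numeric rows are parsed.
rawdata = np.loadtxt(io.StringIO(sample_text), delimiter=',', skiprows=1)

ys = rawdata[:, 0]     # first column: y labels
data = rawdata[:, 1:]  # remaining columns: the measurements
```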

score 1 · Accepted Answer

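Adding to @bogatron's answer: you can pass unpack=True, which transposes the result so that each column comes back as its own array. It does not return xs, ys, data directly, because the file has six columns and the x labels sit in the skipped header row, but the y column can be split off in one line:

ys, *columns = numpy.loadtxt('131014-data-xy-conv-1.txt', delimiter=',', skiprows=1, unpack=True)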
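
For reference, here is a sketch of what unpack=True hands back, again on an in-memory sample rather than the real file (the sample and the variable names are assumptions for illustration). With unpack=True the array is transposed, so unpacking yields one 1-D array per column, and the data columns have to be stacked again to get a 2-D block.

```python
import io
import numpy as np

# Hypothetical in-memory stand-in for the first rows of the data file.
sample_text = (
    "Y/X (mm),   0,  10, 20, 30, 40\n"
    "686.6,  -5.02,  -0.417, 0,  100.627,    0\n"
    "694.08, -5.02,  -4.529, -17.731,    -5.309, -3.535\n"
)

# unpack=True returns the transpose: one row per original column.
ys, *columns = np.loadtxt(
    io.StringIO(sample_text), delimiter=',', skiprows=1, unpack=True
)

# Put the five data columns back side by side as a (rows, 5) array.
data = np.column_stack(columns)
```

The x labels still have to be read from the header row separately, as in the other answers.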
