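This is my first post here. I'm trying to learn a bit of Python, using Python 3 and numpy.

I did a few tutorials and then decided to dive in and try a small project that I might find useful at work, since that's a good way for me to learn.

I've written a program that reads data from a CSV file with a few header rows. I then want to extract certain columns from that file based on their header names and write them back out to a new CSV file in a specific format.

The program I have works fine and does what I want, but since I'm a newbie I'd like some tips on how to improve the code.

My main data file (CSV) is about 57 columns wide and about 36 rows deep, so it's not big.

It works fine, but I'm looking for suggestions and improvements.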

import csv
import numpy as np

#make some arrays..at least I think that's what this does
A=[]
B=[]
keep_headers=[]

#open the main data csv file 'map.csv'...need to check what 'r' means
input_file = open('map.csv','r')

#read the contents of the file into 'data'
data=csv.reader(input_file, delimiter=',')

#skip the first 2 header rows as they are junk
next(data)
next(data)

#read in the next line as the 'header'
headers = next(data)

#Now read in the numeric data (float) from the main csv file 'map.csv'
A=np.genfromtxt('map.csv',delimiter=',',dtype='float',skip_header=5)

#Get the length of a column in A
Alen=len(A[:,0])

#now read the column header values I want to keep from 'keepheader.csv'
keep_headers=np.genfromtxt('keepheader.csv',delimiter=',',dtype=str)

#Get the length of keep headers....i.e. how many headers I'm keeping. 
head_len=len(keep_headers)

#Now loop round extracting all the columns with the keep header titles and
#append them to array B
i=0
while i < head_len:
    #use index to find the appropriate column number.
    item_num=headers.index(keep_headers[i])
    i=i+1

    #append the selected column to array B
    B=np.append(B,A[:,item_num])

#now reshape the B array 
B=np.reshape(B,(head_len,Alen))

#now transpose it as that's the format I want.
B=np.transpose(B)

#save the array B back to a new csv file called 'cmap.csv'
np.savetxt('cmap.csv',B,fmt='%.3f',delimiter=",") 

Thanks.

1 Answer

You can simplify your code a great deal by using more of numpy's functionality:

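import numpy as np

# Read every field as a string, skipping the two junk rows at the top
A = np.loadtxt('stack.txt',skiprows=2,delimiter=',',dtype=str)
keep_headers=np.loadtxt('keepheader.csv',delimiter=',',dtype=str)

# The first remaining row holds the column headers
headers = A[0,:]
# Boolean mask: True where a header is one of the names to keep
cols_to_keep = np.isin( headers, keep_headers )

# Convert the selected columns to float and save them
B = A[1:,cols_to_keep].astype(float)
np.savetxt('cmap.csv',B,fmt='%.3f',delimiter=",")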
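
One thing to be aware of: a boolean mask like cols_to_keep returns the columns in the order they appear in the data file, whereas the loop in the question appends them in the order listed in keepheader.csv. If that order matters, something along these lines might work. It is only a minimal sketch, assuming every requested name appears in the header row and that all rows below the header are numeric; select_columns and junk_rows are names made up for illustration, and the in-memory files stand in for the real CSVs.

```python
import io
import numpy as np

def select_columns(data_file, keep_file, junk_rows=2):
    """Return the columns named in keep_file, in the order listed there."""
    # Read every field as a string; the first row after the junk rows is the header.
    table = np.loadtxt(data_file, skiprows=junk_rows, delimiter=',', dtype=str)
    keep = np.loadtxt(keep_file, delimiter=',', dtype=str, ndmin=1)
    headers = table[0, :].tolist()
    # list.index raises ValueError if a requested header is missing.
    col_idx = [headers.index(name) for name in keep]
    # Integer indexing preserves the order given in col_idx.
    return table[1:, col_idx].astype(float)

# Hypothetical sample data standing in for 'map.csv' and 'keepheader.csv'
data = io.StringIO("junk 1\njunk 2\na,b,c\n1.0,2.0,3.0\n4.0,5.0,6.0\n")
keep = io.StringIO("c,a\n")
B = select_columns(data, keep)
print(B)
```

Indexing with a list of integers instead of a mask is what keeps the requested order; the result can be passed to np.savetxt exactly as before.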
answered 2013-07-05T20:58:02.603