
I'm working with a large data set (OMNI), and I'm looking for a way to parse the data and put each row into an array of lists. I'm still fairly new to Python, so I'm learning as I go.

Here's what I have:

import Tkinter, tkFileDialog
import csv 

#Choose the file that you want to read from
root = Tkinter.Tk()
root.withdraw()


file_path = tkFileDialog.askopenfilename()
current_file = open(file_path , "r")

#OMNI_2001 = {}

reader = csv.reader(current_file, delimiter= ' ')

output_file = open('newdata.txt','w')
out = csv.writer(output_file)

for row in reader:
    out.writerow(row)
    print row
#print row[0::1]

A row of data I read in looks like this:

2001 182  0  0 60 60   7   2  71   -695    320  0.22   -173    6.07    5.23    0.46   -2.00    0.69   -1.93    0.38    2.09   331.0  -329.5    24.5    19.8   8.66  101479.  1.90   0.64   2.25   8.0    6.67   29.65    3.55   12.73   -1.78   -0.70   288  -142   146    -3   -22    20    19   0.99

But after I write out the new data, it looks like this:

2001,182,,0,,0,60,60,,,7,,,2,,71,,,-695,,,,320,,0.22,,,-173,,,,6.07,,,,5.23,,,,0.46,,,-2.00,,,,0.69,,,-1.93,,,,0.38,,,,2.09,,,331.0,,-329.5,,,,24.5,,,,19.8,,,8.66,,101479.,,1.90,,,0.64,,,2.25,,,8.0,,,,6.67,,,29.65,,,,3.55,,,12.73,,,-1.78,,,-0.70,,,288,,-142,,,146,,,,-3,,,-22,,,,20,,,,19,,,0.99

What am I doing that causes all these extra commas? Also, how would I remove the unwanted entries?


2 Answers


Your csv file has multiple spaces between items. delimiter=' ' makes the reader treat every single space as the start of a new column. That's why the rows end up with so many "extra" columns.

Passing skipinitialspace=True causes whitespace immediately following the delimiter to be ignored. That eliminates the spurious extra columns.
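A minimal sketch of the difference, run on an in-memory fragment of the sample line (shown in Python 3 syntax; the csv behavior is the same in Python 2):

```python
import csv
import io

sample = "7   2  71"  # a fragment of the space-padded OMNI row

# Without skipinitialspace, every single space starts a new (often empty) column.
plain = next(csv.reader(io.StringIO(sample), delimiter=' '))

# With skipinitialspace=True, spaces right after a delimiter are ignored.
skipped = next(csv.reader(io.StringIO(sample), delimiter=' ',
                          skipinitialspace=True))

print(plain)    # ['7', '', '', '2', '', '71']
print(skipped)  # ['7', '2', '71']
```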

import Tkinter, tkFileDialog
import csv 

#Choose the file that you want to read from
root = Tkinter.Tk()
root.withdraw()

file_path = tkFileDialog.askopenfilename()
with open(file_path , 'rb') as current_file:
    reader = csv.reader(current_file, delimiter= ' ', 
                        skipinitialspace=True)
    with open('newdata.txt','wb') as output_file:
        out = csv.writer(output_file)
        for row in reader:
            out.writerow(row)
            print row
            #print row[0::1]
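As for removing entries you don't need, one approach is to keep only selected column indices from each row before writing it out. A minimal sketch (Python 3 syntax; the indices below are placeholders, not the actual OMNI layout — check them against your format description):

```python
import csv
import io

# Hypothetical: keep only the first three fields (e.g. year, day, hour).
wanted = [0, 1, 2]

sample = "2001 182  0  0 60 60"
reader = csv.reader(io.StringIO(sample), delimiter=' ',
                    skipinitialspace=True)

for row in reader:
    # Pick out just the columns listed in `wanted`.
    trimmed = [row[i] for i in wanted]
    print(trimmed)  # ['2001', '182', '0']
```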
Answered 2013-06-04T16:07:15.573

Your file doesn't really look like a CSV file. I'd suggest using loadtxt() or genfromtxt() from the NumPy module, or, if you can't use NumPy, parsing the file yourself:

with open(file_path) as current_file:
    for line in current_file:
        data_row = map(float, line.split())
        # do whatever you want to do with the data
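The reason this needs no skipinitialspace-style workaround is that str.split() with no arguments collapses any run of whitespace into a single separator. A quick sketch on a fragment of the sample row (Python 3 syntax):

```python
line = "2001 182  0  0 60 60   7   2  71"

# split() with no separator treats runs of spaces as one delimiter
# and produces no empty fields
fields = line.split()
print(fields)

# convert each field to float, as in the loop above
values = [float(f) for f in fields]
print(values)
```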
Answered 2013-06-04T16:28:20.240