
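I have a csv file from which I want to extract some specific columns. How can I do that?
I have a dictionary of headers and their cell positions, for example: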

dict = {'Col1' : [(4,5)], 'Col2' : [(4,7)], 'Col3' : [(4,9)]}

I want to extract the data starting from the positions given in dict and continuing down to the end of the csv file.

For example:

,,,,,,,,,,
,,,,,,,,,,
,,,,,,,,,,
,,,Col0,Col1,,Col2,,Col3,Col4,
,,,bgr,abc,,efg,,hij,123,
,,,cde,klm,,nop,,qrs,123,
,,,asd,tuv,,wxy,,zzz,456,
,,,,,,,,,,
,,,,,,,,,,

I want to extract

Col1,Col2,Col3
abc,efg,hij
klm,nop,qrs
tuv,wxy,zzz

and write it to a new csv file. Please help me do this.
I want to handle this efficiently.


1 Answer


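Pandas is a library with powerful methods for reading csv files.

If you want to read every column starting from the same row, the following script will do the job (note that only 2 of the Python lines do the actual work):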

import pandas as pd


# Names for all 11 columns of the file ('skipN' are placeholders for the empty columns)
colnames = ['skip1', 'skip2', 'skip3', 'Col0', 'Col1', 'skip4', 'Col2', 'skip5', 'Col3', 'Col4', 'skip6']
# Number of lines to skip (the three empty rows plus the header row)
nbskip = 4
# Number of rows to read (you can also filter rows after reading and remove the empty ones)
nrows = 3
# List of columns to keep
keep_only = ['Col1', 'Col2', 'Col3']

# Read the csv
df = pd.read_csv('test.csv',
                 header=None,
                 skiprows=nbskip,
                 names=colnames,
                 nrows=nrows,  # Remove if you prefer to filter rows afterwards
                 usecols=keep_only)

# If the number of lines to keep is unknown,
# you can remove empty lines here, for example:
# df = df.dropna(how='all')

# Save the csv
df.to_csv('result.csv', index=False)
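
The script above hard-codes the column names and the number of rows. As a rough sketch only, the same idea can be driven by the dictionary from the question. This assumes the positions are 1-based (row, column) pairs, that all headers sit on the same row, and that the rows to discard are empty in every kept column. The names header_cells and extract_columns are made up for illustration.

```python
import pandas as pd

# 1-based (row, column) positions of the header cells, as given in the question.
header_cells = {'Col1': [(4, 5)], 'Col2': [(4, 7)], 'Col3': [(4, 9)]}


def extract_columns(src, dst, header_cells):
    """Copy the columns located by header_cells from src to dst (hypothetical helper)."""
    names = list(header_cells)
    # Assumption: all headers share one row, so the first entry gives that row.
    header_row = header_cells[names[0]][0][0]
    # Convert 1-based column numbers to the 0-based indices pandas expects.
    col_idx = [header_cells[name][0][1] - 1 for name in names]

    # Skip everything up to and including the header row, keep only the wanted columns.
    df = pd.read_csv(src, header=None, skiprows=header_row, usecols=col_idx)
    # usecols keeps file order, so select in dictionary order before renaming.
    df = df[col_idx]
    df.columns = names
    # Drop rows in which every kept column is empty (e.g. the trailing ',,,' lines).
    df = df.dropna(how='all')
    df.to_csv(dst, index=False)
    return df
```

Calling extract_columns('test.csv', 'result.csv', header_cells) on the sample file from the question should write the Col1, Col2, Col3 header followed by the three data rows, without needing nrows to be known in advance.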
answered 2013-02-26T07:59:00.737