2

I can read the CSV file, but instead of printing the whole file, how can I print only certain rows and columns?

Imagine the file looks like this in Excel:

A            B                    C                   D                    E
State      | Heart Disease Rate | Stroke Death Rate | HIV Diagnosis Rate | Teen Birth Rate
Alabama    | 235.5              | 54.5              | 16.7               | 18.01
Alaska     | 147.9              | 44.3              | 3.2                | N/A
Arizona    | 152.5              | 32.7              | 11.9               | N/A
Arkansas   | 221.8              | 57.4              | 10.2               | N/A
California | 177.9              | 42.2              | N/A                | N/A
Colorado   | 145.3              | 39                | 8.4                | 9.25

Here is what I have:

import csv

try:
    risk = open('riskfactors.csv', 'r', encoding="windows-1252").read() #find the file

except:
    while risk != "riskfactors.csv":  # if the file can't be found or there is an error
        print("Could not open", risk, "file")
        risk = input("\nPlease try to open file again: ")
else:
    with open("riskfactors.csv") as f:
        reader = csv.reader(f, delimiter=' ', quotechar='|')

        data = []
        for row in reader:# Number of rows including the death rates 
            for col in (2,4): # The columns I want read   B and D
                data.append(row)
                data.append(col)
        for item in data:
            print(item) #print the rows and columns

I only need to read columns B and D, with all of their statistics, like this:

A            B                    D
State      | Heart Disease Rate | HIV Diagnosis Rate
Alabama    | 235.5              | 16.7
Alaska     | 147.9              | 3.2
Arizona    | 152.5              | 11.9
Arkansas   | 221.8              | 10.2
California | 177.9              | N/A
Colorado   | 145.3              | 8.4

Edit: the code raises no errors.

Any ideas on how to tackle this? Nothing I have tried works. Any help or advice is greatly appreciated.

3 Answers

11

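I hope you have heard of pandas, the Python data analysis library.

The following code takes care of reading the columns. As for reading specific rows, you may need to explain more about which rows you want.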

import pandas

# usecols is zero-based: 0 = State (A), 1 = Heart Disease Rate (B), 3 = HIV Diagnosis Rate (D)
df = pandas.read_csv('test.csv', sep=",", usecols=[0, 1, 3])
print(df)
answered 2013-03-08T05:42:46.000
3

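If you're still stuck, there's really no reason you have to read the file with the csv module, because a simple CSV file is just lines of comma-separated strings (this stops being true once a quoted field contains a comma). So, for something simple, you could try this, which gives you a list of tuples of the form (state, heart disease rate, HIV diagnosis rate):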

output = []

# universal newlines are the default in Python 3, so plain 'r' mode is enough
with open('riskfactors.csv', 'r') as f:
    for line in f:
        cells = line.rstrip('\n').split(',')
        # we want the first, second and fourth columns (A, B and D)
        output.append((cells[0], cells[1], cells[3]))

print(output)

Note that if you want to do any kind of data analysis, you will have to skip the header row as you go through the data.
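
As one way to handle the header row and the N/A cells, here is a minimal sketch using the standard csv module. The sample data, the helper names `to_number` and `read_rates`, and the comma-delimited layout are assumptions made for illustration; the real riskfactors.csv may be laid out differently.

```python
import csv
import io

# Hypothetical sample standing in for riskfactors.csv; the real file is
# assumed to be comma-delimited with exactly one header row.
SAMPLE = """State,Heart Disease Rate,Stroke Death Rate,HIV Diagnosis Rate,Teen Birth Rate
Alabama,235.5,54.5,16.7,18.01
Alaska,147.9,44.3,3.2,N/A
California,177.9,42.2,N/A,N/A
"""


def to_number(text):
    """Return the cell as a float, or None when it is empty or 'N/A'."""
    text = text.strip()
    if text in ("", "N/A"):
        return None
    return float(text)


def read_rates(lines):
    """Yield (state, heart disease rate, HIV diagnosis rate) per data row."""
    reader = csv.reader(lines)
    next(reader, None)  # skip the header row
    for row in reader:
        if not row:  # ignore blank lines
            continue
        yield row[0], to_number(row[1]), to_number(row[3])


rates = list(read_rates(io.StringIO(SAMPLE)))
for state, heart, hiv in rates:
    print(state, heart, hiv)
```

To run it against the actual file, you could pass an open file object (for example one opened with `newline=''`) to `read_rates` in place of the `io.StringIO` wrapper.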

answered 2013-03-08T16:03:25.237
2

Try this:

data = []
for row in reader:  # Number of rows including the death rates
    data.append([row[1], row[3]])  # The columns I want read: B and D
for item in data:
    print(item)  # print the rows and columns
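
The loop above depends on the `reader` from the question, which was created with `delimiter=' '` and `quotechar='|'`. If the file is actually comma-delimited (an assumption, since the raw file is not shown), `row[1]` and `row[3]` only line up with columns B and D when the reader uses the default comma delimiter. A self-contained sketch that also limits the rows might look like this; the sample data and the `select` helper are hypothetical:

```python
import csv
import io
from itertools import islice

# Hypothetical sample standing in for riskfactors.csv (assumed comma-delimited).
SAMPLE = """State,Heart Disease Rate,Stroke Death Rate,HIV Diagnosis Rate,Teen Birth Rate
Alabama,235.5,54.5,16.7,18.01
Alaska,147.9,44.3,3.2,N/A
Arizona,152.5,32.7,11.9,N/A
Arkansas,221.8,57.4,10.2,N/A
"""


def select(lines, columns, start=0, stop=None):
    """Return the given zero-based columns for rows start..stop-1."""
    reader = csv.reader(lines)  # default delimiter is a comma
    return [[row[i] for i in columns] for row in islice(reader, start, stop)]


# Columns B and D (indexes 1 and 3) for the first three states;
# start=1 skips the header row.
picked = select(io.StringIO(SAMPLE), columns=(1, 3), start=1, stop=4)
for item in picked:
    print(item)
```

Passing `start=0` keeps the header row, and leaving `stop` as `None` reads through to the end of the file.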
answered 2013-03-08T04:24:36.797