python - Python: Reading only certain columns and rows from an Excel CSV file

Question

I can read the csv file, but instead of reading the whole file, how can I print only certain rows and columns?

Imagine this is Excel:

  A              B              C                  D                    E
State  |Heart Disease Rate| Stroke Death Rate | HIV Diagnosis Rate |Teen Birth Rate

Alabama     235.5             54.5                 16.7                 18.01

Alaska      147.9             44.3                  3.2                  N/A    

Arizona     152.5             32.7                 11.9                  N/A    

Arkansas    221.8             57.4                 10.2                  N/A    

California  177.9             42.2                  N/A                  N/A    

Colorado    145.3             39                    8.4                 9.25

Here is what I have:

import csv

try:
    risk = open('riskfactors.csv', 'r', encoding="windows-1252").read() #find the file

except:
    while risk != "riskfactors.csv":  # if the file cant be found if there is an error
    print("Could not open", risk, "file")
    risk = input("\nPlease try to open file again: ")
else:
    with open("riskfactors.csv") as f:
        reader = csv.reader(f, delimiter=' ', quotechar='|')

        data = []
        for row in reader:# Number of rows including the death rates 
            for col in (2,4): # The columns I want read   B and D
                data.append(row)
                data.append(col)
        for item in data:
            print(item) #print the rows and columns

I only need to read columns B and D for all of the statistics:

  A              B                D                    
 State  |Heart Disease Rate| HIV Diagnosis Rate |

 Alabama       235.5             16.7                

  Alaska       147.9             3.2                     

  Arizona      152.5             11.9                     

  Arkansas     221.8             10.2                    

 California    177.9             N/A                     

 Colorado      145.3             8.4

Edited

No errors are raised.

Any ideas on how to solve this? Nothing I have tried works. Any help or advice is greatly appreciated.

score 11 · Accepted Answer

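I hope you have heard of Pandas for data analysis.

The following code will do the job of reading the columns, but as for reading rows, you may need to explain more about what you want.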

import pandas
# usecols takes zero-based positions, so [0, 1, 3] reads the 1st, 2nd and 4th columns
df = pandas.read_csv('test.csv', sep=",", usecols=[0, 1, 3])
print(df)
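
On the open point about rows, here is a minimal sketch of two ways to limit rows with pandas: `nrows` stops parsing after a given number of data rows, and `.iloc` slices rows by position after loading. The inline `SAMPLE` text is an assumption that mirrors the table in the question in comma-separated form; the real `riskfactors.csv` may be laid out differently.

```python
import io

import pandas as pd

# Assumed comma-separated version of the table shown in the question.
SAMPLE = """State,Heart Disease Rate,Stroke Death Rate,HIV Diagnosis Rate,Teen Birth Rate
Alabama,235.5,54.5,16.7,18.01
Alaska,147.9,44.3,3.2,N/A
Arizona,152.5,32.7,11.9,N/A
Arkansas,221.8,57.4,10.2,N/A
California,177.9,42.2,N/A,N/A
Colorado,145.3,39,8.4,9.25
"""

# usecols can select columns by header name; nrows limits the data rows parsed.
first_three = pd.read_csv(
    io.StringIO(SAMPLE),
    usecols=["State", "Heart Disease Rate", "HIV Diagnosis Rate"],
    nrows=3,
)

# Alternatively, load the chosen columns and slice rows by position afterwards.
all_rows = pd.read_csv(io.StringIO(SAMPLE), usecols=[0, 1, 3])
middle_rows = all_rows.iloc[2:5]  # third to fifth data rows

print(first_three)
print(middle_rows)
```

With a real file, the `io.StringIO(SAMPLE)` argument would be replaced by the file path. By default pandas treats `N/A` as a missing value, so those cells show up as `NaN`.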

score 3 · Accepted Answer

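If you are still stuck, you don't strictly have to use the csv module to read the file, since a simple CSV file is just lines of comma-separated strings (this breaks down if a field contains a quoted comma). So, for something simple, you could try this, which gives you a list of tuples of the form (state, heart disease rate, HIV diagnosis rate):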

output = []

# Open the file for reading; the with-block closes it automatically
with open('riskfactors.csv', 'r') as f:
    for line in f:
        cells = line.rstrip('\n').split(',')
        # We want the first, second and fourth columns (A, B and D)
        output.append((cells[0], cells[1], cells[3]))

print(output)

Note that if you want to do any kind of data analysis, you will have to skip over the header row.
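
As a hedged illustration of skipping the header, the sketch below pulls it off with `next()` before collecting the data rows. It uses `csv.reader` rather than `str.split` so that a quoted field containing a comma stays in one cell. The helper name `read_columns` and the sample text are assumptions for illustration; the last sample row is invented purely to show the quoted-comma case.

```python
import csv
import io

# Assumed comma-separated sample; the final row is invented to show quoting.
SAMPLE = '''State,Heart Disease Rate,Stroke Death Rate,HIV Diagnosis Rate,Teen Birth Rate
Alabama,235.5,54.5,16.7,18.01
Alaska,147.9,44.3,3.2,N/A
"Example, State",1.0,2.0,3.0,4.0
'''


def read_columns(text, indexes=(0, 1, 3)):
    """Return (header, rows) restricted to the given zero-based columns."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)  # consume the header row so it is not mixed into the data
    selected_header = [header[i] for i in indexes]
    rows = [tuple(row[i] for i in indexes) for row in reader]
    return selected_header, rows


header, rows = read_columns(SAMPLE)
print(header)
for row in rows:
    print(row)
```

A plain `line.split(",")` would break the invented last row into six cells, which is the main reason to prefer the csv module once the data stops being trivial.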

score 2 · Accepted Answer

Try this:

data = []
for row in reader:  # Every row, including the death rates
    data.append([row[1], row[3]])  # The columns I want to read: B and D
for item in data:
    print(item)  # Print the selected columns of each row
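
The snippet above relies on a `reader` that is already set up. A self-contained sketch of the same idea follows, assuming the file is comma-separated (as the other answers do) and using `itertools.islice` to also restrict which rows are read. The temporary file name and its contents are stand-ins for the real `riskfactors.csv`.

```python
import csv
import itertools
import os
import tempfile

# Stand-in data copied from the table in the question (assumed comma-separated).
table = [
    ["State", "Heart Disease Rate", "Stroke Death Rate", "HIV Diagnosis Rate", "Teen Birth Rate"],
    ["Alabama", "235.5", "54.5", "16.7", "18.01"],
    ["Alaska", "147.9", "44.3", "3.2", "N/A"],
    ["Arizona", "152.5", "32.7", "11.9", "N/A"],
    ["Arkansas", "221.8", "57.4", "10.2", "N/A"],
]

# Write the stand-in file so the example runs without the real data.
path = os.path.join(tempfile.mkdtemp(), "riskfactors_sample.csv")
with open(path, "w", newline="") as f:
    csv.writer(f).writerows(table)

data = []
with open(path, newline="") as f:
    reader = csv.reader(f)  # the default delimiter is a comma
    next(reader)  # skip the header row
    # islice(reader, 0, 3) yields only the first three data rows
    for row in itertools.islice(reader, 0, 3):
        data.append([row[1], row[3]])  # columns B and D

for item in data:
    print(item)
```

Changing the `islice` bounds selects a different block of rows, and dropping `islice` altogether reads every row.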
