I wrote a program like this:

import csv
from collections import defaultdict

reader = csv.reader(open("lrgdata.csv"))
headers = next(reader)  # skip the header row
Amt_Wtotal = 0
Amt_Dtotal = 0
dataW = []  # running total of the 'W' amounts
dataD = []  # running total of all other amounts
counts_W = defaultdict(int)
counts_D = defaultdict(int)
for row in reader:
    if row[28] == 'W':
        counts_W[row[5]] += 1
        Amt_Wtotal += float(row[6])
        dataW.append(Amt_Wtotal)
    else:
        counts_D[row[5]] += 1
        Amt_Dtotal += float(row[6])
        dataD.append(Amt_Dtotal)

When I run this code on a 412 KB file I get no error, but when I run it on a 1.8 MB file I get this error:

if(row[28]=='W'): IndexError: list index out of range

My file looks like this:

Header

personal_info_id_city,personal_info_sex,transaction_master_id_transaction_master,card_holder_info_id_terminal_info,transaction_master_id_terminal_info,account_info_id_account_info,transaction_master_amount,personal_info_dob_m,card_holder_info_card_issue_dt,personal_info_dob_h,transaction_master_transaction_from,personal_info_dob_d,transaction_master_transacted_on,account_info_balance_amt,personal_info_id_user_type,personal_info_dob_y,card_holder_info_card_issue_dt_y,transaction_master_transacted_on_y,transaction_master_transacted_on_d,card_holder_info_card_issue_dt_d,transaction_master_transacted_on_m,card_holder_info_card_issue_dt_h,transaction_master_transacted_on_h,card_holder_info_card_issue_dt_m,transaction_master_id_customer_info,personal_info_dob,card_holder_info_id_brch,card_holder_info_id_card_holder_info,transaction_master_transaction_type,_id,personal_info_id_customer_info

Values

2,M,17748,60,60,21768,1460.0,7,2011-04-02 00:00:00,0,B,5,2011-07-22 03:03:00,52.0,1,1992,2011,2011,22,2,7,0,3,4,21768,1992-07-05 00:00:00,26,21768,W,50f38a469cf9c253d600000c,21768

1,M,18002,3,3,1746,3480.0,2,2011-04-07 00:00:00,0,B,5,2011-07-25 01:03:00,123.0,1,1985,2011,2011,25,7,7,0,1,4,1746,1985-02-05 00:00:00,3,1746,D,50f38a469cf9c253d600000d,1746

Can you also tell me how to find the correlation between two data sets that are stored as lists?

1 Answer

for row in reader:
    try:
        if row[28] == 'W':
            counts_W[row[5]] += 1
            Amt_Wtotal += float(row[6])
            dataW.append(Amt_Wtotal)
        else:
            counts_D[row[5]] += 1
            Amt_Dtotal += float(row[6])
            dataD.append(Amt_Dtotal)
    except IndexError:
        # Handle the exception here.
        # "continue" ignores the short row and moves to the next item in the loop.
        # I suspect what you actually want is to duplicate the "else" clause here.
        continue
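
Before deciding how to handle the exception, it may help to see which rows are actually too short. The error means some row in the larger file has fewer than 29 columns, often a blank or truncated line. Below is a minimal sketch of that check. The helper name find_short_rows and the sample lines are made up for illustration, and it assumes a plain comma-separated file with one header row.

```python
import csv

REQUIRED_COLS = 29  # row[28] is the 29th column


def find_short_rows(lines, required_cols=REQUIRED_COLS):
    """Return (line_number, column_count) for every row that is too short.

    `lines` can be an open file or any iterable of strings.
    """
    reader = csv.reader(lines)
    next(reader)  # skip the header row
    short = []
    for row in reader:
        if len(row) < required_cols:
            # reader.line_num is the number of source lines read so far
            short.append((reader.line_num, len(row)))
    return short


# Hypothetical sample data: a header, one full row, a blank line, a truncated row.
sample = [
    ",".join("h%d" % i for i in range(31)),
    ",".join(["x"] * 31),
    "",
    ",".join(["x"] * 10),
]
print(find_short_rows(sample))
```

With the real file, the same function could be called as find_short_rows(open("lrgdata.csv")). If the list comes back containing only blank lines, skipping them with "continue" is probably safe. If it contains truncated records, the input file itself likely needs fixing.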

http://docs.python.org/2/tutorial/errors.html
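
On the second part of the question (correlation between two lists), here is a minimal sketch of the Pearson correlation coefficient in plain Python. The function name pearson is made up for illustration. Pearson correlation needs paired values, so cutting both lists to the shorter length is an assumption made here, since dataW and dataD will generally differ in length. Whether that pairing is meaningful for your data is something you would need to decide.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two numeric lists.

    Assumption: if the lists differ in length, both are cut to the
    shorter length so that the values can be paired up.
    """
    n = min(len(xs), len(ys))
    if n < 2:
        raise ValueError("need at least two paired values")
    xs, ys = xs[:n], ys[:n]
    mean_x = sum(xs) / float(n)
    mean_y = sum(ys) / float(n)
    # Sum of products of deviations, and sums of squared deviations
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant list")
    return cov / math.sqrt(var_x * var_y)


print(pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

In the question's code this would be called as pearson(dataW, dataD). Because both lists hold running totals, they only ever increase, so the coefficient will tend to come out close to 1 regardless of the underlying amounts. Correlating the individual amounts rather than the cumulative sums may be more informative. If NumPy is available, numpy.corrcoef computes the same coefficient for equal-length arrays.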

answered 2013-02-22T05:53:20.990