
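I wrote the following code to read data from a website and store it in a list. The code works, but it still raises a list index out of range error. Can anyone explain what I'm doing wrong?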

import urllib.request

data_url = "http://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data"
aboveFifty = 0
belowFifty = 0

""" The variables for storage """
age = 0
worksFor = ""
college = ""
salary = ""
bools = True

try:
    print("Retrieving the data... ")
    local_file, headers = urllib.request.urlretrieve(data_url)
    print("Data retrieved")
    fh = open(local_file, "r")
    print("Reading the file... ")

    for row in fh:
        table = [row.strip().split(" ")]
        salary = table[0][14]

        if bools == True:
            print("Table: ", table)
            bools = False

        if salary == "<=50K":
            belowFifty += 1
        elif salary == ">50K":
            aboveFifty += 1

except IOError as e:
    print("IO Error: ", e)
except IndexError as ie:
    print("Index error: ", ie)

print("Above fifty: ", aboveFifty, "Below fifty: ", belowFifty)
fh.close()

The traceback I get is:

Traceback (most recent call last):
  File "C:\Users\Killian\workspace\College\Assignment.py", line 25, in <module>
    salary = table[0][14]
IndexError: string index out of range

1 Answer


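Your data is malformed. Specifically, the data file ends with a blank line. For that line, row.strip().split(" ") returns [''], so there is no element at index 14 and the lookup fails. You can skip blank lines like this: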

for row in fh:
    # Test the raw line: [['']] is truthy, so `if not table` would never skip it.
    if not row.strip():
        continue    # <-- ignore blank lines
    table = [row.strip().split(" ")]
    salary = table[0][14]
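
For a slightly more defensive variant, here is a minimal sketch that skips any line with fewer than 15 fields, not just empty ones. It is not the original code: `count_salaries` is a hypothetical helper, the two sample rows are only illustrative rows in the file's format, and it splits on commas instead of spaces, which is a swapped-in choice so that fields come out without trailing commas.

```python
def count_salaries(lines):
    """Count rows by salary label, skipping blank or short lines.

    Hypothetical helper for illustration; not part of the original code.
    """
    above_fifty = 0
    below_fifty = 0
    for row in lines:
        # Split on commas and trim the whitespace around each field.
        fields = [field.strip() for field in row.split(",")]
        # A blank line yields [''], so it has fewer than 15 fields.
        if len(fields) < 15:
            continue
        salary = fields[14]
        if salary == "<=50K":
            below_fifty += 1
        elif salary == ">50K":
            above_fifty += 1
    return above_fifty, below_fifty


# Illustrative sample rows (assumed format: 15 comma-separated fields).
sample = [
    "39, State-gov, 77516, Bachelors, 13, Never-married, Adm-clerical, "
    "Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K\n",
    "52, Self-emp-not-inc, 209642, HS-grad, 9, Married-civ-spouse, "
    "Exec-managerial, Husband, White, Male, 0, 0, 45, United-States, >50K\n",
    "\n",  # trailing blank line, like the one at the end of the data file
]
print(count_salaries(sample))
```

Since the function accepts any iterable of lines, the open file handle from the question could be passed in directly, e.g. count_salaries(fh).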
answered 2013-10-18T17:19:12.793