
I have a list of data in the following format (there are many more lines; this is only part of it):

2   377222  TOYOTA MOTOR CORPORATION    TOYOTA  PASEO   1994    Y   19941226    N   0   0   PARKING BRAKE:CONVENTIONAL  SAN JOSE        CA  JT2EL45U5R0 19950103    19950103        1   PARKED ON FLAT SURFACE EMERGENCY BRAKING ENGAGED VEHICLE ROLLED REARWARD.  TT   EVOQ                                                                                                    V           
1   958164  TOYOTA MOTOR CORPORATION    TOYOTA  LAND CRUISER    1994        19941223    N   0   0   SERVICE BRAKES, HYDRAULIC:ANTILOCK  ARNOLD          CA  JT3DJ81W8R0 19950103    19950103            ABS SYSTEM FAILURE, AT 20MPH.  TT   EVOQ                                                                                                    V
46  958153  DAIMLERCHRYSLER CORPORATION DODGE   CARAVAN 1987        19940901    N   0   0   EQUIPMENT:MECHANICAL:CARRIER/RACK   CORBETT         OR  2B4FK4130HR 19950103    19950103        1   CABLE ATTACHMENT THAT SECURES THE SPARE TIRE BROKE WHILE DRIVING.  TT   EVOQ                                                                                                    V   
98  958178  GENERAL MOTORS CORP.    GMC SAFARI  1994        19941223    N   0   0   SERVICE BRAKES, HYDRAULIC:FOUNDATION COMPONENTS MILAN           MI  1GDDM19W4RB 19950103    19950103        1   BRAKES FAILED DUE TO BATTERY MALFUNCTIONING WHEN TOO MUCH POWER WAS DRAWN FROM BATTERY FOR RADIO.   TT  EVOQ                                                                                                    V   

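What is the best way to create a dictionary that uses the integer in the first column as the key and a tuple of other elements from the same line as the value? The desired output should look like this: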

function(filename)[2]
('TOYOTA MOTOR CORPORATION','19941226','SAN JOSE','CA')

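This is what I have so far. I am trying to get everything into a dictionary first, but it does not go through the whole list and only returns the elements of one line. What is wrong with my code? Or how can I at least complete the first step, putting everything into a dictionary?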

def function(filename):
    with open filename as FileObject:
        A=[]
        for lines in FileObject:
            B=[line.split("\t")[0]]
            A+=B
            C=[line.split("\t")[2]]
            A=A+B+C
            D=[line.split("\t")[12]]
            A=A+B+C+D
            E={A:(B,C,D)for A in A}
    return E
print function(filename)

2 Answers


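Each time through the loop you are creating a new dictionary (E={A:(B,C,D)for A in A}) instead of adding to an existing one. Declare your dictionary before entering the loop, and add one entry to it on each pass through the loop.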

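def create_database(f):
    """ Returns a populated dictionary.  Iterates over the input 'f'. """
    data = {}  # create the dictionary once, before the loop
    for line in f:
        # parse_line is a placeholder: it should return (key, tuple_of_values)
        key, datum = parse_line(line)
        data[key] = datum  # add one entry per line
    return data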
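
The parse_line helper above is left undefined. A minimal sketch of one way to fill it in follows, assuming the fields are separated by tabs, the key is the integer in column 0, and the wanted values sit in columns 2, 7 and 12. Those indices and the shortened sample rows are assumptions for illustration, so adjust them to the real file.

```python
def parse_line(line, value_columns=(2, 7, 12)):
    """Split one tab-separated line into (key, tuple_of_values).

    value_columns is an assumption; change it to the fields you need.
    """
    fields = line.rstrip("\n").split("\t")
    key = int(fields[0])  # the first column holds the integer key
    datum = tuple(fields[i] for i in value_columns)
    return key, datum


def create_database(f):
    """Build the dictionary once, then add one entry per non-blank line."""
    data = {}
    for line in f:
        if not line.strip():
            continue  # skip blank lines
        key, datum = parse_line(line)
        data[key] = datum
    return data


# Shortened rows modelled on the question's data, with explicit tab separators.
sample_lines = [
    "2\t377222\tTOYOTA MOTOR CORPORATION\tTOYOTA\tPASEO\t1994\tY\t19941226"
    "\tN\t0\t0\tPARKING BRAKE:CONVENTIONAL\tSAN JOSE\n",
    "1\t958164\tTOYOTA MOTOR CORPORATION\tTOYOTA\tLAND CRUISER\t1994\t\t19941223"
    "\tN\t0\t0\tSERVICE BRAKES, HYDRAULIC:ANTILOCK\tARNOLD\n",
]

database = create_database(sample_lines)
print(database[2])  # ('TOYOTA MOTOR CORPORATION', '19941226', 'SAN JOSE')
```

Any iterable of lines works here, including an open file object. If two lines share the same key, the later line overwrites the earlier one.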
answered 2012-10-29T19:51:16.523

By using the csv module (which can handle tab-delimited files), possibly with operator.itemgetter as a convenience function.

import csv
from operator import itemgetter

with open('yourfile') as fin:
    tabin = csv.reader(fin, delimiter='\t')
    # change itemgetter to include the relevant column indices
    your_dict = {int(row[0]): itemgetter(2, 12)(row) for row in tabin}

print(your_dict[2])
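
To try the same idea without a file on disk, here is a small self-contained sketch that feeds csv.reader from an in-memory string. The two shortened rows and the column indices 2, 7 and 12 are assumptions chosen for illustration, not the layout of the real file.

```python
import csv
import io
from operator import itemgetter

# Shortened rows modelled on the question's data, with explicit tab separators.
sample = (
    "2\t377222\tTOYOTA MOTOR CORPORATION\tTOYOTA\tPASEO\t1994\tY\t19941226"
    "\tN\t0\t0\tPARKING BRAKE:CONVENTIONAL\tSAN JOSE\n"
    "1\t958164\tTOYOTA MOTOR CORPORATION\tTOYOTA\tLAND CRUISER\t1994\t\t19941223"
    "\tN\t0\t0\tSERVICE BRAKES, HYDRAULIC:ANTILOCK\tARNOLD\n"
)

# With several indices, itemgetter returns a tuple of the selected items.
pick = itemgetter(2, 7, 12)

with io.StringIO(sample) as fin:
    tabin = csv.reader(fin, delimiter="\t")
    # "if row" skips blank lines, which csv.reader yields as empty lists.
    your_dict = {int(row[0]): pick(row) for row in tabin if row}

print(your_dict[1])  # ('TOYOTA MOTOR CORPORATION', '19941223', 'ARNOLD')
```

If the free-text fields can contain double quotes, passing quoting=csv.QUOTE_NONE to csv.reader is one option worth considering, so that quotes are treated as ordinary characters.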
answered 2012-10-29T19:56:23.267