
I open a text file and loop over all of its lines, putting each line into a custom dictionary.

def load(fileName):
    file = open(fileName+'.txt')
    for line in file:
        row = line.split()
        id = int(row[0])
        number = int(row[2])
        values = [int(row[3]),int(row[4]),int(row[5]),int(row[6])]
        dict = {number:[id, values]}
        print(dict)

I want to check whether the next line has a duplicate `number` value, and then group and sort by `id` accordingly.

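I'm sure a good solution would be to put all the dictionaries into a list and then operate on that list somehow, but I can't seem to get it to work; it just puts each `dict` into a separate list.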

How can I check for duplicates on each `line`, using something like `.nextLine()` or an `index=0` that is incremented on every iteration of `line in file`?

Example input:

1772 320 548 340 303 20 37 1
1772 320 551 337 306 22 37 1
1772 320 551 337 306 22 37 1
1772 320 551 337 306 22 37 1
1772 320 552 336 307 22 37 1
1772 320 553 335 308 22 37 1
1772 320 554 335 309 20 37 1
1783 345 438 31 436 40 36 1
1783 345 439 33 434 40 36 1
1783 345 440 35 432 40 36 1
1783 345 441 38 430 40 36 1
1783 345 442 39 431 40 36 1
1783 345 443 41 429 40 36 1
1783 345 444 44 428 40 36 1

Example output:

{548: [1772, [340, 303, 20, 37]]}
{551: [1772, [337, 306, 22, 37]]}
{551: [1772, [337, 306, 22, 37]]}
{551: [1772, [337, 306, 22, 37]]}
{552: [1772, [336, 307, 22, 37]]}
{553: [1772, [335, 308, 22, 37]]}
{554: [1772, [335, 309, 20, 37]]}
{438: [1783, [31, 436, 40, 36]]}
{439: [1783, [33, 434, 40, 36]]}
{440: [1783, [35, 432, 40, 36]]}
{441: [1783, [38, 430, 40, 36]]}
{442: [1783, [39, 431, 40, 36]]}
{443: [1783, [41, 429, 40, 36]]}
{444: [1783, [44, 428, 40, 36]]}

3 Answers


Just keep track of the numbers and IDs you have already seen in a dictionary. Since both have to match, you can group them into a tuple:

def load(fileName):
    dupes_dic = {}
    with open(fileName+'.txt') as file:
        for line in file:
            row = line.split()
            id = int(row[0])
            number = int(row[2])
            values = [int(row[3]),int(row[4]),int(row[5]),int(row[6])]
            # test membership with "in"; indexing a missing key raises KeyError
            if (number, id) in dupes_dic:
                # duplicate (number, id) pair: do some grouping or sorting or whatever
                pass
            else:
                dupes_dic[(number,id)] = values
    return dupes_dic
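
As a rough illustration of the tuple-key idea above, here is a minimal sketch that reports which (number, id) pairs occur more than once. The function name `find_duplicates` and the `SAMPLE` string are hypothetical, and the sketch reads from an in-memory string rather than a file so that it is self-contained.

```python
# SAMPLE is a hypothetical stand-in for the contents of the input file.
SAMPLE = """\
1772 320 548 340 303 20 37 1
1772 320 551 337 306 22 37 1
1772 320 551 337 306 22 37 1
1783 345 438 31 436 40 36 1
"""

def find_duplicates(lines):
    """Return the (number, id) pairs seen more than once, in first-seen order."""
    seen = set()
    dupes = []
    for line in lines:
        row = line.split()
        if len(row) < 7:
            continue  # skip blank or short lines
        key = (int(row[2]), int(row[0]))  # (number, id)
        if key in seen:
            if key not in dupes:
                dupes.append(key)
        else:
            seen.add(key)
    return dupes

print(find_duplicates(SAMPLE.splitlines()))  # [(551, 1772)]
```

A set is enough here because only the keys matter; a dictionary, as in the answer, is the better fit when the values also need to be kept.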

If you explain a bit more about what you want, I can add more to the answer.

Edit: the OP actually wants the items that share the same number, sorted by ID. In that case, this should work:

def load(fileName):
    dupes_dic = {}
    with open(fileName+'.txt') as file:
        for line in file:
            row = line.split()
            id = int(row[0])
            number = int(row[2])
            values = [int(row[3]),int(row[4]),int(row[5]),int(row[6])]
            if number in dupes_dic:
                dupes_dic[number][id] = values
            else:
                dupes_dic[number] = {id: values}
    # store the sorted IDs of each number under an 'index' key
    for number in dupes_dic:
        dupes_dic[number]['index'] = sorted(dupes_dic[number].keys())
    return dupes_dic

Then you just use each number's index to pull out that number's IDs and values in order, for example:

def getOrderedIds(number_dic):
    # 'index' is the sorted list of IDs; look each one up to get its values
    for id in number_dic['index']:
        print(id)
        print(number_dic[id])
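
For comparison, the group-by-number, sort-by-ID idea can also be sketched without a separate 'index' key. This is only a sketch under assumptions: `group_by_number` is a hypothetical name, the first row in `rows` is invented so that one number has two IDs, and unlike the nested dictionary above it keeps every row, including exact duplicates.

```python
from collections import defaultdict

def group_by_number(lines):
    """Map each number to a list of (id, values) pairs sorted by id."""
    groups = defaultdict(list)
    for line in lines:
        row = [int(x) for x in line.split()]
        if len(row) < 7:
            continue  # skip blank or short lines
        groups[row[2]].append((row[0], row[3:7]))
    # sorting tuples orders by id first, then by the values list
    return {number: sorted(items) for number, items in groups.items()}

# Hypothetical rows: the first one is made up so that number 551 has two IDs.
rows = [
    "1783 345 551 31 436 40 36 1",
    "1772 320 551 337 306 22 37 1",
    "1772 320 548 340 303 20 37 1",
]
result = group_by_number(rows)
print(result[551])  # [(1772, [337, 306, 22, 37]), (1783, [31, 436, 40, 36])]
```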
answered 2013-05-30T22:15:00.983
d = dict()
with open ("input") as f:
    for line in f:
        line = line.rstrip(" \n")
        row = line.split()
        if len(row) < 7: continue
        idx = int(row[0])
        number = int(row[2])
        values = [int(row[3]),int(row[4]),int(row[5]),int(row[6])]
        key = str(number) + ":" + str(idx)

        # add values corresponding to same number, idx pairs to ...
        # a list referenced by d[number:idx]

        if key not in d: d[key] = []
        d[key].append(values)

for key in d:
    n,i = key.split(":")
    # print out rows with number n and idx i
    for row in d[key]:
        print("%s %s %s" % (n, i, ",".join(str(x) for x in row)))

Output:

551 1772 337,306,22,37
551 1772 337,306,22,37
551 1772 337,306,22,37
553 1772 335,308,22,37
552 1772 336,307,22,37
548 1772 340,303,20,37
554 1772 335,309,20,37
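
The groups above come out in no particular order because a plain dictionary did not guarantee iteration order before Python 3.7. A minimal sketch of one way to get a deterministic order, assuming tuple keys in place of the "number:idx" strings and the hypothetical helper names `group_rows` and `format_groups`:

```python
def group_rows(lines):
    """Collect the value lists of each (number, idx) pair."""
    groups = {}
    for line in lines:
        row = line.split()
        if len(row) < 7:
            continue  # skip blank or short lines
        key = (int(row[2]), int(row[0]))  # tuple key: no string splitting needed later
        groups.setdefault(key, []).append([int(x) for x in row[3:7]])
    return groups

def format_groups(groups):
    """Return one output line per row, ordered by number and then idx."""
    out = []
    for number, idx in sorted(groups):
        for values in groups[(number, idx)]:
            out.append("%d %d %s" % (number, idx, ",".join(str(v) for v in values)))
    return out

sample = [
    "1772 320 551 337 306 22 37 1",
    "1772 320 548 340 303 20 37 1",
]
for text in format_groups(group_rows(sample)):
    print(text)
```

Integer tuples also sort numerically, whereas the string keys would sort lexicographically.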
answered 2013-05-31T00:03:57.927
from collections import OrderedDict as od
with open("abc") as f:
    dic = od()
    for line in f:
        # list() so the row can also be indexed on Python 3, where map() is lazy
        row = list(map(int, line.split()))
        idx, num = row[2], row[0]
        val = [num] + [row[3:-1]]
        dic.setdefault(idx,[]).append(val)

for k,v in dic.items():
    for val in v:
        print("%s %s" % (k, val))

Output:

548 [1772, [340, 303, 20, 37]]
551 [1772, [337, 306, 22, 37]]
551 [1772, [337, 306, 22, 37]]
551 [1772, [337, 306, 22, 37]]
552 [1772, [336, 307, 22, 37]]
553 [1772, [335, 308, 22, 37]]
554 [1772, [335, 309, 20, 37]]
438 [1783, [31, 436, 40, 36]]
439 [1783, [33, 434, 40, 36]]
440 [1783, [35, 432, 40, 36]]
441 [1783, [38, 430, 40, 36]]
442 [1783, [39, 431, 40, 36]]
443 [1783, [41, 429, 40, 36]]
444 [1783, [44, 428, 40, 36]]
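
If the duplicates are always on adjacent lines, as in the sample input, the "compare with the next line" check from the question can be sketched with `itertools.groupby`, which groups consecutive equal items. The names `parse` and `collapse_consecutive` are hypothetical, and this is one possible approach rather than part of the answer above.

```python
from itertools import groupby

def parse(line):
    """Turn one input line into (number, id, values)."""
    row = [int(x) for x in line.split()]
    return row[2], row[0], row[3:7]

def collapse_consecutive(lines):
    """Yield (number, id, values, count) for each run of identical adjacent records."""
    records = (parse(line) for line in lines if line.strip())
    for record, run in groupby(records):
        number, idx, values = record
        yield number, idx, values, sum(1 for _ in run)

sample = [
    "1772 320 548 340 303 20 37 1",
    "1772 320 551 337 306 22 37 1",
    "1772 320 551 337 306 22 37 1",
    "1772 320 551 337 306 22 37 1",
    "1772 320 552 336 307 22 37 1",
]
for item in collapse_consecutive(sample):
    print(item)
```

Note that `groupby` only merges neighbours, so duplicates separated by other lines would be reported as separate runs.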
answered 2013-05-30T22:30:09.557