python - Matching one set of values with another set of values in a text file

Question

I have a text file containing the following information:

1961 - Roger (Male)
1962 - Roger (Male)
1963 - Roger (Male)
1963 - Jessica (Female)
1964 - Jessica (Female)
1965 - Jessica (Female)
1966 - Jessica (Female)

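If I search the file for the word "Roger", I want it to print the years that go with that name, i.e. 1961, 1962, 1963. What is the best way to approach this?

I did it with a dictionary, but then realized that dictionaries cannot have duplicate values, and 1963 is mentioned twice in the text file, so it did not work.

I'm using Python 3. Thanks.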

score 2 · Accepted Answer

Use a dictionary with the names as keys and store the years in a list:

In [1]: with open("data1.txt") as f:
   ...:     dic = {}
   ...:     for line in f:
   ...:         spl = line.split()
   ...:         dic.setdefault(spl[2], []).append(int(spl[0]))
   ...:     for name in dic:
   ...:         print(name, dic[name])
   ...:

Roger [1961, 1962, 1963]
Jessica [1963, 1964, 1965, 1966]

Alternatively, you can use collections.defaultdict:

In [2]: from collections import defaultdict

In [3]: with open("data1.txt") as f:
   ...:     dic = defaultdict(list)
   ...:     for line in f:
   ...:         spl = line.split()
   ...:         dic[spl[2]].append(int(spl[0]))
   ...:     for name in dic:
   ...:         print(name, dic[name])
   ...:
Roger [1961, 1962, 1963]
Jessica [1963, 1964, 1965, 1966]
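To print only the years for one name, as the question asks, the dictionary can be looked up by that name. This is a minimal sketch, assuming the same line format; the `years_for` helper and the in-memory `sample` list are hypothetical stand-ins for reading data1.txt:

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for the lines of data1.txt.
sample = [
    "1961 - Roger (Male)",
    "1962 - Roger (Male)",
    "1963 - Roger (Male)",
    "1963 - Jessica (Female)",
    "1964 - Jessica (Female)",
    "1965 - Jessica (Female)",
    "1966 - Jessica (Female)",
]


def years_for(lines, name):
    """Return the years listed for `name`, in file order."""
    years_by_name = defaultdict(list)
    for line in lines:
        parts = line.split()  # e.g. ['1961', '-', 'Roger', '(Male)']
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        years_by_name[parts[2]].append(int(parts[0]))
    # .get avoids creating an empty entry for an unknown name
    return years_by_name.get(name, [])


print(*years_for(sample, "Roger"), sep=", ")  # 1961, 1962, 1963
```

With a real file, passing the open file object in place of `sample` should behave the same way, since iterating over a file yields its lines.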

score 0 · Accepted Answer

Why can't you use a dict keyed on the name (e.g. Roger), with a list of years as the value (here [1961, 1962, 1963])? Would that not work for you?

At the end of the loop, each unique name maps to its list of years, which is what you seem to want.
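The difference between the two ways of keying the dictionary can be sketched as follows. This is an illustration only: it assumes the question's attempt used the year as the key, and `lines`, `by_year`, and `by_name` are hypothetical names.

```python
# Hypothetical in-memory stand-in for the lines of the text file.
lines = [
    "1961 - Roger (Male)",
    "1962 - Roger (Male)",
    "1963 - Roger (Male)",
    "1963 - Jessica (Female)",
    "1964 - Jessica (Female)",
]

by_year = {}  # year -> name: each key can hold only one value
by_name = {}  # name -> list of years

for line in lines:
    parts = line.split()  # e.g. ['1961', '-', 'Roger', '(Male)']
    year, name = int(parts[0]), parts[2]
    by_year[year] = name  # the second 1963 line replaces the first
    by_name.setdefault(name, []).append(year)

print(by_year[1963])     # Jessica (Roger's 1963 entry was overwritten)
print(by_name["Roger"])  # [1961, 1962, 1963]
```

Keyed by year, the repeated 1963 collides; keyed by name, the repeated year is simply appended to two different lists.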

score 0 · Accepted Answer

As I suggested in the comments:

from collections import defaultdict

result = defaultdict(list)
with open('data.txt', 'rt') as f:  # avoid shadowing the built-in input()
    for line in f:
        year, person = [item.strip() for item in line.split('-')]
        result[person].append(year)

for person, years in result.items():
    print(person, years, sep=': ')

Output:

Roger (Male): ['1961', '1962', '1963']
Jessica (Female): ['1963', '1964', '1965', '1966']

score 0 · Accepted Answer

Use tuples. They can be stored in a list and iterated over.

Suppose your list looks like this:

data = [(1961, 'Rodger', 'Male'),
        (1962, 'Rodger', 'Male'),
        (1963, 'Rodger', 'Male'),
        (1963, 'Jessica', 'Female')]

You can run queries on it like this:

# Just items where the name is Rodger
[(y, n, s) for y, n, s in data if n == "Rodger"]

# Just the year 1963
[(y, n, s) for y, n, s in data if y == 1963]

Or, in more Pythonic code:

for year, name, sex in data:
    if year >= 1962:
        print("In {}, {} was {}".format(year, name, sex))

In 1962, Rodger was Male
In 1963, Rodger was Male
In 1963, Jessica was Female
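The list above is written out by hand. One way to build it from the lines of the text file is sketched below, assuming every line has the form `year - name (sex)`; `LINE_RE` and `parse_lines` are hypothetical names that do not come from the answer.

```python
import re

# Assumed line format: "<year> - <name> (<sex>)", e.g. "1961 - Roger (Male)".
LINE_RE = re.compile(r"(\d+)\s*-\s*(\w+)\s*\((\w+)\)")


def parse_lines(lines):
    """Turn text lines into a list of (year, name, sex) tuples."""
    data = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:  # silently skip blank or malformed lines
            year, name, sex = match.groups()
            data.append((int(year), name, sex))
    return data


data = parse_lines(["1961 - Roger (Male)", "", "1963 - Jessica (Female)"])
print(data)  # [(1961, 'Roger', 'Male'), (1963, 'Jessica', 'Female')]
```

Passing an open file object to `parse_lines` should work as well, since iterating over a file yields its lines.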

score 0 · Accepted Answer

You can always use a regular expression.

import re

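name = 'Roger'

# Open the file in a with-block so it is closed automatically.
with open('names.txt') as f:
    for line in f:
        # re.escape guards against regex metacharacters in the name.
        match = re.search(r'([0-9]+) - %s' % re.escape(name), line)
        if match:
            print(match.group(1))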
