2

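I am trying to write a program for school. I am a biotechnology major and this is a required course, but I am not a programmer, so what is easy for many people is hard for me. I have a text file of about 30 lines. Each line lists a movie title first, followed by the actors who appear in the movie, separated by commas. This is what I have so far: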

InputName = input('What is the name of the file? ')
File = open(InputName, 'r+').readlines()


ActorLst = []
for line in File:
    MovieActLst = line.split(',')   


    Movie = MovieActLst[0]        
    Actors = MovieActLst[1:]      
    for actor in Actors:
        if actor not in ActorLst:
            ActorLst.append(actor)

    MovieDict = {Movie: Actors for x in MovieActLst} 
    print (MovieDict)
    print(len(MovieDict))

Output (shortened):

What is the name of the file? Movies.txt
{"Ocean's Eleven": ['George Clooney', 'Brad Pitt', 'Elliot Gould', 'Casey Affleck', 'Carl Reiner', 'Julia Roberts', 'Angie Dickinson', 'Steve Lawrence', 'Wayne Newton\n']}
1
{'Up in the Air': ['George Clooney', 'Sam Elliott', 'Jason Bateman\n']}
1
{'Iron Man': ['Robert Downey Jr', 'Jeff Bridges', 'Gwyneth Paltrow\n']}
1
{'The Big Lebowski': ['Jeff Bridges', 'John Goodman', 'Julianne Moore', 'Sam Elliott\n']}
1

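I created a dictionary (MovieDict) with the movie titles as keys and the lists of actors as values. There are about 30 movie titles (keys). I need to figure out how to iterate over this dictionary to essentially reverse it: I want a dictionary with each actor as a key and the movies they appeared in as the value.

However, I think I have also created a list of dictionaries rather than a single dictionary, and now I have really confused myself! Any suggestions?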

4

5 Answers

5

This is trivial using collections.defaultdict:

from collections import defaultdict
reverse = defaultdict(list)

for movie, actors in MovieDict.items():
    for actor in actors:
        reverse[actor].append(movie)

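The defaultdict class differs from dict in that, when you try to access a key that does not exist, it creates the key and sets its value to an item built by the factory passed to the constructor (list in the code above). This avoids having to catch KeyError or check whether the key is already in the dictionary.

Combining this with Steven Rumbalski's loop gives: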
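
To make the difference concrete, here is a minimal sketch that builds the same reversed mapping twice, once with a plain dict and once with defaultdict(list). The movie_to_actors sample is made up for illustration (the titles and names are borrowed from the question's output); it is not the asker's real file.

```python
from collections import defaultdict

# Sample data for illustration only (names borrowed from the question's output).
movie_to_actors = {
    "Up in the Air": ["George Clooney", "Sam Elliott", "Jason Bateman"],
    "The Big Lebowski": ["Jeff Bridges", "John Goodman", "Sam Elliott"],
}

# Plain dict: every new key needs an explicit check before appending.
plain = {}
for movie, actors in movie_to_actors.items():
    for actor in actors:
        if actor not in plain:
            plain[actor] = []
        plain[actor].append(movie)

# defaultdict(list): a missing key gets an empty list on first access.
auto = defaultdict(list)
for movie, actors in movie_to_actors.items():
    for actor in actors:
        auto[actor].append(movie)

print(auto["Sam Elliott"])
```

Both loops produce the same mapping; the defaultdict version simply drops the membership check.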

from collections import defaultdict
in_fname = input('What is the name of the file? ')
in_file = open(in_fname, 'r')  # read-only access is enough here

movie_to_actors = {}
actors_to_movie = defaultdict(list)

for line in in_file:
    #assumes python3:
    movie, *actors = line.strip().split(',')
    #python2 you can do actors=line.strip().split(',');movie=actors.pop(0)

    movie_to_actors[movie] = list(actors)
    for actor in actors:
        actors_to_movie[actor].append(movie)

Some explanation of the code above.

Iterating over the lines of a file

File objects are iterable, which means you can write:

for line in open('filename'):

instead of:

for line in open('filename').readlines():

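(Also, the latter reads the entire file and splits it into a list of lines up front, in python2 as well as python3, while iterating over the file does not read the whole file into memory, so you can save a lot of RAM with large files.)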
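
As a rough sketch of the same idea with the file closed automatically, the loop can be wrapped in a with statement. The temporary file and its two lines are made up so that the example runs on its own; they are not part of the original answer.

```python
import os
import tempfile

# Create a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("Iron Man,Robert Downey Jr,Jeff Bridges\n")
    tmp.write("Up in the Air,George Clooney,Sam Elliott\n")
    path = tmp.name

titles = []
# Lines are read one at a time; the with-block closes the file afterwards.
with open(path) as fp:
    for line in fp:
        titles.append(line.strip().split(",")[0])

os.remove(path)  # clean up the throwaway file
print(titles)
```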

Tuple unpacking

To "unpack" a sequence into different variables you can use the tuple-unpacking syntax:

>>> a,b = (0,1)
>>> a
0
>>> b
1

In Python 3 the syntax was extended to allow collecting a variable number of values into a single variable. For example:

>>> head, *tail = (1, 2, 3, 4, 5)
>>> head
1
>>> tail
[2, 3, 4, 5]
>>> first, *mid, last = (0, 1, 2, 3, 4, 5)
>>> first
0
>>> mid
[1, 2, 3, 4]
>>> last
5

You can have only one starred expression, so this does not work:

>>> first, *mid, center, *mid2, last  =(0,1,2,3,4,5)
  File "<stdin>", line 1
SyntaxError: two starred expressions in assignment

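So basically, when you have a star on the left-hand side, Python puts there everything it could not put into the other variables. Note that this means the starred variable may refer to an empty list: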

>>> first, *mid, last = (0,1)
>>> first
0
>>> mid
[]
>>> last
1
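
Applied to the movie file, this might look like the sketch below. The parse_line helper and the "Some Title" line are hypothetical and only serve the illustration; the point is that a line with a title and no actors still unpacks cleanly into an empty list.

```python
def parse_line(line):
    """Split one 'movie,actor,actor,...' line (hypothetical helper)."""
    # The first field goes to movie; the star collects the rest into a list.
    movie, *actors = line.strip().split(",")
    return movie, actors

with_cast = parse_line("Iron Man,Robert Downey Jr,Jeff Bridges,Gwyneth Paltrow\n")
no_cast = parse_line("Some Title\n")

print(with_cast)
print(no_cast)
```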

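Using defaultdict

A defaultdict lets you provide a default value for keys that do not exist. The class takes a callable (~function or class) as its argument and calls it to build a default value each time one is needed: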

>>> def factory():
...     print("Called!")
...     return None
... 
>>> mydict = defaultdict(factory)
>>> mydict['test']
Called!
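
A related detail worth knowing: the factory runs only when a missing key is looked up with square brackets, and that lookup also inserts the key. The sketch below (the calls list and the key names are made up for illustration) records how often the factory runs.

```python
from collections import defaultdict

calls = []  # records every time the factory runs

def factory():
    calls.append("called")
    return []

d = defaultdict(factory)
d["a"].append(1)      # "a" is missing: the factory runs and the key is inserted
d["a"].append(2)      # "a" now exists: the factory is not called again
found = d.get("b")    # .get() never triggers the factory
present = "b" in d    # membership tests do not insert keys either

print(len(calls), dict(d), found, present)
```

So reading d[key] just to inspect a value can silently add keys; use .get() or in when you only want to look.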
answered 2012-10-27T16:34:58.393
1

Programming is about abstraction, so try to write code in a way that does not depend on the specific problem. For example:

def csv_to_dict(seq, separator=','):
    dct = {}
    for item in seq:
        data = [x.strip() for x in item.split(separator)]
        if len(data) > 1:
            dct[data[0]] = data[1:]
    return dct

def flip_dict(dct):
    rev = {}
    for key, vals in dct.items():
        for val in vals:
            if val not in rev:
                rev[val] = []
            rev[val].append(key)
    return rev

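Note that these two functions "know nothing" about "input files", "actors", "movies" and so on, yet they can still solve your problem in two lines of code: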

with open("movies.txt") as fp:
    print(flip_dict(csv_to_dict(fp)))
answered 2012-10-27T17:09:29.427
1
InputName = input('What is the name of the file? ')
with open(InputName, 'r') as f:
    actors_by_movie = {}
    movies_by_actor = {}
    for line in f:
        movie, *actors = line.strip().split(',')
        actors_by_movie[movie] = actors
        for actor in actors:
            movies_by_actor.setdefault(actor, []).append(movie)
answered 2012-10-27T16:27:23.117
1
reverse = {}
for movie in MovieDict.keys():
    actors = MovieDict[movie]
    for actor in actors:
        try:
            # list.append returns None, so mutate in place instead of reassigning
            reverse[actor].append(movie)
        except KeyError:
            # first time this actor is seen: start a new list of movies
            reverse[actor] = [movie]
print(reverse)

That should do it.

answered 2012-10-27T15:50:23.530
0

Following your naming conventions:

from collections import defaultdict

InputName = input('What is the name of the file? ')
File = open(InputName, 'rt').readlines()

ActorLst = []
ActMovieDct = defaultdict(list)
for line in File:
    MovieActLst = line.strip().split(',')
    Movie = MovieActLst[0]
    Actors = MovieActLst[1:]
    for actor in Actors:
        ActMovieDct[actor].append(Movie)

# print results    
for actor, movies in ActMovieDct.items():
    print(actor, movies)
answered 2012-10-27T17:10:01.910