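My answer only returns a dictionary of 'title': ['offender1', ...] pairs for movies that have been seen more than once, i.e. 'E.T. the Extra-Terrestrial (1982)' will not be reported but 'Return of the Jedi (1983)' will. This can be changed by simply returning overlaps from the solution instead of the result of the dictionary comprehension.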
Where d is:
d = {
    '25': {'Return of the Jedi (1983)': 5.0},
    '42': {'Batman (1989)': 3.0, 'E.T. the Extra-Terrestrial (1982)': 5.0},
    '8': {'Return of the Jedi (1983)': 5.0},
    '542': {'Alice in Wonderland (1951)': 3.0, 'Blade Runner (1982)': 4.0},
    '7': {'Alice in Wonderland (1951)': 3.0, 'Blade Runner (1982)': 4.0},
}
The following:
from collections import defaultdict

def findOverlaps(d):
    overlaps = defaultdict(list)
    # children is the dictionary of movie_title: rating pairs
    for parentKey, children in d.items():
        # only the titles matter here, not the ratings, so iterate over the keys
        for childKey in children:
            # record the parent id the movie title came from
            overlaps[childKey].append(parentKey)
    # keep only the titles that have more than one id associated with them
    return {overlap: offenders for overlap, offenders in overlaps.items() if len(offenders) > 1}

print(findOverlaps(d))
Yields (the order of the keys and of the ids may differ depending on the Python version):
>>>
{'Blade Runner (1982)': ['7', '542'], 'Return of the Jedi (1983)': ['25', '8'], 'Alice in Wonderland (1951)': ['7', '542']}
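As noted at the top, returning overlaps directly reports every title, including those rated under a single id. A minimal sketch of that variant follows; the name find_all_viewers is hypothetical and not part of the answer, and the data is the same d as above.

```python
from collections import defaultdict

def find_all_viewers(d):
    """Map every movie title to the list of ids that rated it, without filtering."""
    viewers = defaultdict(list)
    for parent_key, children in d.items():
        for title in children:
            viewers[title].append(parent_key)
    # Convert to a plain dict so missing titles raise KeyError instead of
    # silently creating empty lists.
    return dict(viewers)

d = {
    '25': {'Return of the Jedi (1983)': 5.0},
    '42': {'Batman (1989)': 3.0, 'E.T. the Extra-Terrestrial (1982)': 5.0},
    '8': {'Return of the Jedi (1983)': 5.0},
    '542': {'Alice in Wonderland (1951)': 3.0, 'Blade Runner (1982)': 4.0},
    '7': {'Alice in Wonderland (1951)': 3.0, 'Blade Runner (1982)': 4.0},
}

result = find_all_viewers(d)
# A title seen under only one id is now included as well.
print(result['E.T. the Extra-Terrestrial (1982)'])
```

Filtering on len(...) > 1 afterwards gives back the same titles as findOverlaps.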
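The reasoning behind the code:
Each entry in d represents id: {movie_title1: rating, movie_title2: rating}. Now say movie_title1 occurs in the values associated with two or more separate id keys. We would like to store:
- the movie_title of the movie that has been watched two or more times.
- the ids of the keys associated with the values in which the movie was watched.
Therefore we would like a resulting dictionary like this:
{ movie_title1: {'id1','id2'}, movie_title2: {'id2','id5'} }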
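The target shape above is written with sets of ids, whereas findOverlaps collects them in lists. If sets are actually wanted (so that membership tests are cheap and duplicates are impossible), one possible sketch is below. The function name find_overlaps_as_sets and the sample data are made up for illustration only.

```python
from collections import defaultdict

def find_overlaps_as_sets(d):
    """Return {title: {ids}} for titles that appear under more than one id."""
    viewers = defaultdict(set)
    for user_id, ratings in d.items():
        for title in ratings:
            viewers[title].add(user_id)
    return {title: ids for title, ids in viewers.items() if len(ids) > 1}

# Hypothetical sample data mirroring the placeholder names used above.
sample = {
    'id1': {'movie_title1': 5.0},
    'id2': {'movie_title1': 4.0, 'movie_title2': 3.0},
    'id5': {'movie_title2': 2.0, 'movie_title3': 1.0},
}

overlaps = find_overlaps_as_sets(sample)
print(overlaps)
```

movie_title3 is dropped because only id5 rated it.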