I have a variable 'actorslist' that prints 100 lines (one line per movie):
[u'Tim Robbins', u'Morgan Freeman', u'Bob Gunton', u'William Sadler']
[u'Christian Bale', u'Heath Ledger', u'Aaron Eckhart', u'Michael Caine']
etc.
Then I have:
pairslist = list(itertools.permutations(actorslist, 2))
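This gives me pairs of actors, but only within one movie; after a new line it moves on to the next movie. How can I get it to output all the actor pairs from all the movies in one big array? The idea is that two actors who appeared in a movie together should get a pydot edge.
I wrote the following. It successfully writes a dot file, but the file does not contain the correct data.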
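A minimal sketch of one way to accumulate pairs across movies. It assumes the per-movie lists have first been collected into a single list of lists (called movie_actors here, with two sample rows standing in for the real 100). It uses itertools.combinations rather than permutations, so each unordered pair is produced only once per movie:

```python
import itertools

# Sample data standing in for the real rows (assumption: one list per movie).
movie_actors = [
    ['Tim Robbins', 'Morgan Freeman', 'Bob Gunton', 'William Sadler'],
    ['Christian Bale', 'Heath Ledger', 'Aaron Eckhart', 'Michael Caine'],
]

pairslist = []
for cast in movie_actors:
    # combinations() yields every unordered pair within one cast exactly once,
    # so (A, B) appears but (B, A) does not.
    pairslist.extend(itertools.combinations(cast, 2))

print(len(pairslist))
print(pairslist[0])
```

Each cast of four contributes six pairs, and the pairs from every movie end up in the same flat list.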
graph = pydot.Dot(graph_type='graph', charset="utf8")
for i in pairslist:
    edge = pydot.Edge(i[0], i[1])
    graph.add_edge(edge)
graph.write('dotfile.dot')
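My expected output should look like the following. In a dot file (A,B) is the same as (B,A), so the reversed pair should not appear in the output: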
"Tim Robbins" -- "Morgan Freeman";
"Tim Robbins" -- "Bob Gunton";
"Tim Robbins" -- "William Sadler";
"Morgan Freeman" -- "Bob Gunton";
"Morgan Freeman" -- "William Sadler";
"Bob Gunton" -- "William Sadler";
"Christian Bale" -- "Heath Ledger";
"Christian Bale" -- "Aaron Eckhart";
"Christian Bale" -- "Michael Caine";
"Heath Ledger" -- "Aaron Eckhart";
"Heath Ledger" -- "Michael Caine";
"Aaron Eckhart" -- "Michael Caine";
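To illustrate the (A,B) equals (B,A) rule, here is a rough sketch that builds edge lines in the format above using only the standard library. The helper names cast_edges and to_dot are hypothetical, the sample casts are made up to force a repeated pair, and merging a pair that recurs across movies into a single edge is an assumption about the desired result:

```python
import itertools

def cast_edges(movie_actors):
    """Return unique unordered actor pairs, in first-seen order."""
    seen = set()
    edges = []
    for cast in movie_actors:
        for a, b in itertools.combinations(cast, 2):
            key = frozenset((a, b))  # order-insensitive key: (A,B) == (B,A)
            if key not in seen:
                seen.add(key)
                edges.append((a, b))
    return edges

def to_dot(edges):
    """Format pairs as an undirected dot graph (names are not escaped here)."""
    lines = ['graph G {']
    lines.extend('"%s" -- "%s";' % (a, b) for a, b in edges)
    lines.append('}')
    return '\n'.join(lines)

# Made-up casts for illustration; the second repeats a pair in reverse order.
sample = [
    ['Tim Robbins', 'Morgan Freeman', 'Bob Gunton'],
    ['Morgan Freeman', 'Tim Robbins', 'Christian Bale'],
]
print(to_dot(cast_edges(sample)))
```

The same list of pairs could instead be passed to pydot.Edge one pair at a time, as in the loop shown earlier.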
Additional information:
Some people were interested in how the actorslist variable is created:
import codecs
import json

file = open('input.txt','rU') ###input is JSON data on each line{"Title":"Shawshank...
nfile = codecs.open('output.txt','w','utf-8')
movie_actors = []
for line in file:
    line = line.rstrip()
    movie = json.loads(line)
    l = []
    title = movie['Title']
    actors = movie['Actors']
    tempactorslist = actors.split(',')
    actorslist = []
    # strip surrounding whitespace from each actor name
    for actor in tempactorslist:
        actor = actor.strip()
        actorslist.append(actor)
    l.append(title)
    l.append(actorslist)
    # one output row per movie: title, a tab, then the actors as a JSON list
    row = l[0] + '\t' + json.dumps(l[1]) + '\n'
    nfile.writelines(row)
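Note that in the loop above actorslist is rebuilt for every line, so after the loop it only holds the last movie's actors, and movie_actors is never filled. A possible sketch of keeping every cast is shown below. The function name read_casts and the sample lines are hypothetical, and the only thing assumed about each JSON line is an "Actors" key holding a comma-separated string, as in the code above:

```python
import json

def read_casts(lines):
    """Parse one JSON object per line and return a list of actor lists."""
    casts = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        movie = json.loads(line)
        # split the comma-separated string and trim each name
        casts.append([name.strip() for name in movie['Actors'].split(',')])
    return casts

# Hypothetical sample lines standing in for input.txt.
sample_lines = [
    '{"Title": "Movie A", "Actors": "Tim Robbins, Morgan Freeman"}',
    '{"Title": "Movie B", "Actors": "Christian Bale, Heath Ledger"}',
]
movie_actors = read_casts(sample_lines)
print(movie_actors)
```

An open file object can be passed in place of sample_lines, since iterating over a file also yields one line at a time.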