Try a nested list comprehension:
counts = [(w, i, w.count(i)) for w in word for i in matched_word]
You will get a list of tuples like this:
[('General William Shelton, said the system', 'will', 0),
('General William Shelton, said the system', 'and', 0),
('General William Shelton, said the system', 'in', 0),
('General William Shelton, said the system', 'the', 1),
('General William Shelton, said the system', 'a', 3),
('General William Shelton, said the system', 'A', 0),
('which will provide more precise positional data', 'will', 1),
('which will provide more precise positional data', 'and', 0),
('which will provide more precise positional data', 'in', 0),
('which will provide more precise positional data', 'the', 0),
('which will provide more precise positional data', 'a', 3),
('which will provide more precise positional data', 'A', 0),
('and that newer technology will provide more', 'will', 1),
('and that newer technology will provide more', 'and', 1),
('and that newer technology will provide more', 'in', 0),
('and that newer technology will provide more', 'the', 0),
('and that newer technology will provide more', 'a', 2),
('and that newer technology will provide more', 'A', 0),
('Commander of the Air Force Space Command', 'will', 0),
('Commander of the Air Force Space Command', 'and', 2),
('Commander of the Air Force Space Command', 'in', 0),
('Commander of the Air Force Space Command', 'the', 1),
('Commander of the Air Force Space Command', 'a', 3),
('Commander of the Air Force Space Command', 'A', 1),
('objects and would become the most accurate metadata', 'will', 0),
('objects and would become the most accurate metadata', 'and', 1),
('objects and would become the most accurate metadata', 'in', 0),
('objects and would become the most accurate metadata', 'the', 1),
('objects and would become the most accurate metadata', 'a', 6),
('objects and would become the most accurate metadata', 'A', 0)]
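Note that str.count counts substring occurrences, which is why 'a' scores 3 for the first sentence even though it never appears there as a separate word. If whole-word matches are what you want, here is a minimal sketch, assuming whitespace tokenization and stripping of surrounding punctuation (the helper name count_whole_words is hypothetical, not part of the original answer):

```python
import string

def count_whole_words(sentence, target):
    """Count tokens equal to target after stripping surrounding punctuation."""
    tokens = (tok.strip(string.punctuation) for tok in sentence.split())
    return sum(1 for tok in tokens if tok == target)

sentence = 'General William Shelton, said the system'

# Substring matching: counts the 'a' inside "General", "William" and "said".
substring_hits = sentence.count('a')

# Whole-word matching: only counts tokens that are exactly 'a'.
whole_word_hits = count_whole_words(sentence, 'a')

print(substring_hits, whole_word_hits)
```

Swapping w.count(i) for count_whole_words(w, i) in the comprehension would be one way to apply it. The comparison is case-sensitive, just like str.count.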
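Then you can use groupby from itertools:
from itertools import groupby
# groupby only merges consecutive items; counts is already ordered by sentence
groupped = groupby(counts, lambda i: i[0])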
Finally:
for category, items in groupped:
    print(category)
    print("\n".join(":".join(map(str, j[1:])) for j in items))
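Putting the steps together, here is a self-contained sketch you can run as is. The names word and matched_word come from the snippet above, but their contents here are assumed sample values taken from the output shown earlier, and the report list is only there for illustration:

```python
from itertools import groupby

# Assumed sample inputs, copied from the output shown earlier.
word = ['which will provide more precise positional data',
        'and that newer technology will provide more']
matched_word = ['will', 'and', 'the']

# One (sentence, term, count) tuple per sentence/term pair.
counts = [(w, i, w.count(i)) for w in word for i in matched_word]

# groupby only merges consecutive items with equal keys. counts is already
# ordered by sentence because the sentence is the outer loop above.
report = []
for category, items in groupby(counts, key=lambda t: t[0]):
    report.append(category)
    report.extend("{}:{}".format(term, n) for _, term, n in items)

print("\n".join(report))
```

If the tuples were not already ordered by sentence, you would need to sort them by the same key before calling groupby.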