I have a list of tuples (shown below), and I need to group them by the first item of each tuple. The result should be a list of (word, list(numbers)) tuples.

In [351]: word_docid_pairs
Out[351]: 
[('bear', 1),
('is', 1),
('in', 1),
('gugledarc', 1),
('the', 1),
('sdpij', 2),
('emdf', 2),
('sai', 2),
('sd', 3),
('fuggle', 4),
('in', 4),
('gugledarc', 4),
('df', 4)]

1 Answer

Python 2.7.3 (default, Sep 26 2012, 21:51:14) 
>>> ll = [('bear', 1),
... ('is', 1),
... ('in', 1),
... ('gugledarc', 1),
... ('the', 1),
... ('sdpij', 2),
... ('emdf', 2),
... ('sai', 2),
... ('sd', 3),
... ('fuggle', 4),
... ('in', 4),
... ('gugledarc', 4),
... ('df', 4)]
>>> dd = {}
>>> for key, value in ll:
...     dd.setdefault(key, []).append(value)
... 
>>> dd.items()
[('sai', [2]), ('emdf', [2]), ('df', [4]), ('is', [1]), ('bear', [1]), ('gugledarc', [1, 4]), ('in', [1, 4]), ('the', [1]), ('sdpij', [2]), ('fuggle', [4]), ('sd', [3])]

As suggested, here is another version using defaultdict:

>>> from collections import defaultdict
>>> dd = defaultdict(list)
>>> for key, value in ll:
...     dd[key].append(value)
... 
>>> dd.items()
[('sai', [2]), ('emdf', [2]), ('df', [4]), ('is', [1]), ('bear', [1]), ('gugledarc', [1, 4]), ('in', [1, 4]), ('the', [1]), ('sdpij', [2]), ('fuggle', [4]), ('sd', [3])]
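
The same grouping can be wrapped in a small function. This is only a sketch: the name group_by_first is made up for illustration, and sorting the result by word is an added assumption (the question does not ask for any particular order). Sorting just makes the output independent of dictionary ordering, which is arbitrary in Python 2.7.

```python
from collections import defaultdict


def group_by_first(pairs):
    """Collect the second item of each (key, value) pair under its key.

    Hypothetical helper, not part of the original answer.
    """
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    # Assumption: sort by key so the result has a predictable order.
    # Keys are unique here, so the comparison never reaches the lists.
    return sorted(grouped.items())


# A shortened version of the data from the question.
word_docid_pairs = [('bear', 1), ('is', 1), ('in', 1), ('gugledarc', 1),
                    ('in', 4), ('gugledarc', 4)]

result = group_by_first(word_docid_pairs)
print(result)
```

Within each word, the numbers keep the order in which they appear in the input list.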
Answered 2012-12-30T07:41:15.913