python - Frequency in a list of lists

Question

My data is essentially similar to the one in this SO post:

Frequency of items in a list of lists

However, what I have instead is:

mylist = [['hello', 'there'], ['hi', 'there'], ['hello', 'there']]

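I'm trying to count how often each phrase repeats, so in this case I see ['hello', 'there'] twice and the other phrase once.

I run into the familiar TypeError: unhashable type: 'list' error, but with the data structured as in my example I haven't been able to find a solution that applies.

Each inner list is a phrase made up of n words in total, and n is not always 2.

I'm struggling to get frequency counts in this situation, so any guidance is appreciated.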

score 3 · Accepted Answer

Lists are not hashable, but tuples are:

>>> import collections
>>> counts = collections.Counter([tuple(sublist) for sublist in mylist])
>>> counts
Counter({('hello', 'there'): 2, ('hi', 'there'): 1})

Counter is a dict subclass, so entries can be looked up the same way:

>>> counts[("hello", "there")]
2
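
As a minimal sketch of the same idea for phrases of any length n, the tuple conversion can be wrapped in a small helper. The name `count_phrases` and the extra three-word phrase in the sample data are assumptions added for illustration, not part of the original answer; `most_common` is the standard `Counter` method for ranking entries.

```python
from collections import Counter


def count_phrases(phrases):
    """Count how often each phrase (a list of words) occurs."""
    # Lists are unhashable, so convert each phrase to a tuple first.
    return Counter(tuple(phrase) for phrase in phrases)


# Sample data: the question's list plus one hypothetical three-word phrase.
mylist = [
    ['hello', 'there'],
    ['hi', 'there'],
    ['hello', 'there'],
    ['good', 'morning', 'all'],
]

counts = count_phrases(mylist)
print(counts[('hello', 'there')])  # 2
print(counts.most_common(1))       # [(('hello', 'there'), 2)]
```

Because the key is the whole tuple, phrases of different lengths can be counted together without any special handling.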

score 0 · Accepted Answer

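You can get the result without any imports by using a bit of dict comprehension. Note that this first version uses the count as the key, so phrases that share the same count overwrite each other.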

{k:i for i, k in [[i, mylist.count(i)] for i in mylist]}

Output:

{2: ['hello', 'there'], 1: ['hi', 'there']}

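A less conventional approach, with the keys and values swapped so that each phrase (joined into a string) maps to its count: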

{f'{", ".join(i)}':v for i, v in [[i, mylist.count(i)] for i in mylist]}

Output:

{'hello, there': 2, 'hi, there': 1}

One more variant, keyed by tuple:

dict([(tuple(i), mylist.count(i)) for i in mylist])

Output:

{('hello', 'there'): 2, ('hi', 'there'): 1}
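
The comprehensions above call `mylist.count(i)` once per element, so each one rescans the whole list and the total work grows quadratically with the list length. If that matters, a single pass with a plain dict also avoids imports. This is only a sketch; `count_phrases_no_import` is a hypothetical name used for illustration.

```python
def count_phrases_no_import(phrases):
    """Count phrases in one pass using only built-ins."""
    counts = {}
    for phrase in phrases:
        key = tuple(phrase)  # tuples are hashable, lists are not
        # dict.get returns 0 the first time a key is seen.
        counts[key] = counts.get(key, 0) + 1
    return counts


mylist = [['hello', 'there'], ['hi', 'there'], ['hello', 'there']]
result = count_phrases_no_import(mylist)
print(result)  # {('hello', 'there'): 2, ('hi', 'there'): 1}
```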
