python - How do I organize a list by frequency of occurrence, then alphabetically (in case of a tie), while eliminating duplicates?

Question

Basically, given a list:

data = ["apple", "pear", "cherry", "apple", "pear", "apple", "banana"]

I am trying to write a function that returns a list like this:

["apple", "pear", "banana", "cherry"]

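I want the returned list ordered by the most frequently occurring words first, with ties broken alphabetically. I also want duplicates removed.

I have already built a list of each element's count and a list of each element's index in data.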

x = [data.count(n) for n in data]
z = [data.index(n) for n in data]

I don't know where to go from here.

score 16 · Accepted Answer

You can do it like this:

from collections import Counter

data = ["apple", "pear", "cherry", "apple", "pear", "apple", "banana"]

counts = Counter(data)
words = sorted(counts, key=lambda word: (-counts[word], word))

print(words)
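
The key function does the work here: negating the count makes higher counts sort first, and using the word itself as the second tuple element breaks ties alphabetically. A minimal sketch wrapping this in a reusable function (the name `rank_by_frequency` is made up for illustration and is not part of the original answer):

```python
from collections import Counter


def rank_by_frequency(items):
    """Return the unique items, most frequent first, ties in alphabetical order."""
    counts = Counter(items)
    # Tuples compare element by element: -count first (so larger counts
    # come earlier in an ascending sort), then the word itself.
    return sorted(counts, key=lambda word: (-counts[word], word))


data = ["apple", "pear", "cherry", "apple", "pear", "apple", "banana"]
result = rank_by_frequency(data)
print(result)  # ['apple', 'pear', 'banana', 'cherry']
```

Iterating over a Counter yields each distinct key once, so duplicates are removed without an extra step.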

score 3 · Accepted Answer

To sort elements by frequency you can use collections.Counter.most_common (see its documentation), e.g.

from collections import Counter

data = ["apple", "pear", "cherry", "apple", "pear", "apple", "banana"]
print(Counter(data).most_common())
# [('apple', 3), ('pear', 2), ('cherry', 1), ('banana', 1)]

Thanks to @Yuushi:

from collections import Counter

data = ["apple", "pear", "cherry", "apple", "pear", "apple", "banana"]
x = [a for (a, b) in Counter(data).most_common()]

print(x)
# ['apple', 'pear', 'cherry', 'banana']
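
Note that the output above leaves 'cherry' ahead of 'banana', so most_common on its own does not give the alphabetical tie-break the question asks for. One possible workaround, assuming Python 3.7 or later (where most_common keeps equally frequent elements in the order they were first encountered), is to sort the input before counting. This is a sketch of a variation, not part of the original answer:

```python
from collections import Counter

data = ["apple", "pear", "cherry", "apple", "pear", "apple", "banana"]

# Assumption: Python 3.7+. Sorting first means every word is first seen
# in alphabetical order, and most_common() keeps that order among ties.
ranked = [word for word, _count in Counter(sorted(data)).most_common()]

print(ranked)  # ['apple', 'pear', 'banana', 'cherry']
```

On older interpreters the order among ties is arbitrary, so an explicit sort key on both count and word is the safer choice there.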

score 0 · Accepted Answer

This is a naive approach, but it should work.

data = ["apple", "pear", "cherry", "apple", "pear", "apple", "banana"]

from collections import Counter
from collections import defaultdict

my_counter = Counter(data)

# creates a dictionary whose keys are
# numbers of occurrences and whose
# values are lists of the strings
# that occurred that many times
my_dict = defaultdict(list)
for k, v in my_counter.items():
    my_dict[v].append(k)

my_list = []

for k in sorted(my_dict, reverse=True):
    # This is the tie-break: strings that showed up
    # the same number of times share the same key,
    # so we sort each group in alphabetical order
    my_list.extend(sorted(my_dict[k]))

Result:

>>> my_list
['apple', 'pear', 'banana', 'cherry']
