8

Basically, given a list:

data = ["apple", "pear", "cherry", "apple", "pear", "apple", "banana"]

I'm trying to write a function that returns a list like this:

["apple", "pear", "banana", "cherry"]

I want the returned list sorted by the most frequently occurring words first, with ties broken alphabetically. I also want duplicates removed.

I've already built a list of each element's count and a list of each element's index in data.

x = [data.count(n) for n in data]
z = [data.index(n) for n in data]

I don't know where to go from here.

3 Answers

16

You can do it like this:

from collections import Counter

data = ["apple", "pear", "cherry", "apple", "pear", "apple", "banana"]

counts = Counter(data)
words = sorted(counts, key=lambda word: (-counts[word], word))

print(words)
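
Since the question asks for a function, the sort above can be wrapped in one. This is a minimal sketch assuming Python 3; the name rank_by_frequency is hypothetical and not part of the answer. The key tuple (-counts[word], word) sorts by descending count first and alphabetically among words with equal counts, and iterating over the Counter's keys already removes duplicates.

```python
from collections import Counter


def rank_by_frequency(items):
    """Return the unique items, most frequent first, ties broken alphabetically."""
    counts = Counter(items)
    # Negating the count makes the ascending sort put larger counts first;
    # the word itself is the second key, so ties fall back to alphabetical order.
    return sorted(counts, key=lambda word: (-counts[word], word))


data = ["apple", "pear", "cherry", "apple", "pear", "apple", "banana"]
print(rank_by_frequency(data))
# ['apple', 'pear', 'banana', 'cherry']
```
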
answered 2013-04-15T02:16:12.500
3

To sort the elements by frequency you can use collections.Counter.most_common (see its documentation), for example:

from collections import Counter

data = ["apple", "pear", "cherry", "apple", "pear", "apple", "banana"]
print(Counter(data).most_common())
# [('apple', 3), ('pear', 2), ('cherry', 1), ('banana', 1)]

Thanks to @Yuushi:

from collections import Counter

data = ["apple", "pear", "cherry", "apple", "pear", "apple", "banana"]
x = [word for word, _ in Counter(data).most_common()]

print(x)
# ['apple', 'pear', 'cherry', 'banana']
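
Note that most_common() orders by count only, so 'cherry' comes before 'banana' in the output above, while the question also asks for alphabetical tie-breaking. One possible way to get that, sketched here under the assumption of Python 3, relies on Python's sort being stable: sort the unique words alphabetically first, then sort again by count in descending order. The variable names are illustrative.

```python
from collections import Counter

data = ["apple", "pear", "cherry", "apple", "pear", "apple", "banana"]
counts = Counter(data)

# First pass: unique words in alphabetical order.
words = sorted(counts)
# Second pass: stable sort by count, highest first. Words with equal
# counts keep the alphabetical order from the first pass.
words.sort(key=counts.get, reverse=True)

print(words)
# ['apple', 'pear', 'banana', 'cherry']
```
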
answered 2013-04-15T02:17:37.380
0

This is a simple approach, but it should work.

data = ["apple", "pear", "cherry", "apple", "pear", "apple", "banana"]

from collections import Counter
from collections import defaultdict

my_counter = Counter(data)

# build a dictionary whose keys are
# occurrence counts and whose values
# are lists of the strings that
# occurred that many times
my_dict = defaultdict(list)
for k, v in my_counter.items():
    my_dict[v].append(k)

my_list = []

for k in sorted(my_dict, reverse=True):
    # Tie-break: strings that occurred the same
    # number of times share a key, so sort them
    # alphabetically before appending
    my_list.extend(sorted(my_dict[k]))

Result:

>>> my_list
['apple', 'pear', 'banana', 'cherry']
answered 2013-04-15T02:45:43.377