1

I have this list:

mylist = [20, 30, 25, 20, 30]

After getting the indices of the duplicated values with

[i for i, x in enumerate(mylist) if mylist.count(x) > 1]

the result is:

`[0, 1, 3, 4]` 

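There are two pairs of duplicates. How can I get only the highest duplicated value? In this list that is 30, or either of its indices (1 or 4), rather than the whole list of duplicates.

Regards...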

4

6 Answers

6

This one is O(n):

>>> from collections import Counter
>>> mylist = [20, 30, 25, 20, 30]
>>> max(k for k,v in Counter(mylist).items() if v>1)
30
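
The question also asks for the indices of that value. A minimal sketch, assuming the same `mylist`; the names `largest_dup` and `indices` are illustrative and not from the answer:

```python
from collections import Counter

mylist = [20, 30, 25, 20, 30]

# Largest value that occurs more than once (raises ValueError if none does).
largest_dup = max(k for k, v in Counter(mylist).items() if v > 1)

# Every position at which that value occurs.
indices = [i for i, x in enumerate(mylist) if x == largest_dup]

print(largest_dup, indices)  # 30 [1, 4]
```

Taking `indices[0]` would give a single index if only one is needed.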
answered 2012-06-29T02:45:34.263
5

To get the largest duplicated value:

max(x for x in mylist if mylist.count(x) > 1)

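Unfortunately, the repeated count() calls give this O(n**2) performance. Here is a more verbose way to do the same thing with O(n) performance, which matters if the list is long: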

seen = set()
dups = set()
for x in mylist:
    if x in seen:
        dups.add(x)
    seen.add(x)
max_dups = max(dups)
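
Note that `max()` raises `ValueError` on an empty input, which is what happens when the list contains no duplicates. A sketch of one way to handle that, assuming Python 3.4+ (for the `default` argument) and a hypothetical helper name `max_duplicate`:

```python
def max_duplicate(values):
    """Return the largest value occurring more than once, or None if there is none."""
    seen = set()
    dups = set()
    for x in values:
        if x in seen:
            dups.add(x)
        seen.add(x)
    # default=None avoids the ValueError when dups is empty
    return max(dups, default=None)

print(max_duplicate([20, 30, 25, 20, 30]))  # 30
print(max_duplicate([1, 2, 3]))             # None
```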
answered 2012-06-29T01:37:52.580
1

Another O(n) way to do it, just because...

>>> from collections import defaultdict
>>> 
>>> mylist = [20,30,25,20,30]
>>> dd = defaultdict(int)
>>> for i in mylist:
...    dd[i] += 1
...
>>> max(i for i in dd if dd[i] > 1)
30

You can also do this with a plain old dictionary:

>>> d = dict.fromkeys(mylist, 0)
>>> for i in mylist:
...   d[i] += 1
... 
>>> max(i for i in d if d[i] > 1)
30
answered 2012-06-29T02:11:40.870
1

Just some relative timings to consider:

from collections import Counter
from collections import defaultdict

mylist = [20, 30, 25, 20, 30]

def f1():
    seen = set()
    dups = set()
    for x in mylist:
        if x in seen:
            dups.add(x)
        seen.add(x)
    max_dups = max(dups)

def f2():
    max(x for x in mylist if mylist.count(x) > 1)

def f3():
    max(k for k,v in Counter(mylist).items() if v>1)

def f4():
    dd = defaultdict(int)
    for i in mylist:
        dd[i] += 1

    max(i for i in dd if dd[i] > 1)

def f5():
    d = dict.fromkeys(mylist, 0)
    for i in mylist:
        d[i] += 1

    max(i for i in d if d[i] > 1)

cmpthese([f1,f2,f3,f4,f5])    

Prints:

   rate/sec     f3     f4     f5     f2     f1
f3   93,653     -- -63.3% -73.0% -79.2% -83.6%
f4  255,137 172.4%     -- -26.3% -43.3% -55.3%
f5  346,238 269.7%  35.7%     -- -23.1% -39.3%
f2  450,356 380.9%  76.5%  30.1%     -- -21.0%
f1  570,419 509.1% 123.6%  64.7%  26.7%     --

So choose wisely.
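
`cmpthese` is not defined in the snippet above and is not part of the standard library. As a rough stand-in, here is a sketch using the standard `timeit` module; the list size, the value range, and the names `big_list`, `with_sets` and `with_count` are assumptions made for illustration. The timings above were taken on a five-element list, so it may be worth repeating the comparison on input closer to your real data:

```python
import random
import timeit

random.seed(0)
# Hypothetical larger input: 2,000 integers drawn from 0..999,
# so at least one value is guaranteed to repeat.
big_list = [random.randrange(1000) for _ in range(2000)]

def with_sets(values):
    # Single pass with two sets: O(n)
    seen, dups = set(), set()
    for x in values:
        if x in seen:
            dups.add(x)
        seen.add(x)
    return max(dups)

def with_count(values):
    # One count() scan per element: O(n**2)
    return max(x for x in values if values.count(x) > 1)

# Both approaches must agree before their timings are compared.
assert with_sets(big_list) == with_count(big_list)

for fn in (with_sets, with_count):
    seconds = timeit.timeit(lambda: fn(big_list), number=5)
    print(f"{fn.__name__}: {seconds:.4f} s for 5 runs")
```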

answered 2012-06-29T05:18:00.243
0
$ cat /tmp/1.py
from itertools import groupby

def find_max_repeated(a):
    a = sorted(a, reverse = True)
    for k,g in groupby(a):
        gl = list(g)
        if len(gl) > 1:
            return gl[0]

a = [1,1,2,3,3,4,5,4,6]
print(find_max_repeated(a))

$ python /tmp/1.py
4
answered 2012-06-29T01:44:03.363
0
mylist = [20, 30, 25, 20, 30]
result = max((mylist.count(x), x) for x in set(mylist))
print(result)
# Output: (2, 30)

Here is how it works:

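  • set(mylist) - you build only the unique values from mylist (20, 30, 25)
  • a generator expression then creates tuples whose first item is the number of times the value occurs ((1, 25), (2, 20), (2, 30))
  • since tuples compare item by item, you can take the maximum tuple of the sequence, which here is (2, 30) because it is greater than (2, 20)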
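
Because the count is the first tuple element, this expression picks the most frequent value and only uses the value itself to break ties. That coincides with the largest duplicate for the list above but not in general, as this small sketch (with a made-up list) suggests:

```python
values = [20, 20, 20, 30, 30]  # made-up input: 20 occurs more often than 30

# Count is compared first, so the most frequent value wins.
by_count_first = max((values.count(x), x) for x in set(values))

# Filtering on count and comparing values only gives the largest duplicate.
largest_dup = max(x for x in set(values) if values.count(x) > 1)

print(by_count_first)  # (3, 20)
print(largest_dup)     # 30
```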
answered 2020-01-24T00:54:43.217