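I have this list:
mylist = [20, 30, 25, 20, 30]
After getting the indexes of the duplicated values with
[i for i, x in enumerate(mylist) if mylist.count(x) > 1]
the result is:
`[0, 1, 3, 4]`
There are two pairs of duplicated values. How can I get only the highest duplicated value? In this list that is `30`, or either of its indexes, `1` or `4`, rather than the whole list of duplicate indexes.
Regards...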
This one is O(n):
>>> from collections import Counter
>>> mylist = [20, 30, 25, 20, 30]
>>> max(k for k,v in Counter(mylist).items() if v>1)
30
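The question also asks for an index of that value. A minimal sketch, assuming the list contains at least one duplicated value (otherwise `max` raises `ValueError`); the names `largest_dup`, `first_index` and `all_indexes` are illustrative and do not come from the thread:

```python
from collections import Counter

mylist = [20, 30, 25, 20, 30]

# Largest value that occurs more than once.
largest_dup = max(k for k, v in Counter(mylist).items() if v > 1)

# First position of that value, and every position where it occurs.
first_index = mylist.index(largest_dup)
all_indexes = [i for i, x in enumerate(mylist) if x == largest_dup]

print(largest_dup, first_index, all_indexes)  # 30 1 [1, 4]
```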
To get the largest duplicated value:
max(x for x in mylist if mylist.count(x) > 1)
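Unfortunately, because of the repeated count() calls, this has O(n**2) performance. Here is a more verbose way to do the same thing with O(n) performance, which matters if the list is long: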
seen = set()
dups = set()
for x in mylist:
    if x in seen:
        dups.add(x)
    seen.add(x)
max_dups = max(dups)
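If the list may contain no duplicates at all, `max(dups)` raises `ValueError` on the empty set. A possible way to guard against that, sketched as a function; the name `max_duplicate` and the choice of returning `None` are assumptions for illustration, not part of the answer above:

```python
def max_duplicate(values):
    """Return the largest value seen more than once, or None if there is none."""
    seen = set()
    dups = set()
    for x in values:
        if x in seen:
            dups.add(x)
        seen.add(x)
    # default= needs Python 3.4+; without it, max() of an empty set raises ValueError.
    return max(dups, default=None)

print(max_duplicate([20, 30, 25, 20, 30]))  # 30
print(max_duplicate([1, 2, 3]))             # None
```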
Another O(n) way of doing it, just because...
>>> from collections import defaultdict
>>>
>>> mylist = [20,30,25,20,30]
>>> dd = defaultdict(int)
>>> for i in mylist:
...     dd[i] += 1
...
>>> max(i for i in dd if dd[i] > 1)
30
You can also do this with a plain old dictionary:
>>> d = dict.fromkeys(mylist, 0)
>>> for i in mylist:
...     d[i] += 1
...
>>> max(i for i in d if d[i] > 1)
30
Just some relative timings to consider:
from collections import Counter
from collections import defaultdict
mylist = [20, 30, 25, 20, 30]
def f1():
    seen = set()
    dups = set()
    for x in mylist:
        if x in seen:
            dups.add(x)
        seen.add(x)
    max_dups = max(dups)
def f2():
    max(x for x in mylist if mylist.count(x) > 1)
def f3():
    max(k for k,v in Counter(mylist).items() if v>1)
def f4():
    dd = defaultdict(int)
    for i in mylist:
        dd[i] += 1
    max(i for i in dd if dd[i] > 1)
def f5():
    d = dict.fromkeys(mylist, 0)
    for i in mylist:
        d[i] += 1
    max(i for i in d if d[i] > 1)
# cmpthese is a benchmarking helper that is not defined in this post
cmpthese([f1,f2,f3,f4,f5])
Prints:
    rate/sec      f3      f4      f5      f2      f1
f3    93,653      --  -63.3%  -73.0%  -79.2%  -83.6%
f4   255,137  172.4%      --  -26.3%  -43.3%  -55.3%
f5   346,238  269.7%   35.7%      --  -23.1%  -39.3%
f2   450,356  380.9%   76.5%   30.1%      --  -21.0%
f1   570,419  509.1%  123.6%   64.7%   26.7%      --
So choose wisely.
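Note that these timings are for a five-element list, where the O(n**2) `count()` version does well. A rough sketch for re-checking on a longer input with the standard `timeit` module; the list `biglist`, its size, and the function names `quadratic` and `linear` are assumptions made for illustration, and the numbers will vary by machine:

```python
import random
import timeit
from collections import Counter

random.seed(0)
# Hypothetical larger input: 2,000 integers drawn from 0..999, so duplicates are certain.
biglist = [random.randrange(1000) for _ in range(2000)]

def quadratic():
    # One full count() scan per element: O(n**2) overall.
    return max(x for x in biglist if biglist.count(x) > 1)

def linear():
    # One counting pass, then one pass over the distinct values: O(n).
    return max(k for k, v in Counter(biglist).items() if v > 1)

# Both approaches should agree on the answer before being timed.
assert quadratic() == linear()

for f in (quadratic, linear):
    print(f.__name__, timeit.timeit(f, number=5))
```

On inputs of this size the `count()`-based version should be expected to fall behind, which is the point made earlier about long lists.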
$ cat /tmp/1.py
from itertools import groupby
def find_max_repeated(a):
    # Sort descending so the first group with more than one member is the largest duplicate.
    a = sorted(a, reverse=True)
    for k, g in groupby(a):
        gl = list(g)
        if len(gl) > 1:
            return gl[0]
a = [1, 1, 2, 3, 3, 4, 5, 4, 6]
print(find_max_repeated(a))
$ python /tmp/1.py
4
mylist = [20, 30, 25, 20, 30]
result = max((mylist.count(x), x) for x in set(mylist))
print(result)
# Output: (2, 30)
Here is how it works:
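A sketch of the mechanics, with one caveat. Python compares tuples element by element, so `max` over `(count, value)` pairs ranks by count first and by value only on ties, and nothing in the expression requires a count above 1. It therefore returns the most frequent value, which only happens to be the largest duplicate for the list above. The list `other` below is a hypothetical input chosen to show the difference:

```python
other = [20, 20, 20, 30, 30, 40]

# Tuples compare element by element, so the count decides before the value does.
by_count = max((other.count(x), x) for x in set(other))

# Filtering on count > 1 and comparing values only gives the largest duplicate.
largest_dup = max(x for x in set(other) if other.count(x) > 1)

print(by_count)     # (3, 20)
print(largest_dup)  # 30
```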