python - Why is this lambda involving a list.index() call so slow?

Question

Using cProfile:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   17.834   17.834 <string>:1(<module>)
        1    0.007    0.007   17.834   17.834 basher.py:5551(_refresh)
        1    0.000    0.000   10.522   10.522 basher.py:1826(RefreshUI)
        4    0.024    0.006   10.517    2.629 basher.py:961(PopulateItems)
      211    1.494    0.007    7.488    0.035 basher.py:1849(PopulateItem)
      231    0.074    0.000    6.734    0.029 {method 'sort' of 'list' objects}
      215    0.002    0.000    6.688    0.031 bosh.py:4764(getOrdered)
     1910    3.039    0.002    6.648    0.003 bosh.py:4770(<lambda>)
      253    0.178    0.001    5.600    0.022 bosh.py:3325(getStatus)
        1    0.000    0.000    5.508    5.508 bosh.py:4327(refresh)
     1911    3.051    0.002    3.330    0.002 {method 'index' of 'list' objects}

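The line 1910 3.039 0.002 6.648 0.003 bosh.py:4770(<lambda>) confuses me. At bosh.py:4770 I have modNames.sort(key=lambda a: (a in data) and data.index(a)), where data and modNames are lists. Note that 1911 3.051 0.002 3.330 0.002 {method 'index' of 'list' objects} appears to come from this line.

So why is it so slow? Is there a way to rewrite this sort() call so that it runs faster?

Edit: the last ingredient I was missing to understand this lambda: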

>>> True and 3
3

score 4 · Accepted Answer

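As YardGlassOfCode said, the lambda itself is not slow; it is the O(n) operations inside the lambda that are slow. Both a in data and data.index(a) are O(n) operations, where n is the length of data. Adding insult to injury, the call to index also repeats much of the work already done by a in data. If the items in data are hashable, you can speed this up considerably by first preparing a dict: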

weight = dict(zip(data, range(len(data))))
modNames.sort(key=weight.get)  # Python2, or
modNames.sort(key=lambda a: weight.get(a, -1))  # works in Python3

This is much faster because each dict lookup is an O(1) operation on average, whereas every call of the original key scans data.
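As a rough sketch of the difference, the two key functions can be put side by side. The lists below are hypothetical stand-ins for data and modNames (the real contents are not shown in the question), and the helper names sort_with_index and sort_with_dict are made up for illustration. The sketch assumes every name is present in data and that data has no duplicates, in which case both approaches should give the same order:

```python
import random

# Hypothetical stand-ins for the question's lists: `data` holds the
# reference order, `mod_names` is the list to be sorted into that order.
data = ["mod%04d" % i for i in range(2000)]
mod_names = random.sample(data, 500)

def sort_with_index(names, order):
    # O(n) membership test plus O(n) index() for every element of names.
    return sorted(names, key=lambda a: (a in order) and order.index(a))

def sort_with_dict(names, order):
    # One O(n) pass to build the lookup table, then O(1) average lookups.
    weight = {name: i for i, name in enumerate(order)}
    return sorted(names, key=lambda a: weight.get(a, -1))

slow = sort_with_index(mod_names, data)
fast = sort_with_dict(mod_names, data)
```

Wrapping each call in timeit.timeit(..., number=10) is one way to measure the gap on your own data; the exact speedup depends on the lengths of both lists.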

Note that modNames.sort(key=weight.get) relies on None comparing as less than integers in Python2:

In [39]: None < 0
Out[39]: True

In Python3, None < 0 raises a TypeError, so lambda a: weight.get(a, -1) is used to return -1 when a is not in weight.
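A small Python3 sketch of that failure mode and the fallback, using made-up sample values ("zzz" plays the role of a name that is missing from data):

```python
data = ["a", "b", "c"]      # hypothetical reference order
weight = {name: i for i, name in enumerate(data)}

names = ["c", "zzz", "a"]   # "zzz" is not in data

# weight.get returns None for "zzz"; ordering None against an int
# raises TypeError in Python 3.
try:
    sorted(names, key=weight.get)
    raised = False
except TypeError:
    raised = True

# With a default of -1, unknown names sort ahead of every known name.
ordered = sorted(names, key=lambda a: weight.get(a, -1))
print(raised, ordered)
```

One behavioral difference worth keeping in mind: in the original key a missing name yields False, which compares equal to 0, so missing names tie with the first item of data instead of sorting strictly before it. If that tie matters, weight.get(a, 0) would mimic it more closely.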
