python - 为什么那个版本的归并排序更快

Question

基于这个答案，这里有两个版本的用于合并排序的合并函数。你能帮我理解为什么第二个更快。我已经测试了它的 50000 列表，第二个快 8 倍（要点）。

def merge1(left, right):
    i = j = inv = 0
    merged = []
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
            inv += len(left[i:])

    merged += left[i:]
    merged += right[j:]
    return merged, inv

.

def merge2(array1, array2):
    inv = 0
    merged_array = []
    while array1 or array2:
        if not array1:
            merged_array.append(array2.pop())
        elif (not array2) or array1[-1] > array2[-1]:
            merged_array.append(array1.pop())
            inv += len(array2)
        else:
            merged_array.append(array2.pop())
    merged_array.reverse()
    return merged_array, inv

这是排序功能：

def _merge_sort(list, merge):
    len_list = len(list)
    if len_list < 2:
        return list, 0
    middle = len_list / 2
    left, left_inv   = _merge_sort(list[:middle], merge)
    right, right_inv = _merge_sort(list[middle:], merge)
    l, merge_inv = merge(left, right)
    inv = left_inv + right_inv + merge_inv
    return l, inv

.

import numpy.random as nprnd
test_list = nprnd.randint(1000, size=50000).tolist()

test_list_tmp = list(test_list) 
merge_sort(test_list_tmp, merge1)

test_list_tmp = list(test_list) 
merge_sort(test_list_tmp, merge2)

score 10 · Accepted Answer

与上述kreativitea类似的答案，但有更多信息（我认为！）

所以剖析实际的合并函数，用于合并两个 50K 数组，

合并 1

         311748 function calls in 15.363 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.001    0.001   15.363   15.363 <string>:1(<module>)
        1   15.322   15.322   15.362   15.362 merge.py:3(merge1)
   221309    0.030    0.000    0.030    0.000 {len}
    90436    0.010    0.000    0.010    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

合并2

         250004 function calls in 0.104 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.001    0.001    0.104    0.104 <string>:1(<module>)
        1    0.074    0.074    0.103    0.103 merge.py:20(merge2)
    50000    0.005    0.000    0.005    0.000 {len}
   100000    0.010    0.000    0.010    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
   100000    0.014    0.000    0.014    0.000 {method 'pop' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'reverse' of 'list' objects}

所以对于 merge1，它是 221309 len、 90436 append，需要 15.363 秒。所以对于 merge2，它是 50000 len、 100000append和 100000pop并且需要 0.104 秒。

len并且append pop都是 O(1)（更多信息在这里），所以这些配置文件并没有显示实际花费的时间，因为只是这样，它应该更快，但只有 ~20%。

好吧，如果您只是阅读代码，原因实际上是相当明显的：

在第一种方法中，有这一行：

inv += len(left[i:])

所以每次调用时，它都必须重建一个数组。如果您注释掉这一行（或者只是将其替换为inv += 1或其他内容），那么它会比其他方法更快。这是导致时间增加的单行。

注意到这是原因后，可以通过改进代码来解决问题；将其更改为此以加快速度。这样做之后，它会比merge2

inv += len(left) - i

将其更新为：

def merge3(left, right):
    i = j = inv = 0
    merged = []
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
            inv += len(left) - i

    merged += left[i:]
    merged += right[j:]
    return merged, inv

score 3 · Accepted Answer

您可以使用出色的 cProfile 模块来帮助您解决此类问题。

>>> import cProfile
>>> a = range(1,20000,2)
>>> b = range(0,20000,2)
>>> cProfile.run('merge1(a, b)')
         70002 function calls in 0.195 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.184    0.184    0.195    0.195 <pyshell#7>:1(merge1)
        1    0.000    0.000    0.195    0.195 <string>:1(<module>)
    50000    0.008    0.000    0.008    0.000 {len}
    19999    0.003    0.000    0.003    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}


>>> cProfile.run('merge2(a, b)')
         50004 function calls in 0.026 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.016    0.016    0.026    0.026 <pyshell#12>:1(merge2)
        1    0.000    0.000    0.026    0.026 <string>:1(<module>)
    10000    0.002    0.000    0.002    0.000 {len}
    20000    0.003    0.000    0.003    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
    20000    0.005    0.000    0.005    0.000 {method 'pop' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'reverse' of 'list' objects}

稍微看了一下信息，看起来评论者是正确的——它不是 len 函数——它是字符串模块。比较事物的长度时会调用字符串模块，如下所示：

>>> cProfile.run('0 < len(c)')
         3 function calls in 0.000 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.000    0.000 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 {len}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

切片列表时也会调用它，但这是一个非常快速的操作。

>>> len(c)
20000000
>>> cProfile.run('c[3:2000000]')
         2 function calls in 0.011 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.011    0.011    0.011    0.011 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

TL;DR：字符串模块中的某些内容在您的第一个函数中花费了 0.195 秒，在您的第二个函数中花费了 0.026 秒。：显然，在inv += len(left[i:])这一行中重建数组。

score 0 · Accepted Answer

如果我不得不猜测，我会说这可能与从列表中删除元素的成本有关，从末尾删除（pop）比从开头删除要快。第二个有利于从列表末尾删除元素。

请参阅性能说明： http ://effbot.org/zone/python-list.htm

“移除一个项目所需的时间与在同一位置插入一个项目所需的时间大致相同；最后移除项目快，开头移除项目慢。”

python - 为什么那个版本的归并排序更快

3 回答 3

合并 1

合并2

好吧，如果您只是阅读代码，原因实际上是相当明显的：

将其更新为：

Related

Reference