python - python中的多进程快速排序挂起大量元素

Question

为了进一步了解 python 中的进程，我决定使用多个进程来实现快速排序。它似乎在元素少于 ~21K 的列表上工作正常，但任何更大的（比如 25K 元素）它只是挂起。

我试过这个import sys; sys.setrecursionlimit(2**30)伎俩无济于事。这仅仅是因为 python 不喜欢真正的深度递归吗？有趣的是，相同的算法在大约 50k 个元素上作为单个进程运行良好。

import random, multiprocessing, time

DESIRED_PROC_DEPTH = 2
proc_depth = 1 #start at one to count the parent process that kicks this all off

def main():
    import sys; sys.setrecursionlimit(2**30)
    num_elements = 20000
    list_of_numbers = [random.randint(0,num_elements) for num in range(num_elements)]
    # print list_of_numbers
    simple_start = time.time()
    simple_sorted = simple_qs(list_of_numbers)
    simple_total_time = time.time() - simple_start
    print "Using a single thread we sorted " + str(num_elements) + " elements in " + str(simple_total_time) + " seconds"
    the_q = multiprocessing.Queue()
    sorter = MTQuickSort(list_of_numbers, the_q)
    start = time.time()
    sorter.start()
    sorter.join()
    mt_total_time = time.time() - start
    sorted_list = the_q.get()
    # print sorted_list
    print "Sorted " + str(num_elements) + " elements in " + str(mt_total_time) + " seconds"


class MTQuickSort(multiprocessing.Process):

    def __init__(self, list_to_sort, queue):
        super(MTQuickSort, self).__init__()
        print self.name
        self.queue = queue
        self.list = list_to_sort

    def run(self):
        self.queue.put(self.quicksort(self.list))

    def quicksort(self, list):
        global proc_depth
        global DESIRED_PROC_DEPTH
        if len(list) is 0:
            # base case is that list is empty
            return []
        else:
            pivot = list[0]
            less = [x for x in list[1:] if x <= pivot]
            greater = [x for x in list[1:] if x > pivot]
            if proc_depth < DESIRED_PROC_DEPTH:
                proc_depth += 1
                # We create threads in blocks of two since we partition the list into two parts
                lessor_q = multiprocessing.Queue()
                greater_q = multiprocessing.Queue()
                qs_process1 = MTQuickSort(less, lessor_q)
                qs_process2 = MTQuickSort(greater, greater_q)
                qs_process1.start()
                qs_process2.start()
                qs_process1.join()
                qs_process2.join()
                return lessor_q.get() + [pivot] + greater_q.get()
            else:
                less_than_pivot = self.quicksort(less)
                greater_than_pivot = self.quicksort(greater)
                return less_than_pivot + [pivot] + greater_than_pivot

def simple_qs(list):
    if len(list) is 0:
        # base case is that list is empty
        return []
    else:
        pivot = list[0]
        return simple_qs([x for x in list[1:] if x <= pivot]) + [pivot] + simple_qs([x for x in list[1:] if x > pivot])



if __name__ == '__main__':
    main()

score 3 · Accepted Answer

这是由于您在从队列中获取（）结果之前加入（）子进程。显然，队列不会急切地从子进程中读取数据。因此，当您调用 join() 时，调用者的进程会一直等到子进程完成，但子进程本身正在等待尝试将数据发送到管道到其父进程。

如果我在相应的 get() 之后移动各种 join()，它就可以工作。

python - python中的多进程快速排序挂起大量元素

1 回答 1

Related

Reference