
On Windows you have to check whether a process is the main process before using multiprocessing, otherwise you end up with an endless loop of process creation.

I tried changing the name of the process to that of a child process so that I could use multiprocessing inside a class or function that I call, but with no luck. Is this even possible? So far I have not been able to use multiprocessing unless it runs from the main process.

If it is possible, could someone provide an example of how to use multiprocessing inside a class or function that is called from a higher-level process? Thanks.

Edit:

Here is an example. The first one works, but everything is done in a single file: simplemtexample3.py:

import random
import multiprocessing
import math

def mp_factorizer(nums, nprocs):
    # protect the process
    #print __name__
    if __name__ == '__main__':
        out_q = multiprocessing.Queue()
        chunksize = int(math.ceil(len(nums) / float(nprocs)))
        procs = []
        for i in range(nprocs):

            p = multiprocessing.Process(
                    target=worker,            
                    args=(nums[chunksize * i:chunksize * (i + 1)],
                          out_q))
            procs.append(p)
            p.start()

        # Collect all results into a single result dict. We know how many dicts
        # with results to expect.
        resultlist = []
        for i in range(nprocs):
            temp=out_q.get()
            index =0
            #print temp
            for i in temp:
                resultlist.append(temp[index][0][0:])
                index +=1

        # Wait for all worker processes to finish
        for p in procs:
            p.join()
            resultlist2 = [x for x in resultlist if x != []]
        return resultlist2

def worker(nums, out_q):
    """ The worker function, invoked in a process. 'nums' is a
        list of numbers to factor. The results are placed in
        a dictionary that's pushed to a queue.
    """
    outlist = []

    for n in nums:
        newnumber= n*2
        newnumberasstring = str(newnumber)
        if newnumber:
            outlist.append(newnumberasstring)
    out_q.put(outlist)

l = []
for i in range(80):
    l.append(random.randint(1,8))

print mp_factorizer(l, 4)

However, when I try to call mp_factorizer from another file, it does not work because of the if __name__ == '__main__' check:

simplemtexample.py

import random
import multiprocessing
import math

def mp_factorizer(nums, nprocs):
    # protect the process
    #print __name__
    if __name__ == '__main__':
        out_q = multiprocessing.Queue()
        chunksize = int(math.ceil(len(nums) / float(nprocs)))
        procs = []
        for i in range(nprocs):

            p = multiprocessing.Process(
                    target=worker,            
                    args=(nums[chunksize * i:chunksize * (i + 1)],
                          out_q))
            procs.append(p)
            p.start()

        # Collect all results into a single result dict. We know how many dicts
        # with results to expect.
        resultlist = []
        for i in range(nprocs):
            temp=out_q.get()
            index =0
            #print temp
            for i in temp:
                resultlist.append(temp[index][0][0:])
                index +=1

        # Wait for all worker processes to finish
        for p in procs:
            p.join()
            resultlist2 = [x for x in resultlist if x != []]
        return resultlist2

def worker(nums, out_q):
    """ The worker function, invoked in a process. 'nums' is a
        list of numbers to factor. The results are placed in
        a dictionary that's pushed to a queue.
    """
    outlist = []

    for n in nums:
        newnumber= n*2
        newnumberasstring = str(newnumber)
        if newnumber:
            outlist.append(newnumberasstring)
    out_q.put(outlist)

startsimplemtexample.py

import simplemtexample as smt
import random

l = []
for i in range(80):
    l.append(random.randint(1,8))

print smt.mp_factorizer(l, 4)

2 Answers


if __name__ == '__main__' is mandatory (at least on Windows) if you want to use multiprocessing.

On Windows it works like this: for every worker you want to spawn, Windows automatically starts the main process again and re-runs all the files it needs. However, only the first process that was started is called main. That is why guarding the execution of mp_factorizer with if __name__ == '__main__' keeps multiprocessing from creating an infinite loop.

So essentially Windows needs to read the file containing the worker, plus every function the worker calls, once for each worker. By guarding mp_factorizer we make sure no additional workers are created, while Windows can still execute the worker itself. This is why multiprocessing examples that keep all the code in one file guard the creation of the workers directly (as mp_factorizer does here) but not the worker function, so that Windows can still run it. If all the code were in one file and the entire file were guarded, no workers could be created at all.
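For reference, here is a minimal sketch of that single-file pattern (the names are illustrative, not taken from the code above): the worker stays at module level so each child interpreter can import it, while only the spawning code sits under the guard.

import multiprocessing

def double(n, out_q):
    # Runs in the child processes; must be importable at module level.
    out_q.put(n * 2)

if __name__ == '__main__':
    # Only the original main process ever reaches this block on Windows.
    out_q = multiprocessing.Queue()
    procs = [multiprocessing.Process(target=double, args=(i, out_q))
             for i in range(4)]
    for p in procs:
        p.start()
    print([out_q.get() for _ in procs])  # drain the queue before joining
    for p in procs:
        p.join()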

If the multiprocessing code lives in another class or module and is called from elsewhere, the if __name__ == '__main__' guard needs to sit directly above the call: mpteststart.py

import random
import mptest as smt

l = []
for i in range(4):
    l.append(random.randint(1,8))
print "Random numbers generated"
if __name__ == '__main__':
    print smt.mp_factorizer(l, 4)

mptest.py

import multiprocessing
import math

print "Reading mptest.py file"
def mp_factorizer(nums, nprocs):

    out_q = multiprocessing.Queue()
    chunksize = int(math.ceil(len(nums) / float(nprocs)))
    procs = []
    for i in range(nprocs):

        p = multiprocessing.Process(
                target=worker,            
                args=(nums[chunksize * i:chunksize * (i + 1)],
                      out_q))
        procs.append(p)
        p.start()

    # Collect all results into a single result dict. We know how many dicts
    # with results to expect.
    resultlist = []
    for i in range(nprocs):
        temp=out_q.get()
        index =0
        #print temp
        for i in temp:
            resultlist.append(temp[index][0][0:])
            index +=1

    # Wait for all worker processes to finish
    for p in procs:
        p.join()
        resultlist2 = [x for x in resultlist if x != []]
    return resultlist2

def worker(nums, out_q):
    """ The worker function, invoked in a process. 'nums' is a
        list of numbers to factor. The results are placed in
        a dictionary that's pushed to a queue.
    """
    outlist = []

    for n in nums:
        newnumber= n*2
        newnumberasstring = str(newnumber)
        if newnumber:
            outlist.append(newnumberasstring)
    out_q.put(outlist)

In the code above, if __name__ == '__main__' has been removed because it is already present in the calling file.

However, the result is somewhat unexpected:

Reading mptest.py file
random numbers generated
Reading mptest.py file
random numbers generated
worker started
Reading mptest.py file
random numbers generated
worker started
Reading mptest.py file
random numbers generated
worker started
Reading mptest.py file
random numbers generated
worker started
['1', '1', '4', '1']

Multiprocessing is kept from running endlessly, but the rest of the code is still executed several times (the random-number generation in this case). That not only costs performance, it can also lead to other nasty bugs. The solution is to protect the entire main script from being re-executed by Windows whenever multiprocessing is used anywhere: mpteststart.py

import random
import mptest as smt

if __name__ == '__main__':  
    l = []
    for i in range(4):
        l.append(random.randint(1,8))
    print "random numbers generated"   
    print smt.mp_factorizer(l, 4)

Now all we get back is the desired result, and the random numbers are generated only once:

Reading mptest.py file
random numbers generated
Reading mptest.py file
worker started
Reading mptest.py file
worker started
Reading mptest.py file
worker started
Reading mptest.py file
worker started
['1', '6', '2', '1']

Note that in this example mpteststart.py is the main process. If it were not, if __name__ == '__main__' would have to be moved up the call chain until it is in the main process. Once the main process is guarded this way, there is no more unwanted repeated code execution.
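As a hypothetical sketch of moving the guard up the call chain (the file names main.py and middlelayer.py are made up for illustration): only the topmost script, the one Windows actually treats as main, carries the guard, and the intermediate module needs none of its own.

main.py (hypothetical)

import middlelayer

if __name__ == '__main__':
    middlelayer.run()

middlelayer.py (hypothetical)

import mptest

def run():
    # Safe to spawn here: this function only runs when the guarded
    # entry point in main.py calls it.
    print(mptest.mp_factorizer([1, 2, 3, 4], 2))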

Answered 2013-01-30T10:46:44.743

Windows lacks os.fork. So on Windows, the multiprocessing module starts a new Python interpreter and (re)imports the script that calls multiprocessing.Process.

The purpose of if __name__ == '__main__' is to protect the call to multiprocessing.Process from being made again when the script is re-imported. (If you don't protect it, you get a fork bomb.)

If the call to multiprocessing.Process sits inside a class or function that is not executed when the script is re-imported, there is no problem. Go ahead and use multiprocessing.Process as usual.
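A minimal sketch of that case (the helper name run_jobs is made up for illustration): the multiprocessing.Process calls live inside a function, so merely re-importing the module in a child interpreter spawns nothing; processes are only created when the guarded entry point actually calls the function.

import multiprocessing

def work(x):
    # Executed in the child processes.
    print("working on %s" % x)

def run_jobs(items):
    # Re-importing this module does not run this function, so the child
    # interpreters do not spawn any extra processes.
    procs = [multiprocessing.Process(target=work, args=(x,)) for x in items]
    for p in procs:
        p.start()
    for p in procs:
        p.join()

if __name__ == '__main__':
    run_jobs([1, 2, 3])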

Answered 2013-01-29T13:37:05.500