
I am fairly new to Python and I am currently looking at multiprocessing. I have created a simple example that I assumed would be considerably quicker with multiprocessing than with a single process, but as it turns out it is actually slower! The script creates a list of the integers from 0 to 9999, splits it into shorter lists, and has worker processes run through them, building the string "I am worker [integer]" for each element. Typical run time is approx. 26 sec, while the single-process script is 0.5 to 1 sec faster. Is there any particular reason why my multiprocessing script is slower? Or why it is a bad example to use for multiprocessing? The code for the two scripts is below for reference.

Multiprocessing code:

import multiprocessing
from datetime import datetime

def f(x):
    listagain=[]
    for i in x:
        listagain.append("I am worker " + str(i))
    return listagain    

def chunks(l, n):
    """ Yield successive n-sized chunks from l.
    """
    lister=[]
    for i in xrange(0, len(l), n):
        lister.append(l[i:i+n])
    return lister

if __name__ == '__main__':
    startTime=datetime.now()
    mylist=list(xrange(10000))
    size=10
    listlist=chunks(mylist,size)
    workers=4
    pool=multiprocessing.Pool(processes=workers)
    result=pool.map(f,listlist)
    pool.close()
    pool.join()
    print result
    print (datetime.now()-startTime)

Single processing code:

from datetime import datetime

def f(x):
    listagain=[]
    for i in x:
        for j in xrange(0,len(i)):
            listagain.append("I am worker " + str(i[j]))
    return listagain 

def chunks(l, n):
    """ Yield successive n-sized chunks from l.
    """
    lister=[]
    for i in xrange(0, len(l), n):
        lister.append(l[i:i+n])
    return lister

if __name__ == '__main__':
    startTime=datetime.now()
    mylist=list(xrange(10000))
    size=10
    listlist=chunks(mylist,size)
    result=f(listlist)
    print result
    print (datetime.now()-startTime)
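For reference, `multiprocessing.Pool.map` can do the splitting itself via its `chunksize` argument, so the manual `chunks` helper is not needed and `f` can work on a single item. A Python 3 sketch of the same job (the question's code is Python 2, so `xrange` becomes `range`; the worker count of 4 is kept from the question):

```python
import multiprocessing

def f(x):
    # same per-item work as the question's scripts
    return "I am worker " + str(x)

if __name__ == '__main__':
    mylist = list(range(10000))
    with multiprocessing.Pool(processes=4) as pool:
        # chunksize=10 mirrors the manual chunks(mylist, 10) splitting:
        # items are shipped to workers in batches of 10
        result = pool.map(f, mylist, chunksize=10)
    print(result[:3])
```

This does not change the performance picture, but it removes the nested-list bookkeeping from both scripts.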
4

1 Answer


The overhead associated with multiprocessing can be higher than the time the individual tasks in the question consume. If you have larger tasks, that overhead (which mostly comes from pickling Python objects to send them between processes) becomes proportionally smaller, and using multiprocessing becomes advantageous.
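One way to see this effect is to compare a trivial task with a CPU-bound one under the same pool. The sketch below uses my own names (`cheap`, `expensive`, `timed_map`) and an arbitrary 200,000-iteration loop as the "large" task; absolute timings will vary by machine:

```python
import multiprocessing
import time

def cheap(x):
    # trivial per-item work: pickling x and the result dominates the runtime
    return "I am worker " + str(x)

def expensive(x):
    # CPU-bound per-item work: enough computation to amortise the pickling cost
    total = 0
    for i in range(200000):
        total += (x * i) % 7
    return total

def timed_map(func, data, workers=4):
    """Map func over data with a worker pool; return (result, elapsed seconds)."""
    start = time.time()
    with multiprocessing.Pool(processes=workers) as pool:
        result = pool.map(func, data)
    return result, time.time() - start

if __name__ == '__main__':
    data = list(range(200))
    _, t_cheap = timed_map(cheap, data)
    _, t_costly = timed_map(expensive, data)
    # for cheap tasks, most of t_cheap is inter-process overhead;
    # for expensive tasks, that overhead is a small fraction of t_costly
    print("cheap tasks: %.3fs, costly tasks: %.3fs" % (t_cheap, t_costly))
```

With the cheap task, a plain loop or list comprehension will usually beat the pool, as in the question; with the expensive task, the pool's speedup approaches the worker count.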

answered 2013-10-02T20:18:22.557