I am fairly new to Python and am currently looking at multiprocessing. I have created a simple example that I assumed would be considerably quicker with multiprocessing than with a single process, but it turns out to be slower. The script builds a list of the integers from 0 to 9999, splits it into shorter lists, and has worker processes turn each element into the string "I am worker [integer]" (the full result is printed at the end). A typical run takes approximately 26 seconds, while the single-process script is 0.5 to 1 second faster. Is there any particular reason why my multiprocessing script is slower? Or why it is a bad example to use for multiprocessing? The code for both scripts is below for reference.
Multiprocessing code:
import multiprocessing
from datetime import datetime

def f(x):
    # Turn each integer in the chunk into an "I am worker" string.
    listagain = []
    for i in x:
        listagain.append("I am worker " + str(i))
    return listagain

def chunks(l, n):
    """Return successive n-sized chunks from l."""
    lister = []
    for i in xrange(0, len(l), n):
        lister.append(l[i:i + n])
    return lister

if __name__ == '__main__':
    startTime = datetime.now()
    mylist = list(xrange(10000))
    size = 10
    listlist = chunks(mylist, size)
    workers = 4
    pool = multiprocessing.Pool(processes=workers)
    result = pool.map(f, listlist)  # each chunk becomes one task
    pool.close()
    pool.join()
    print result
    print (datetime.now() - startTime)
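For reference, here is a minimal variant of the same idea that skips my chunks() helper and instead passes a chunksize to Pool.map, which batches the items itself; I am assuming that batching is roughly equivalent to my manual size-10 chunking (note the result is a flat list here, not a list of lists):

import multiprocessing
from datetime import datetime

def g(i):
    # One integer per task; Pool.map groups tasks into batches of chunksize.
    return "I am worker " + str(i)

if __name__ == '__main__':
    startTime = datetime.now()
    pool = multiprocessing.Pool(processes=4)
    # The third argument is chunksize; 10 mirrors the manual chunking above.
    result = pool.map(g, xrange(10000), 10)
    pool.close()
    pool.join()
    print result
    print (datetime.now() - startTime)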
Single-process code:
from datetime import datetime

def f(x):
    # Walk every chunk, and every item within it, in a single process.
    listagain = []
    for i in x:
        for j in xrange(0, len(i)):
            listagain.append("I am worker " + str(i[j]))
    return listagain

def chunks(l, n):
    """Return successive n-sized chunks from l."""
    lister = []
    for i in xrange(0, len(l), n):
        lister.append(l[i:i + n])
    return lister

if __name__ == '__main__':
    startTime = datetime.now()
    mylist = list(xrange(10000))
    size = 10
    listlist = chunks(mylist, size)
    result = f(listlist)
    print result
    print (datetime.now() - startTime)
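My own guess is that the work per item (one str() call and a list append) is too small to pay for starting the worker processes and pickling the chunks and results back and forth. If that is the reason, would a more CPU-heavy function, like this made-up one, be a fairer test for multiprocessing?

import multiprocessing

def heavy(i):
    # Made-up CPU-bound work, so each task costs far more than the
    # inter-process communication needed to dispatch it.
    total = 0
    for j in xrange(100000):
        total += (i + j) % 7
    return total

if __name__ == '__main__':
    pool = multiprocessing.Pool(processes=4)
    print sum(pool.map(heavy, xrange(1000), 10))
    pool.close()
    pool.join()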