I am new to the IPython parallel package but really want to get it going. What I have is a 4D numpy array which I want to run through slices,rows,columns and process the 4th dimension (time). The processing is a minimization routine that takes a bit of time which is why I would like to parallelize it.
from IPython.parallel import Client
from numpy import *
from matplotlib.pylab import *
c = Client()
v = c.load_balanced_view()
v.block=False
def process( src, freq, d ):
# Get slice, row, col
sl,r,c = src
# Get data
mm = d[:,sl,c,r]
# Call fitting routine
<fiting routine that requires freq, mm and outputs multiple parameters>
return <output parameters??>
## Create the mask of what we are going to process
mask = zeros(d[0].shape)
mask[sl][ nonzero( d[0,sl] > 10*median(d[0]) ) ] = 1
# find all non-zero points in the mask
points = array(nonzero( mask == 1)).transpose()
# Call async
asyncresult = v.map_async( process, points, freq=freq, d=d )
My function "process" requires two parameters: 1) freq is a numpy array (100,1) and 2) d which is (100, 50, 110, 110) or so. I want to retrieve several parameters from the fitting.
All the examples I have seen that use map_async have simple lambda functions etc and the outputs seem to be trivial.
What I want is to apply "process" to every point in d where the mask is not zero and to have maps of the output parameters in the same space. [Added: I am getting "process() takes exactly 3 arguments (1 given) ].
(Step 2 of this might be required as I am passing a huge numpy array "d" to each process. But once I figure out the data passing I should hopefully be able to figure out a more efficient way of doing this.)
Thanks for any help.