I have a directory containing n HDF5 (.h5) files, each of which holds m image stacks to filter. For each image I will run the filtering (Gaussian and Laplacian) using Dask parallel arrays to speed up the processing (ref: Dask). I will use the Dask arrays through the apply_parallel() function in scikit-image. The processing will run on a small server with 20 CPUs.
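
Here is a minimal sketch of the per-image filtering step I have in mind; the file name example.h5, the dataset path "stacks/stack_000", and the chunk/depth values are placeholders, not my actual data:

```python
import h5py
import numpy as np
from skimage.filters import gaussian, laplace
from skimage.util import apply_parallel

def filter_image(img):
    # Gaussian smoothing followed by a Laplacian edge filter
    return laplace(gaussian(img, sigma=2))

# "stacks/stack_000" is a placeholder dataset path
with h5py.File("example.h5", "r") as f:
    stack = f["stacks/stack_000"][...]  # shape: (n_images, height, width)

# apply_parallel wraps each image in a chunked Dask array and maps the
# filter over the chunks; depth adds overlap so the filters see enough
# neighborhood at chunk borders
filtered = np.stack([
    apply_parallel(filter_image, img, chunks=(512, 512), depth=16)
    for img in stack
])
```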
I would like some advice on which parallel strategy makes more sense:
1) Process the h5 files sequentially and give all 20 CPUs to the Dask processing.
2) Process the h5 files in parallel on x cores and use the remaining 20 - x cores for the Dask processing (a rough sketch of this option is below).
3) Distribute the resources across all levels: parallelize over the h5 files, over the images within each h5 file, and give the remaining resources to Dask.
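
To make option 2 concrete, here is roughly what I am picturing (a sketch, not working production code): x worker processes each take one h5 file, and the threaded Dask scheduler inside each process is capped so the total thread count stays around 20. The values of X and DASK_THREADS, the glob pattern, and the dataset layout are all placeholders:

```python
import glob
from concurrent.futures import ProcessPoolExecutor

import dask
import h5py
import numpy as np
from skimage.filters import gaussian, laplace
from skimage.util import apply_parallel

TOTAL_CPUS = 20
X = 4                                  # processes handling h5 files (placeholder)
DASK_THREADS = (TOTAL_CPUS - X) // X   # Dask threads allowed per process

def filter_image(img):
    return laplace(gaussian(img, sigma=2))

def process_file(path):
    # Cap the threaded scheduler that apply_parallel uses in this process
    with dask.config.set(scheduler="threads", num_workers=DASK_THREADS):
        with h5py.File(path, "r") as f:
            for name, dset in f["stacks"].items():  # placeholder layout
                stack = dset[...]
                filtered = np.stack([
                    apply_parallel(filter_image, img, chunks=(512, 512), depth=16)
                    for img in stack
                ])
                # ... write `filtered` back out here
    return path

if __name__ == "__main__":
    files = sorted(glob.glob("data/*.h5"))  # placeholder pattern
    with ProcessPoolExecutor(max_workers=X) as pool:
        for done in pool.map(process_file, files):
            print("finished", done)
```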
Thanks for the help!