I have a directory containing n HDF5 files, each of which holds m image stacks to filter. For each image I will run the filtering (Gaussian and Laplacian) with Dask parallel arrays to speed up the processing, using Dask through scikit-image's apply_parallel() function.
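For context, a minimal sketch of what I mean by the per-image filtering, assuming 2-D images; the sigma, chunk size, and overlap depth are illustrative placeholders:

```python
import numpy as np
from skimage.filters import gaussian, laplace
from skimage.util import apply_parallel

def filter_image(img):
    # Gaussian smoothing followed by a Laplacian filter
    return laplace(gaussian(img, sigma=2, preserve_range=True))

def filter_stack(stack, chunks=(512, 512), depth=16):
    # depth adds overlap between chunks so the filters do not
    # produce seams at chunk boundaries
    return np.stack([
        apply_parallel(filter_image, img, chunks=chunks, depth=depth)
        for img in stack
    ])

# Outer loop over the directory would look something like:
#   with h5py.File(path, "r") as f:
#       filtered = filter_stack(f["stack"][...])   # "stack" is a placeholder dataset name
```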
I will run the processing on a small server with 20 CPUs.
I would like advice on which parallel strategy makes the most sense:
1) Process the h5 files sequentially and give all 20 CPUs to the Dask processing.
2) Process the h5 files in parallel with x cores and use the remaining 20 - x cores for the Dask processing.
3) Split the resources three ways: parallelize over the h5 files, over the images within each h5 file, and give the remaining cores to Dask.
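To make option 2 concrete, here is a rough sketch of what I imagine, assuming a process pool over files with Dask's threaded scheduler capped inside each worker; the choice of x, the dummy computation, and the file paths are hypothetical placeholders:

```python
from concurrent.futures import ProcessPoolExecutor

import dask
import dask.array as da

N_CPUS = 20
X_FILE_WORKERS = 4                                        # hypothetical choice of x
DASK_THREADS = (N_CPUS - X_FILE_WORKERS) // X_FILE_WORKERS

def process_file(path):
    # Cap Dask's threaded scheduler inside this worker so the total
    # thread count across all workers stays near N_CPUS.
    with dask.config.set(scheduler="threads", num_workers=DASK_THREADS):
        # placeholder for the real per-file work (open the HDF5 file and
        # run apply_parallel on each image); here just a tiny dask sum
        return float(da.ones((10, 10)).sum().compute())

def run(paths):
    with ProcessPoolExecutor(max_workers=X_FILE_WORKERS) as pool:
        return list(pool.map(process_file, paths))
```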
Thanks for the help!