
I'm trying to process a bunch of CSV files and return data frames in R, in parallel using mclapply(). I have a 64-core machine, but I can't seem to get more than 1 core utilized using mclapply(). In fact, it is currently a bit quicker to run lapply() than mclapply(). Here is an example showing that mclapply() is not utilizing the cores available:

library(parallel)

test <- lapply(1:100, function(x) rnorm(10000))
system.time(x <- lapply(test, function(x) loess.smooth(x, x)))
system.time(x <- mclapply(test, function(x) loess.smooth(x, x), mc.cores = 32))

## lapply()
   user  system elapsed
  0.000   0.000   7.234
## mclapply()
   user  system elapsed
  0.000   0.000   8.612

Is there some trick to getting this working? I had to compile R from source on this machine (v3.0.1); are there some compile flags I missed that would allow forking? detectCores() tells me that I do indeed have 64 cores to play with... Any tips appreciated!
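For what it's worth, a quick sanity check that forking is actually happening (each forked worker reports its own process ID, so seeing more than one distinct PID means the children really are separate processes; on Windows mclapply() cannot fork at all and mc.cores must be 1):

library(parallel)
detectCores()  # reports 64 here
## distinct PIDs across workers => forking works
unique(unlist(mclapply(1:8, function(i) Sys.getpid(), mc.cores = 8)))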


1 Answer


I get results similar to yours, but if I change rnorm(10000) to rnorm(100000), I get a significant speedup. My guess is that the extra overhead of forking cancels out any performance benefit for a problem this small.
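For example, the same benchmark with the larger input (a sketch reusing your code; the exact speedup and the best mc.cores value will vary by machine):

library(parallel)

## 100000 values per vector instead of 10000: each task now does
## enough work to amortize the cost of forking the workers
test <- lapply(1:100, function(x) rnorm(100000))
system.time(x <- lapply(test, function(x) loess.smooth(x, x)))
system.time(x <- mclapply(test, function(x) loess.smooth(x, x), mc.cores = 32))

With inputs this size, the mclapply() timing should come out well under the lapply() one, because each forked worker spends most of its time in loess.smooth() rather than in process setup.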

answered 2014-04-11T19:34:39.840