r - Parallelizing large simulations over a grid in R

Question

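I am running a series of large simulations over a grid. I run the simulations row by row, and I have found that my sampling function is a bottleneck. I tried to speed things up with the foreach and doMC libraries, but either the parallel approach is slower or I cannot write a function that foreach interprets correctly.

Looking at some other posts, it seems my approach with foreach may be misguided, because the number of jobs I am attempting greatly exceeds the number of available processors. I wonder whether people have suggestions for how best to implement parallelization in my situation. My simulations generally come in two types. In the first, I compute a matrix that contains a sampling interval (the rows) for each element of the grid row I am processing. I then sample using runif (in the real simulation my rows contain ~9000 cells, and I am performing 10000 simulations).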

#number of simulations per element 
n = 5

#Generate an example sampling interval.
m.int1 <- matrix ( seq ( 1, 20, 1 ), ncol=10, nrow=2 )

#Define a function to sample over the interval defined in m.int1
f.rand1 <- function(a) {
return ( runif ( n, a[1], a[2] ) )
}

#run the simulation, with each column corresponding to a grid-row element
#and each row to one simulation.
sim1 <- round( apply ( m.int1, 2, f.rand1 ) )

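In the second case, I am trying to sample from a set of empirical distributions that are indexed by column in a matrix. The value of a grid-row element corresponds to the column to be sampled.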

#number of simulations per element 
n = 5

#generate a vector representing a row of grid values 
v.int2 <- round(runif(10,1,3))

#define matrix of data that contains the distributions to be sampled.
m.samples<-cbind(rep(5,10),rep(4,10),rep(3,10))  

f.sample <- function(a) {
return ( sample ( m.samples [ ,a], n ) )
}

#Sample m.samples indexed by column number.
sim2<- sapply(v.int2,f.sample)

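In the second example I was able to run in parallel using foreach() and %dopar%, but the simulation took substantially longer than the serial code. In the first example above, I could not write a proper function to take advantage of foreach parallelization. I will include the code I used in the second case just to show my thinking, but I now realize that my approach is too expensive in overhead.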

library(foreach)
library(doMC)
registerDoMC(2)

n = 5

#Sample m.samples indexed by column number using parallel method.
sim2.par <- foreach ( i = 1 : length ( v.int2 ), 
    .combine="cbind") %dopar% sample ( 
     m.samples [ , v.int2 [i] ] , n )

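I would appreciate some suggestions on an approach (and some code!) that will help me use parallelization efficiently. Again, the rows I am processing generally contain about 9000 elements, and we are conducting 10000 simulations per element. So my output simulation matrices are generally on the order of 10000 x 9000. Thanks for your help.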

score 1 · Accepted Answer

Try this instead of the two-step process. It skips the apply step:

f.rand2 <- function(a) {
  matrix( runif ( n*ncol(a), rep(a[1,], n) , rep(a[2,], n) ), nrow=ncol(a) )
}

f.rand2(m.int1)
           [,1]      [,2]      [,3]      [,4]      [,5]
 [1,]  1.693183  1.404336  1.067888  1.904476  1.161198
 [2,]  3.411118  3.852238  3.621822  3.969399  3.318809
 [3,]  5.966934  5.466153  5.624387  5.646181  5.347473
 [4,]  7.317181  7.106791  7.403022  7.442060  7.161711
 [5,]  9.491231  9.656023  9.518498  9.569379  9.812931
 [6,] 11.843074 11.594308 11.706276 11.744094 11.994256
 [7,] 13.375382 13.599407 13.416135 13.634053 13.539246
 [8,] 15.948597 15.532356 15.692132 15.442519 15.627716
 [9,] 17.856878 17.208313 17.804288 17.875288 17.232867
[10,] 19.214776 19.689534 19.732680 19.813718 19.866297
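
Note that f.rand2 returns one row per grid element (10 x n here) and does not round, whereas sim1 in the question has one column per grid element (n x 10) and is rounded. If that orientation matters downstream, a minimal sketch of recovering it could look like this (sim1.vec is a hypothetical name, not part of the original code):

```r
n <- 5
m.int1 <- matrix(seq(1, 20, 1), ncol = 10, nrow = 2)

f.rand2 <- function(a) {
  matrix(runif(n * ncol(a), rep(a[1, ], n), rep(a[2, ], n)), nrow = ncol(a))
}

# Transpose so that columns are grid elements and rows are simulations,
# then round as in the question's sim1.
sim1.vec <- round(t(f.rand2(m.int1)))
dim(sim1.vec)  # n rows (simulations) by ncol(m.int1) columns (grid elements)
```

The transpose is cheap compared with the sampling itself, so this should not undo the gain from dropping apply.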

For me, it cut the time in half:

> system.time(x1 <- replicate(n, round(apply(m.int1, 2, f.rand1))))
   user  system elapsed 
  1.088   0.470   1.550 

> system.time(x1 <- replicate(n, f.rand2(m.int1)))
   user  system elapsed 
  0.559   0.256   0.811
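
The question suspects that one foreach task per grid element is too fine-grained. A sketch of the coarser alternative, assuming a Unix-like system and the base parallel package: split the columns into one chunk per core and run vectorized code inside each chunk. The names n.cores, chunks, sim.chunk and sim1.par are hypothetical, and whether this beats the serial vectorized version at 10000 x 9000 is untested here.

```r
library(parallel)

n <- 5
m.int1 <- matrix(seq(1, 20, 1), ncol = 10, nrow = 2)

# Assumption: 2 workers, matching registerDoMC(2) in the question.
# mclapply() forks, so mc.cores > 1 is not available on Windows.
n.cores <- 2

# Split the column indices into one contiguous chunk per worker.
chunks <- splitIndices(ncol(m.int1), n.cores)

# Each worker draws all n simulations for its block of columns in one
# vectorized runif() call and returns an n x length(idx) matrix.
sim.chunk <- function(idx) {
  lo <- m.int1[1, idx]
  hi <- m.int1[2, idx]
  matrix(round(runif(n * length(idx), min = lo, max = hi)),
         nrow = n, byrow = TRUE)
}

RNGkind("L'Ecuyer-CMRG")  # generator designed for parallel random streams
set.seed(1)
sim1.par <- do.call(cbind, mclapply(chunks, sim.chunk, mc.cores = n.cores))
```

With only as many tasks as cores, the scheduling overhead is paid a handful of times rather than once per grid element.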

score 1 · Accepted Answer

Here is a slight improvement for your first simulation. A larger n may yield a greater gain in run time.

> n <- 1000
> m.int1 <- matrix ( seq ( 1, 20, 1 ), ncol=10, nrow=2 )
> f.rand1 <- function(a) {
+    return(runif(n, a[1], a[2]))
+ }
> system.time(x1 <- replicate(n, round(apply(m.int1, 2, f.rand1))))
   user  system elapsed 
   2.84    0.06    2.95 
> system.time(x2 <- replicate(n, matrix(round(runif(n*10, min = m.int1[1, ], max = m.int1[2, ])), ncol = 10, byrow = TRUE)))
   user  system elapsed 
   2.48    0.06    2.61 
> head(x1[,,1])
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    1    4    5    7   10   12   13   16   17    20
[2,]    1    3    6    7   10   11   13   16   17    19
[3,]    1    3    6    7   10   12   14   16   18    20
[4,]    2    4    5    7    9   12   14   16   17    19
[5,]    1    4    5    7   10   12   14   16   17    20
[6,]    1    4    6    8    9   11   13   15   18    20
> head(x2[,,1])
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    2    4    6    7    9   12   14   16   17    20
[2,]    1    3    6    8   10   12   14   15   18    20
[3,]    2    4    5    7    9   11   13   15   17    20
[4,]    2    3    5    7    9   11   14   15   17    19
[5,]    2    3    6    7    9   12   13   16   17    20
[6,]    2    4    6    7   10   12   14   16   17    20
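
The second simulation in the question (sampling from empirical distributions indexed by column) can probably be vectorized in the same spirit. The following is only a sketch under one loud assumption: draws are made with replacement, unlike the question's sample() call, which draws without replacement and cannot return more values than m.samples has rows. The names rows, cols and sim2.vec are hypothetical.

```r
n <- 5

# Example inputs from the question.
v.int2 <- round(runif(10, 1, 3))
m.samples <- cbind(rep(5, 10), rep(4, 10), rep(3, 10))

# Assumption: sampling WITH replacement is acceptable here.
# Draw all row positions at once, n per grid element.
rows <- sample.int(nrow(m.samples), n * length(v.int2), replace = TRUE)

# Repeat each grid element's column index n times, in grid order.
cols <- rep(v.int2, each = n)

# Index with a two-column (row, column) matrix, then reshape so that each
# column holds the n draws for one grid element.
sim2.vec <- matrix(m.samples[cbind(rows, cols)], nrow = n)
```

This replaces one sample() call per grid element with a single sample.int() call and a single matrix lookup, which is the same idea the answers apply to runif in the first simulation.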
