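I know you asked this a long time ago, but I thought I'd share a solution that becomes more efficient than a `for` loop once the number of rows gets very large. With a small number of rows the speed difference is negligible (the `for` loop may even be faster). It relies only on subsetting and `rowSums`, and is very simple: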
## For reproducibility
set.seed( 35471 )
## Example data - bigger than the original to get an idea of the difference in speed
x<-matrix(rnorm(60),20,3)
y<-matrix(rnorm(300),100,3)
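# My function, which uses expand.grid to get all combinations of row indices, then rowSums to operate on them
rs <- function( x , y ){
  # One row per (i, j) pair; the first column (index into x) varies fastest
  rows <- expand.grid( 1:nrow(x) , 1:nrow(y) )
  # Elementwise product of the paired rows, summed per pair, then reshaped to nrow(x) by nrow(y)
  results <- matrix( rowSums( x[ rows[,1] , ] * y[ rows[,2] , ] ) , nrow(x) , nrow(y) )
  return(results)
}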
# Your original function
flp <- function(x ,y){
  results<-matrix(NA,nrow(x),nrow(y))
  for (i in 1:nrow(x)){
    for (j in 1:nrow(y)){
      r1<-x[i,]
      r2<-y[j,]
      results[i,j]<-sum(r1*r2) ## Example function
    }
  }
  return(results)
}
## Benchmark timings:
library(microbenchmark)
microbenchmark( rs( x, y ) , flp( x ,y ) , times = 100L )
# Unit: microseconds
#       expr      min       lq     median        uq      max neval
#   rs(x, y)  487.500  527.396   558.5425   620.486   679.98   100
#  flp(x, y) 9253.385 9656.193 10008.0820 10430.663 11511.70   100
## And a subset of the results returned from each function to confirm they return the same thing!
flp(x,y)[1:3,1:3]
#            [,1]       [,2]       [,3]
# [1,] -0.5528311  0.1095852  0.4461507
# [2,] -1.9495687  1.7814502 -0.3769874
# [3,]  1.8753978 -3.0908057  2.2341414
rs(x,y)[1:3,1:3]
#            [,1]       [,2]       [,3]
# [1,] -0.5528311  0.1095852  0.4461507
# [2,] -1.9495687  1.7814502 -0.3769874
# [3,]  1.8753978 -3.0908057  2.2341414
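The two functions agree because `expand.grid` varies its first argument fastest, and `matrix()` fills column by column, so the k-th row sum lands in exactly the `[i, j]` cell that the loop would have written. A minimal sketch of that ordering, using a made-up 2 x 3 index grid (the names `rows` and `idx` here are only for illustration):

```r
# Hypothetical tiny grid: 2 "rows of x" paired with 3 "rows of y"
rows <- expand.grid(1:2, 1:3)
print(rows)
#   Var1 Var2
# 1    1    1
# 2    2    1
# 3    1    2
# 4    2    2
# 5    1    3
# 6    2    3

# Label each pair as "i,j" and reshape the same way rs() reshapes its row sums
idx <- matrix(paste(rows[, 1], rows[, 2], sep = ","), nrow = 2, ncol = 3)
print(idx)  # cell [i, j] holds the label "i,j"
```

If the arguments to `expand.grid` were swapped, the values would still be computed but would end up in the wrong cells, so the order matters.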
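So you can see that by using `rowSums` and subsetting we can be nearly 20 times faster than the `for` loop (about 18x on the median timings above), even when the number of row combinations is only 2000. If you have more, the difference in speed will be even greater.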
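One caveat worth noting: `rs()` builds two temporary matrices with `nrow(x) * nrow(y)` rows each, so its memory use grows with the number of combinations. For this particular example function (a sum of elementwise products, i.e. a dot product of two rows), the same result can also be had from a plain matrix product. This is a different technique from the one above and only applies when your real function is a dot product; a sketch under that assumption:

```r
set.seed(35471)
x <- matrix(rnorm(60), 20, 3)
y <- matrix(rnorm(300), 100, 3)

# The expand.grid / rowSums approach from above
rs <- function(x, y) {
  rows <- expand.grid(1:nrow(x), 1:nrow(y))
  matrix(rowSums(x[rows[, 1], ] * y[rows[, 2], ]), nrow(x), nrow(y))
}

# Dot product of every row of x with every row of y: x %*% t(y)
mp <- tcrossprod(x, y)

# Compare the two results up to floating-point tolerance
same <- isTRUE(all.equal(rs(x, y), mp))
print(same)
print(dim(mp))
```

For any other per-pair function, the `expand.grid` approach is the more general one, as long as the function can be written in terms of vectorised row operations.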
HTH.