r - Vectorized order in R

Question

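I am trying to order each row of a matrix that has few columns and many rows. Is there a vectorized version of this in R? More concretely, let's set the seed to 10 and make an example matrix: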

set.seed(10)
example.matrix = replicate(12,runif(500000))

To order example.matrix, I would do:

ordered.example = apply(example.matrix,1,order)
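
For reference, apply() over rows returns one column per input row, so the result above is 12 x 500000 rather than 500000 x 12. A minimal sketch on a tiny made-up matrix (the names `m` and `o` are illustrative, not from the original post):

```r
# Hypothetical 2 x 3 matrix
m <- rbind(c(0.3, 0.1, 0.2),
           c(0.2, 0.9, 0.5))

# Column i of o holds order(m[i, ])
o <- apply(m, 1, order)
dim(o)
# [1] 3 2

# Transpose to get one row of order indices per input row
t(o)
#      [,1] [,2] [,3]
# [1,]    2    3    1
# [2,]    1    3    2
```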

But this is slow, and I would love something faster. As an analogy,

rowSums(example.matrix)

is preferable to

apply(example.matrix,1,sum)

Much appreciated.

score 3 · Accepted Answer

This is somewhat faster (the key is order(row(em), em)):

set.seed(10)
em <- replicate(12,runif(500000))
# sort by row index first, then by value within each row; reshape row by row
system.time(a <- matrix(em[order(row(em), em)], nrow=nrow(em), byrow=TRUE))
#    user  system elapsed 
#    5.36    0.12    5.80 

set.seed(10)
example.matrix <- replicate(12,runif(500000))
system.time(ordered.example <- apply(example.matrix,1,order))
#    user  system elapsed 
#   13.36    0.09   15.52 

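# a holds the sorted values of each row (500000 x 12), whereas ordered.example
# holds order indices with one column per input row (12 x 500000), so they differ
identical(a, ordered.example)
# [1] FALSE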
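
Indexing `em` with `order(row(em), em)` yields the sorted values of each row, while `apply(..., 1, order)` yields index positions. If the ordering indices are what is needed, one possible adaptation is to index `col(em)` instead of `em`. This is a sketch and an assumption on my part, not part of the original answer; `idx` and `ref` are illustrative names, and the matrix is shrunk for a quick check:

```r
set.seed(10)
em <- replicate(12, runif(1000))   # smaller than the original, for a quick check

# order(row(em), em) sorts the linear indices by row first, then by value;
# col(em)[...] maps each of those positions back to its column number
idx <- matrix(col(em)[order(row(em), em)], nrow = nrow(em), byrow = TRUE)

# Reference result, transposed so that row i holds order(em[i, ])
ref <- t(apply(em, 1, order))

identical(idx, ref)
```

No timing is claimed for this variant; it would need to be measured on the real data.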

score 3 · Accepted Answer

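Here is a way to speed it up by a factor of 10. It is tailored specifically to your example; depending on what your real data looks like, this method may or may not work.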

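The idea is to add 0 to the first row, 1 to the second row, and so on, then collapse everything into a single vector, order it, and reassemble the result into a matrix. Because runif() values lie between 0 and 1, the shifted rows occupy non-overlapping ranges, so a single global ordering keeps each row's elements together: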

N = 12; M = 500000; d = replicate(N,runif(M))

system.time(d1<-t(apply(d, 1, order)))
#   user  system elapsed 
#  11.26    0.06   11.34 

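# shift row i by i - 1, flatten row by row, order once, then convert the
# global positions back to within-row positions 1..N
system.time(d2<-matrix(order(as.vector(t(matrix(as.vector(d) + 0:(M-1), nrow = M)))) -
                       rep(0:(M-1), each = N)*N, nrow = M, byrow = TRUE))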
#   user  system elapsed 
#   1.39    0.14    1.53 

# Note: identical(d1, d2) returns FALSE because d1 is an integer matrix and d2 is a
# double matrix (N is stored as a double); the values themselves are the same
sum(abs(d1-d2))
# 0
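
The same idea can be followed step by step on a tiny matrix. This is a minimal sketch, assuming all values lie in [0, 1) as with runif(); the names `x`, `shifted`, `pos`, `res`, and `ref` are illustrative and not from the original answer:

```r
set.seed(1)
N <- 4; M <- 3
x <- replicate(N, runif(M))            # M x N matrix of values between 0 and 1

# Add i - 1 to every value in row i, so row i occupies the interval [i - 1, i)
shifted <- x + (row(x) - 1)

# Flatten row by row and order once; each row's elements stay contiguous
# because the shifted row ranges do not overlap
pos <- order(as.vector(t(shifted)))

# Convert global positions to within-row positions 1..N
res <- matrix(pos - rep(0:(M - 1), each = N) * N, nrow = M, byrow = TRUE)

# Reference result from the row-wise apply
ref <- t(apply(x, 1, order))

# res is double (N is a double), ref is integer; align the types before comparing
storage.mode(res) <- "integer"
identical(res, ref)
```

For data outside [0, 1), the values would first need to be rescaled into that range, or the offsets enlarged to exceed the data's range, before the trick applies.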
