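I'm using the bigmemory and biganalytics packages, and specifically trying to compute the mean of a big.matrix object. The biganalytics documentation (e.g. ?biganalytics) suggests that mean() should be available for big.matrix objects, but this fails: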
library(bigmemory)
library(biganalytics)
x <- big.matrix(5, 2, type="integer", init=0,
                dimnames=list(NULL, c("alpha", "beta")))
x
# An object of class "big.matrix"
# Slot "address":
# <pointer: 0x00000000069a5200>
x[,1] <- 1:5
x[,]
# alpha beta
# [1,] 1 0
# [2,] 2 0
# [3,] 3 0
# [4,] 4 0
# [5,] 5 0
mean(x)
# [1] NA
# Warning message:
# In mean.default(x) : argument is not numeric or logical: returning NA
Some things do work, though:
colmean(x)
# alpha beta
# 3 0
sum(x)
# [1] 15
mean(x[])
# [1] 1.5
mean(colmean(x))
# [1] 1.5
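The last two results agree because every column has the same number of rows and there are no NAs, so the mean of the column means equals the overall mean. A minimal base R sketch of that check, using an ordinary matrix as a stand-in for the big.matrix (the names m, overall and via_cols are made up for illustration):

```r
# Base R only: a plain integer matrix with the same contents as x above.
m <- matrix(0L, nrow = 5, ncol = 2,
            dimnames = list(NULL, c("alpha", "beta")))
m[, 1] <- 1:5

overall  <- mean(m)            # mean over all 10 cells
via_cols <- mean(colMeans(m))  # mean of the two column means

# Equal here only because all columns have the same length and no NAs.
all.equal(overall, via_cols)
```

If columns contained NAs that were dropped per column, the two quantities could differ, so I would treat this as a workaround under those assumptions rather than a general replacement for mean().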
Without mean(), it seems mean(colmean(x)) is the next best thing:
# try it on something bigger
x = big.matrix(nrow=10000, ncol=10000, type="integer")
x[] <- c(1:(10000*10000))
mean(colmean(x))
# [1] 5e+07
mean(x[])
# [1] 5e+07
system.time(mean(colmean(x)))
# user system elapsed
# 0.19 0.00 0.19
system.time(mean(x[]))
# user system elapsed
# 0.28 0.11 0.39
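Presumably a working mean() could be faster still, especially for rectangular matrices with a large number of columns.
Any ideas why mean() isn't working for me?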