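Unfortunately, there isn't much that can be done to optimize your task. There certainly are some possible optimizations for generating random points from a truncated distribution... but here's the thing: generating 10^8 or so points from a random distribution is going to be slow no matter what.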
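Here are a few optimizations I tried in an attempt to speed things up:
Generating all the uniform random numbers at once, outside the loop (drawn on [0, 1], then rescaled to the truncation interval inside the loop)
Going back to the definition of a truncated distribution, without relying on the "fancy" packages (distr, distrEx, truncdist)
Compiling my code to speed it up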
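The second point, going back to the definition, can be sketched as follows. This is a minimal illustration of inverse-CDF sampling, not the internals of truncdist: the helper name `rtrunc_gamma` is hypothetical, and the parameters `shape = 2`, `scale = 1` are arbitrary values chosen for the example.

```r
# Hypothetical helper (not from any package): inverse-CDF sampling
# from a gamma distribution truncated to [a, b].
rtrunc_gamma <- function(n, a, b, shape, scale) {
  p_a <- pgamma(a, shape = shape, scale = scale)  # CDF at the lower bound
  p_b <- pgamma(b, shape = shape, scale = scale)  # CDF at the upper bound
  u <- runif(n)                                   # uniforms on [0, 1]
  # Map the uniforms into [F(a), F(b)], then invert the CDF
  qgamma(p_a + u * (p_b - p_a), shape = shape, scale = scale)
}

set.seed(42)
x <- rtrunc_gamma(1000, a = 0.25, b = 0.5, shape = 2, scale = 1)
range(x)  # should fall within [0.25, 0.5], up to floating-point rounding
```

Only base R functions (`runif`, `pgamma`, `qgamma`) are used here, which is why no extra package needs to be loaded.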
Code:
# your original code, in a function
func = function()
{
  library(distr)
  library(distrEx)
  library(truncdist)
  set.seed(42)
  shape.list <- runif(1000, max = 10, min = 0.01)
  scale.list <- runif(1000, max = 100000, min = 100000)
  mean.list <- list()
  std.dev.list <- list()
  ITE.NUMBER = 10
  POINTS.NUMBER = 100000
  A = 0.25
  B = 0.5
  for (i in seq(ITE.NUMBER)) # very slow
  {
    sample.points <- rtrunc(POINTS.NUMBER, spec="gamma", a = A, b = B,
                            shape = shape.list[[i]], scale = scale.list[[i]])
    sample.mean <- mean(sample.points)
    mean.list <- append(mean.list, sample.mean)
    sample.std.dev <- sd(sample.points)
    std.dev.list <- append(std.dev.list, sample.std.dev)
  }
}
# custom code
func2 = function()
{
  set.seed(42)
  shape.list <- runif(1000, max = 10, min = 0.01)
  scale.list <- runif(1000, max = 100000, min = 100000)
  mean.list <- list()
  std.dev.list <- list()
  ITE.NUMBER = 10
  POINTS.NUMBER = 100000
  A = 0.25
  B = 0.5
  #
  # we generate all the random numbers at once, outside the loop
  #
  r <- runif(POINTS.NUMBER*ITE.NUMBER, min = 0, max = 1)
  for (i in seq(ITE.NUMBER)) # still very slow
  {
    # block of uniforms for iteration i (indexed by i, not ITE.NUMBER,
    # so that each iteration uses its own slice of r)
    u <- r[(1+POINTS.NUMBER*(i-1)):(POINTS.NUMBER*i)]
    #
    # back to the definition of the truncated gamma:
    # rescale u into [F(A), F(B)], then invert the gamma CDF
    #
    p.a <- pgamma(A, shape = shape.list[[i]], scale = scale.list[[i]])
    p.b <- pgamma(B, shape = shape.list[[i]], scale = scale.list[[i]])
    sample.points <- qgamma(p.a + u * (p.b - p.a),
                            shape = shape.list[[i]], scale = scale.list[[i]])
    sample.mean <- mean(sample.points)
    mean.list <- append(mean.list, sample.mean)
    sample.std.dev <- sd(sample.points)
    std.dev.list <- append(std.dev.list, sample.std.dev)
  }
}
#
# maybe a compilation would help?
#
require(compiler)
func2_compiled <- cmpfun(func2)
require(microbenchmark)
microbenchmark(func2(), func2_compiled(), func(), times=10)
This gives the following:
Unit: seconds
             expr      min       lq   median       uq      max neval
          func2() 1.462768 1.465561 1.475692 1.489235 1.532693    10
 func2_compiled() 1.403956 1.477983 1.487945 1.499133 1.515504    10
           func() 1.457553 1.478829 1.502671 1.510276 1.513486    10
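Conclusions:
As mentioned above, there is little room for improvement: your task is very resource-intensive, and there isn't much that can be done about it.
Compilation, if anything, makes things slightly worse... which is to be expected: the code does not rely on poor programming techniques (such as big, ugly loops) that compilation could speed up.
If you are really after speed improvements, you might be better off using another language, although I doubt you would get noticeably better performance.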
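One way to check where the time goes is to time the pieces separately. The sketch below only shows how to measure; the parameters (`shape = 2`, `scale = 1`, uniforms on [0.1, 0.9]) are arbitrary assumptions, and the actual timings depend on your machine. If most of the time is spent inside `qgamma`, which is implemented in compiled C code within R, byte-compiling the surrounding R loop has little left to speed up.

```r
# Illustrative timing only; the parameter values are arbitrary assumptions.
set.seed(42)
n <- 1e5
u <- runif(n, min = 0.1, max = 0.9)

# Time spent inverting the gamma CDF for n probabilities
t_qgamma <- system.time(x <- qgamma(u, shape = 2, scale = 1))[["elapsed"]]

# Time spent on the summary statistics computed in each loop iteration
t_stats <- system.time({ m <- mean(x); s <- sd(x) })[["elapsed"]]

c(qgamma = t_qgamma, stats = t_stats)
```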