Using:
boot * 1.3-18 2016-02-23 CRAN (R 3.2.3)
data.table * 1.9.7 2015-10-05 Github (Rdatatable/data.table@d607425)
I get an error with both the OP's code and the answer provided by @eddi:
data <- as.data.table(list(x1 = runif(200), x2 = runif(200), group = runif(200)>0.5))
stat <- function(x, i) {x[i, c(m1 = mean(x1), m2 = mean(x2)), by = "group"]}
data[, list(list(boot(.SD, stat, R = 10))), by = group]$V1
This produces the error message:
Error in eval(expr, envir, enclos) : object 'group' not found
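Removing by = "group" from the stat function fixes the error. When the outer call already uses by = group, the grouping column is not part of .SD, so stat cannot group by it again: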
set.seed(1000)
data <- as.data.table(list(x1 = runif(200), x2 = runif(200), group = runif(200)>0.5))
stat <- function(x, i) {x[i, c(m1 = mean(x1), m2 = mean(x2))]}
data[, list(list(boot(.SD, stat, R = 10))), by = group]$V1
This yields the following bootstrap results:
[[1]]
ORDINARY NONPARAMETRIC BOOTSTRAP
Call:
boot(data = .SD, statistic = stat, R = 10)
Bootstrap Statistics :
original bias std. error
t1* 0.5158232 0.004930451 0.01576641
t2* 0.5240713 -0.001851889 0.02851483
[[2]]
ORDINARY NONPARAMETRIC BOOTSTRAP
Call:
boot(data = .SD, statistic = stat, R = 10)
Bootstrap Statistics :
original bias std. error
t1* 0.5142383 -0.0072475030 0.02568692
t2* 0.5291694 -0.0001509404 0.02378447
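To see why the original stat could not find 'group', it may help to look at which columns .SD actually carries inside a grouped call. This is only a minimal sketch; sd_cols is a name I made up for illustration:

```r
library(data.table)

set.seed(1000)
data <- data.table(x1 = runif(200), x2 = runif(200), group = runif(200) > 0.5)

# Collect the column names that .SD exposes within each group.
# The grouping column is excluded from .SD by default.
sd_cols <- data[, .(cols = list(names(.SD))), by = group]

print(sd_cols$cols[[1]])
# [1] "x1" "x2"
```

Since boot() only ever receives .SD, any reference to group inside stat has nothing to resolve against.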
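Below, I modify the example dataset to make it obvious which bootstrap statistic belongs to which group and column:
Consider group 1 with a mean of 10 for x1 and 10000 for x2, and group 2 with a mean of 2000 for x1 and 8000 for x2: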
data2 <- as.data.table(list(x1 = c(runif(100, 9,11),runif(100, 1999,2001)), x2 = c(runif(100, 9999,10001),runif(100, 7999,8001)), group = rep(c(1,2), each=100)))
stat <- function(x, i) {x[i, c(m1 = mean(x1), m2 = mean(x2))]}
data2[, list(list(boot(.SD, stat, R = 10))), by = group]$V1
This gives:
[[1]]
ORDINARY NONPARAMETRIC BOOTSTRAP
Call:
boot(data = .SD, statistic = stat, R = 10)
Bootstrap Statistics :
original bias std. error
t1* 10.00907 0.007115938 0.04349184
t2* 9999.90176 -0.019569568 0.06160653
[[2]]
ORDINARY NONPARAMETRIC BOOTSTRAP
Call:
boot(data = .SD, statistic = stat, R = 10)
Bootstrap Statistics :
original bias std. error
t1* 1999.965 0.031694179 0.06561209
t2* 8000.110 -0.006569872 0.03992401
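If you would rather have the group label attached to each statistic instead of reading it off the list position, one option is to return the numbers from the boot object directly. This is a sketch under my own assumptions: summary_dt and the column names statistic, original, bias and std_error are illustrative, and the bias and standard error are recomputed from the t0 and t components of the boot object rather than taken from the printed output:

```r
library(boot)
library(data.table)

set.seed(1000)
data2 <- data.table(
  x1 = c(runif(100, 9, 11), runif(100, 1999, 2001)),
  x2 = c(runif(100, 9999, 10001), runif(100, 7999, 8001)),
  group = rep(c(1, 2), each = 100)
)

# Same statistic as above: per-resample means of x1 and x2
stat <- function(x, i) {x[i, c(m1 = mean(x1), m2 = mean(x2))]}

# One row per (group, statistic); data.table adds the group column itself
summary_dt <- data2[, {
  b <- boot(.SD, stat, R = 10)
  list(
    statistic = c("m1", "m2"),
    original  = b$t0,                   # statistic on the original data
    bias      = colMeans(b$t) - b$t0,   # mean of the replicates minus original
    std_error = apply(b$t, 2, sd)       # standard deviation of the replicates
  )
}, by = group]

print(summary_dt)
```

The result has four rows, so each estimate sits next to the group and the statistic it belongs to.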