r - Avoiding the error in R's boot.ci function when all values in the sample are equal

Question

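I have a number of data sets that serve as input to a function. The data are stored in a data.table, and I am calculating a confidence interval for the function's output. In some cases, however, all the input values are identical, which produces the error: "All values of t are equal to 100 \n Cannot calculate confidence intervals". How can I avoid this error, for example by simply setting the interval to some arbitrary value such as 0 or NA whenever all values are equal? For example: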

library(boot)
library(data.table)

problem=1

data<-data.table(column1=c(1:100),column2=c(rep(100,99),problem))
resample.number=1000
confidence=0.95

sample.mean<-function(indata,x){mean(indata[x])}

boot_obj<-lapply(data,boot,statistic = sample.mean,R = resample.number)

boot.mean.f<-function(x,column){
    x[column][1]
}

means<-data.table(sapply(boot_obj,boot.mean.f))
bootci_obj<-lapply(boot_obj,boot.ci, conf = confidence, type = "perc")
bootci.f<-function(x,column){
    x<-x[column][4]
    x<-unlist(strsplit(as.character(x[1]),","))
    x<-sub("[:punct:].*","",x)
    x<-sub("lis.*","",x)
    x<-sub(").?","",x)
    x<-na.omit(as.numeric(x))
}

cis<-data.table(t(sapply(bootci_obj,bootci.f)))
setnames(means,"V1","stat")

cis[,V1:=NULL]
cis[,V2:=NULL]
setnames(cis,c("V3","V4"),c("lci","uci"))

cbind(means,cis)

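Returns:

    stat      lci       uci
1:  50.5 44.96025  56.26797
2: 99.01 97.03000 100.00000

Changing to

problem=100

returns: "All values of t are equal to 100 \n Cannot calculate confidence intervals", which then causes further errors.

I would like the result to be:

    stat      lci       uci
1:  50.5 44.96025  56.26797
2: 100.0  0.00000   0.00000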

score 9 · Accepted Answer

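I stacked the data.table, because working with a data.table in long format is more efficient. I also prefer to set the confidence limits to the same value as the mean when all values are equal. Feel free to adjust that.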

library(boot)
library(data.table)

DT <- data.table(column1=1:100,column2=rep(100,100))
DT <- data.table(stack(DT))

resample.number=1000
confidence=0.95

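# Statistic for boot(): first argument is the data, second the resampled indices
sample.mean <- function(indata,x){mean(indata[x])}
# Mean with percentile bootstrap limits; falls back to the mean itself
# when all values are equal and boot.ci cannot compute an interval
ci.mean <- function(x, resample.number,confidence) {
  if(length(unique(x)) > 1) {
    temp <- boot.ci(boot(x,statistic = sample.mean,R = resample.number), conf = confidence, type = "perc")$percent
    # columns 4 and 5 of $percent hold the lower and upper limits
    list(mean=mean(x),lwr=temp[,4],upr=temp[,5])
  } else {
    list(mean=mean(x),lwr=mean(x),upr=mean(x))
  }
}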

set.seed(42)
DT[,ci.mean(values,resample.number,confidence),by=ind]

#       ind  mean       lwr       upr
#1: column1  50.5  44.92305  55.93949
#2: column2 100.0 100.00000 100.00000
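
For readers wondering where the `values` and `ind` columns in the grouped call come from: they are the names that base R's `stack()` gives to the long-format result. The following is a small sketch on a made-up table (the column names `a` and `b` are purely illustrative, and it uses base R only rather than data.table) showing the reshaping and a per-group check for all-equal columns.

```r
# Toy wide table; the column names a and b are made up for illustration.
wide <- data.frame(a = c(1, 2, 3), b = c(10, 10, 10))

# stack() concatenates the columns into `values` and records the
# originating column name in the factor `ind`.
long <- stack(wide)

# Grouped summary over the long format, one result per original column.
group_means <- tapply(long$values, long$ind, mean)

# Number of distinct values per group: 1 flags an all-equal column.
n_distinct <- tapply(long$values, long$ind, function(v) length(unique(v)))
```

Here `n_distinct` plays the same role as the `length(unique(x)) > 1` test inside `ci.mean`, just computed for every group at once.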

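Note that when all values are equal, boot.ci does not actually throw an error: it only prints that message and returns no interval (NULL in current versions of boot). If an empty or NA result is acceptable to you, you can handle that return value directly instead of testing for unique values first.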
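
If NA limits are what you want, one option is to inspect what `boot.ci` returns instead of checking the input. The sketch below is not part of the original answer: `ci.mean.na` is a hypothetical helper name, and it assumes that in the degenerate case `boot.ci` prints its message and returns NULL, which is worth verifying against your installed version of boot. A side benefit of checking the result is that it should also cover statistics that come out constant across resamples even though the data vary.

```r
library(boot)

# Statistic in the form boot() expects: data first, resampled indices second.
sample.mean <- function(indata, idx) mean(indata[idx])

# Hypothetical helper (not from the original answer): percentile CI for the
# mean, with NA limits whenever boot.ci() yields no interval.
ci.mean.na <- function(x, R = 1000, conf = 0.95) {
  b <- boot(x, statistic = sample.mean, R = R)
  # Assumption: in the degenerate case boot.ci() prints its message and
  # returns NULL; capture.output() only swallows that printed message.
  msg <- capture.output(ci <- boot.ci(b, conf = conf, type = "perc"))
  if (is.null(ci) || is.null(ci$percent)) {
    return(list(mean = mean(x), lwr = NA_real_, upr = NA_real_))
  }
  # Columns 4 and 5 of $percent hold the lower and upper limits.
  list(mean = mean(x), lwr = ci$percent[, 4], upr = ci$percent[, 5])
}

set.seed(42)
const_ci  <- ci.mean.na(rep(100, 100))       # all values equal
varied_ci <- ci.mean.na(as.numeric(1:100))   # ordinary case
```

Because the helper returns a list with the same names as `ci.mean`, it should be usable in the same grouped call, e.g. `DT[,ci.mean.na(values),by=ind]`.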
