
I am trying to estimate the breakpoints of a variable V that depends on three covariates (X, Y, Z) and has two breakpoints.

The response variable is V = aX + bY + cZ + d

I simulate data in which (a,b,c,d) takes 3 sets of values: (0.6,0.2,0.8,0.15), (1.6,1.2,1.8,1.15) and (3,5,4,2.5).
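Written out as a piecewise model, the simulation looks roughly like this. It is a sketch read off the code further down; the row index $i$ and the noise term $\varepsilon_i$ are notation introduced here, not the asker's.

```latex
V_i =
\begin{cases}
0.6\,X_i + 0.2\,Y_i + 0.8\,Z_i + 0.15 + \varepsilon_i, & 1 \le i \le 200 \\
1.6\,X_i + 1.2\,Y_i + 1.8\,Z_i + 1.15 + \varepsilon_i, & 201 \le i \le 400 \\
3.0\,X_i + 5.0\,Y_i + 4.0\,Z_i + 2.50 + \varepsilon_i, & 401 \le i \le 830
\end{cases}
\qquad \varepsilon_i \sim \mathcal{N}(0,\ 0.01^2)
```

Note that the regime switches on the row index $i$, not on a threshold in X, Y, or Z, and that all four coefficients change at once.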

I use the segmented package to estimate the coefficients, but I get the following error:

    Error in segmented.lm(linearFit, seg.Z = ~X + Y + Z, psi = list(X = c(NA),  :
      Bootstrap restart only with a fixed number of breakpoints

Here is my code, with the data:

    library(segmented)

    #trapezoidal data
    ref=c(rep(1,100),seq(1,10,0.05),rep(10,150),seq(10,0,-0.05),rep(0,200))

    #covariates
    xx=cumsum(ref) 
    yy=diff(xx)
    zz=diff(yy)

    #equalizing lengths of above vectors
    vecL=length(zz)
    xx=xx[1:vecL]
    yy=yy[1:vecL]
    zz=zz[1:vecL]

    #adding noise to covariates
    set.seed(10)
    X=xx + max(xx)/100*rnorm(vecL)
    Y=yy + max(yy)/100*rnorm(vecL)
    Z=zz + max(zz)/100*rnorm(vecL)

    #three segment response variable, total 830 points
    V=numeric(vecL)  #allocate V first; assigning into an undefined V is an error
    V[1:200]   = 0.6 *X[1:200]+   0.2 *Y[1:200]+   0.8 *Z[1:200]+   0.15 + 0.01*rnorm(200)
    V[201:400] = 1.6 *X[201:400]+ 1.2 *Y[201:400]+ 1.8 *Z[201:400]+ 1.15 + 0.01*rnorm(200)
    V[401:830] = 3.0 *X[401:830]+ 5.0 *Y[401:830]+ 4.0 *Z[401:830]+ 2.50 + 0.01*rnorm(430)

    ##linear model

    linearFit=lm(formula=V~X+Y+Z)
    summary(linearFit)


    ##segmented 

    segFit=segmented(linearFit,seg.Z=~X+Y+Z,psi=list(X=c(NA),Y=c(NA),Z=c(NA)),control=seg.control(display=TRUE, K=4, stop.if.error=FALSE))

This is the output:

    segFit=segmented(linearFit,seg.Z=~X+Y+Z,psi=list(X=c(NA),Y=c(NA),Z=c(NA)),control=seg.control(display=TRUE, K=4, stop.if.error=FALSE))
    Error in segmented.lm(linearFit, seg.Z = ~X + Y + Z, psi = list(X = c(NA),  :
      Bootstrap restart only with a fixed number of breakpoints

Have I set up psi and control correctly? Any help is appreciated.

1 Answer

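Automatic breakpoint detection appears to be quite experimental, and the documentation says as much. It is probably better to supply a fixed number of starting values. Still, I was able to get the fitting function to at least start running by setting n.boot=0, which switches off the bootstrap restart that the error message complains about: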

    segFit=segmented(linearFit,seg.Z=~X+Y+Z,psi=list(X=c(NA),Y=c(NA),Z=c(NA)),
                     control=seg.control(display=TRUE, K=4, stop.if.error=FALSE, n.boot=0, it.max=50))
    #0   287035116.259  (No breakpoint(s)) 
    #1   52847700.113  12 
    #2   66421579.610  7 
    #3   60143023.830  7 
    #4   55936266.042  7 
    #5   45478319.984  5 
    #6   37237514.620  5 
    #7   34058342.767  5 
    #8   33889551.970  3 
    #9   33679837.419  3 
    #10  33680392.183  3 
    #Error in eval(expr, envir, enclos) : object 'U1.Y' not found

That ends in an error. My interpretation is that no breakpoint was found for Y, so I removed it from the breakpoint formula:

    segFit=segmented(linearFit,seg.Z=~X+Z,psi=list(X=c(NA),Z=c(NA)),
                     control=seg.control(display=TRUE, K=4, stop.if.error=FALSE, n.boot=0, it.max=50))
    #0   287035116.259  (No breakpoint(s)) 
    #1   57518175.693  8 
    #2   75024714.551  4 
    #3   53678468.904  4 
    #4   42978477.989  4 
    #5   36762393.424  4 
    #6   34564133.079  4 
    #7   33672729.061  4 
    #8   33672705.918  4 
    #Error in eval(expr, envir, enclos) : object 'U1.Z' not found

It still doesn't like that. Let's remove Z as well:

    segFit=segmented(linearFit,seg.Z=~X,psi=list(X=c(NA)),
                     control=seg.control(display=TRUE, K=4, stop.if.error=FALSE, n.boot=0, it.max=50))
    #0   287035116.259  (No breakpoint(s)) 
    #1   59188023.560  4 
    #2   84927431.755  3 
    #3   58905175.574  3 
    #4   46487759.098  3 
    #5   39114874.784  3 
    #6   34916433.946  3 
    #7   33986478.337  3 
    #8   33680464.097  3 
    #9   33680464.097  3 

Success! (I'm not sure whether segmented handles breaks in multiple variables very well.)
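For comparison, here is a minimal sketch of the fixed-starting-values route mentioned at the top of this answer. It uses a small toy data set rather than the question's data: the names `x`, `z`, `y`, `toyFit`, and `segFixed`, the breakpoint locations, and the starting guesses are all made up for illustration. The covariates are also drawn independently, unlike X, Y, Z in the question, so it says nothing definite about how that fit would behave.

```r
library(segmented)

set.seed(42)
n <- 500
x <- runif(n, 0, 10)   # first segmented covariate (toy data)
z <- runif(n, 0, 10)   # second segmented covariate (toy data)

# Continuous broken-line response: the slope in x changes at 3 and 7,
# and the slope in z changes at 6
y <- 2 + 1.0 * x + 2.0 * pmax(x - 3, 0) - 4.0 * pmax(x - 7, 0) -
  0.5 * z + 1.5 * pmax(z - 6, 0) + rnorm(n, sd = 0.3)

toyFit <- lm(y ~ x + z)

# A fixed number of breakpoints with rough starting guesses:
# two for x, one for z (no NA, so no automatic selection is involved)
segFixed <- segmented(toyFit, seg.Z = ~x + z,
                      psi = list(x = c(2.5, 7.5), z = 5))

segFixed$psi      # starting values, estimates, and standard errors
slope(segFixed)   # estimated slope of each segment, per covariate
```

Because the number of breakpoints is fixed here, the bootstrap restart does not need to be switched off. The starting guesses only have to be roughly in the right region.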

Answered 2014-09-17T17:54:14.167