
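I have only just started trying to write MLE commands in R, with functions that look and behave like native R functions. In this attempt I am trying to fit a simple MLE: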

y=b0 + x*b1 + u

u~N(0,sd=s0 + z*s1)

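However, I am finding even a command this simple hard to code. I wrote a similar command in Stata in just a few lines.

Here is the code I have written in R so far.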

  normalreg <- function (beta, sigma=NULL, data, beta0=NULL, sigma0=NULL,
                         con1 = T, con2 = T) {

    # If a formula for sigma is not specified 
    #  assume it is the same as the formula for the beta.
    if (is.null(sigma)) sigma=beta

    # Grab the call expression
    mf <- match.call(expand.dots = FALSE)

    # Find the position of each argument
    m <- match(c("beta", "sigma", "data", "subset", "weights", "na.action", 
                 "offset"), names(mf), 0L)

    # Adjust names of mf
    mf <- mf[c(1L, m)]

    # Since I have two formulas I will call them both formula
    names(mf)[2:3] <- "formula"

    # Drop unused levels
    mf$drop.unused.levels <- TRUE

    # Divide mf into data1 and data2
    data1  <- data2 <- mf
     data1 <- mf[-3]
     data2 <- mf[-2]

    # Replace the function name in each call with model.frame
    data1[[1L]] <- data2[[1L]] <- as.name("model.frame")

    data1 <- as.matrix(eval(data1, parent.frame()))
    data2 <- as.matrix(eval(data2, parent.frame()))

    y     <- data1[,1]
    data1 <- data1[,-1]
     if (con1)  data1 <- cbind(data1,1)
    data2 <- unlist(data2[,-1])
      if (con2) data2 <- cbind(data2,1)

    data1 <- as.matrix(data1) # Ensure our data is read as matrix
    data2 <- as.matrix(data2) # Ensure our data is read as matrix

    if (!is.null(beta0)) if (length(beta0)!=ncol(data1))
      stop("Length of beta0 must equal the number of independent variables in the first equation")

    if (!is.null(sigma0)) if (length(sigma0)!=ncol(data2)) 
      stop("Length of sigma0 must equal the number of independent variables in the second equation")

    # Set initial parameter estimates
    if (is.null(beta0))  beta0   <- rep(1, ncol(data1))
    if (is.null(sigma0)) sigma0 <- rep(1, ncol(data2))

    # Define the objective: the negative log-likelihood, which optim minimizes
    normMLE <- function(est, data1, data2, y) {
      k1 <- ncol(data1)
      data1est <- est[seq_len(k1)]               # coefficients of the mean equation
      data2est <- est[k1 + seq_len(ncol(data2))] # coefficients of the sd equation

      # Normal density of the residuals (dnorm, not the CDF pnorm)
      ps <- dnorm(y - data1 %*% data1est,
                  sd = data2 %*% data2est)
      # Vector of log-likelihood contributions given the coefficient estimates
      llk <- log(ps)
      -sum(llk)
    }

    results <- optim(c(beta0,sigma0), normMLE, hessian=T,
                     data1=data1, data2=data2, y=y)

    results
  }


  x <-rnorm(10000)
  z<-x^2
  y <-x*2 + rnorm(10000, sd=2+z*2) + 10

  normalreg(y~x, y~z)

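At this point, the biggest problem is finding an optimization routine that does not fail when some values return NA because the standard deviation goes negative. Any suggestions? Sorry for the large amount of code.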

Francis


1 Answer


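I included a check for whether any standard deviation is less than or equal to 0 and, if so, return a likelihood of 0 (that is, a log-likelihood of -Inf). It seems to work for me. You can figure out the details of wrapping it into your function.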

#y=b0 + x*b1 + u
#u~N(0,sd=s0 + z*s1)

ll <- function(par, x, z, y){
    b0 <- par[1]
    b1 <- par[2]
    s0 <- par[3]
    s1 <- par[4]
    sds <- s0 + z*s1
    if(any(sds <= 0)){
        return(log(0))
    }

    preds <- b0 + x*b1

    sum(dnorm(y, preds, sds, log = TRUE))
}

n <- 100
b0 <- 10
b1 <- 2
s0 <- 2
s1 <- 2
x <- rnorm(n)
z <- x^2
y <- b0 + b1*x + rnorm(n, sd = s0 + s1*z)

optim(c(1,1,1,1), ll, x=x, z=z,y=y, control = list(fnscale = -1))
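
For the wrapping step, here is a minimal sketch of how the same guard could sit behind a two-formula interface. The name `fit_hetnorm` and its arguments are hypothetical, not from the question. It assumes both formulas refer to columns of one data frame with no missing values, and it uses `model.matrix` to add the constant rather than `cbind(..., 1)`.

```r
# Hypothetical wrapper: mean_formula is two-sided (y ~ x), sd_formula is one-sided (~ z).
# Assumes `data` has no missing values, so both design matrices have the same rows.
fit_hetnorm <- function(mean_formula, sd_formula, data, start = NULL) {
  mf <- model.frame(mean_formula, data)
  y  <- model.response(mf)
  X  <- model.matrix(mean_formula, data)  # design matrix for the mean (with intercept)
  Z  <- model.matrix(sd_formula, data)    # design matrix for the sd (with intercept)
  kx <- ncol(X)
  kz <- ncol(Z)

  loglik <- function(par) {
    mu  <- drop(X %*% par[seq_len(kx)])
    sds <- drop(Z %*% par[kx + seq_len(kz)])
    if (any(sds <= 0)) return(-Inf)       # same guard as in ll(): likelihood of 0
    sum(dnorm(y, mu, sds, log = TRUE))
  }

  if (is.null(start)) start <- rep(1, kx + kz)
  # fnscale = -1 makes optim maximize; maxit is raised because the default
  # Nelder-Mead budget of 500 evaluations can be tight for several parameters
  fit <- optim(start, loglik, control = list(fnscale = -1, maxit = 5000))
  names(fit$par) <- c(paste0("mean:", colnames(X)), paste0("sd:", colnames(Z)))
  fit
}

# Simulated data matching the model in the question
set.seed(1)
d <- data.frame(x = rnorm(500))
d$z <- d$x^2
d$y <- 10 + 2 * d$x + rnorm(500, sd = 2 + 2 * d$z)

fit <- fit_hetnorm(y ~ x, ~ z, data = d)
fit$par
```

The default starting values of 1 only work when they imply a positive standard deviation for every row, as they do here because z is never negative. Otherwise pass `start` explicitly.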

That said, parameterizing the standard deviation in such a way that it cannot become negative might not be a bad idea...
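
One way to do that, as a sketch only: model the log of the standard deviation, so that sd = exp(g0 + z*g1) is positive for any real g0 and g1. Note that this is a different model from sd = s0 + z*s1, so the coefficients are not comparable. The log link, the names `nll_log_sd`, `g0` and `g1`, and the starting values are my assumptions here, not something from the question.

```r
# Negative log-likelihood with sd = exp(g0 + g1 * z).
# Assumption: a log link for the sd, which is NOT the linear sd model in the question.
nll_log_sd <- function(par, x, z, y) {
  mu      <- par[1] + par[2] * x
  log_sds <- par[3] + par[4] * z
  -sum(dnorm(y, mean = mu, sd = exp(log_sds), log = TRUE))
}

# Simulate from the log-link model (the true values are arbitrary)
set.seed(42)
n <- 1000
x <- rnorm(n)
z <- x^2
y <- 10 + 2 * x + rnorm(n, sd = exp(0.2 + 0.2 * z))

# Start from the sample mean and the log of the sample sd
start <- c(mean(y), 0, log(sd(y)), 0)

# No guard is needed: the sd is positive at every parameter value,
# so a gradient-based method such as BFGS can be used
fit <- optim(start, nll_log_sd, x = x, z = z, y = y, method = "BFGS")
fit$par  # estimates of b0, b1, g0, g1
```

The objective here is the negative log-likelihood, so `optim` minimizes it directly and `fnscale = -1` is not needed.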

Answered 2013-08-25T17:39:57.193