0

当我在没有注释的情况下运行以下代码时,gr.ascent(MMSE, 0.5, verbose=TRUE)我收到此错误Error in b1 * x : 'b1' is missing,但是当我评论该行时,我在使用这些参数测试 MMSE 时收到以下错误MMSE(2,1,farmland$farm,farmland$area)。你知道我的问题出在哪里吗?

Error in if (abs(t[i]) <= k) { : missing value where TRUE/FALSE needed

这是我的代码:

farmland <- read.csv("FarmLandArea.csv")
str(farmland)
fit=lm(farm~land,data=farmland)
mean.squared.residuals <- sum((lm(farm~land,data=farmland)$residuals)^2)/(length(farmland$farm)-2)

#gradient descent

#things I should possibly use: solve(t(X)%*%X, t(X)%*%y)
gr.ascent<- function(df, x0, alpha=0.2, eps=0.001, max.it = 50, verbose = FALSE){
  X1 <- x0
  cond <- TRUE
  iteration <- 0
  if(verbose) cat("X0 =",X1,"\n")
  while(cond){
    iteration <- iteration + 1
    X0 <- X1
    X1 <- X0 + alpha * df(X0)
    cond <- sum((X1 - X0)^2) > eps & iteration < max.it
    if(verbose) cat(paste(sep="","X",iteration," ="), X1, "\n")
  }
  return(X1)
}


k=19000

#rho <- function(t, k=19000){
#  for (i in seq(1,length(t))){
#    if (abs(t[i]) <= k)
#      return(t[i]^2)
#    else 
#      return(2*k*abs(t[i])-k^2)

#  }

#}

#nicer implementation of rho. ifelse works on vector
rho<-function(t,k) ifelse(abs(t)<=k,t^2,(2*k*abs(t))-k^2)
rho.prime <- function(t, k=19000){
  out <- rep(NA, length(t))
  for (i in seq(1,length(t))){
    if (abs(t[i]) <= k)
    { print(2*t[i])
      out[i] <- 2*t[i] 
    }
    else 
    {
      print(2*k*sign(t[i]))
      out[i] <- 2*k*sign(t[i])
    }
  }
  return(out)
}
MMSE <- function(b0, b1, y=farmland$farm, x=farmland$land){
   # Calls rho.prime() here with argument y-b0-b1*x


   #Why should we call rho.prime? in the html page you have used rho!?
  n = length(y)
  total = 0
  for (i in seq(1,n)) {
    #total = total + rho(t,k)*(y[i]-b0-b1*x[i])
    total = total + rho.prime(y-b0-b1*x,k)*(y[i]-b0-b1*x[i])
  }
  return(total/n)
}

gr.ascent(MMSE(1,2), 0.5, verbose=TRUE)

其中 FarmLand csv 数据如下:

state,land,farm
Alabama,50744,14062
Alaska,567400,1375
Arizona,113635,40781
Arkansas,52068,21406
California,155959,39688
Colorado,103718,48750
Connecticut,4845,625
Delaware,1954,766
Florida,53927,14453
Georgia,57906,16094
Hawaii,6423,1734
Idaho,82747,17812
Illinois,55584,41719
Indiana,35867,23125
Iowa,55869,48125
Kansas,81815,72188
Kentucky,39728,21875
Louisiana,43562,12578
Maine,30862,2109
Maryland,9774,3203
Massachusetts,7840,812
Michigan,58110,15625
Minnesota,79610,42031
Mississippi,46907,17422
...

这是 dput(farmland) 的结果:

> dput(farmland)
structure(list(state = structure(1:50, .Label = c("Alabama", 
"Alaska", "Arizona", "Arkansas", "California", "Colorado", "Connecticut", 
"Delaware", "Florida", "Georgia", "Hawaii", "Idaho", "Illinois", 
"Indiana", "Iowa", "Kansas", "Kentucky", "Louisiana", "Maine", 
"Maryland", "Massachusetts", "Michigan", "Minnesota", "Mississippi", 
"Missouri", "Montana", "Nebraska", "Nevada", "New Hampshire", 
"New Jersey", "New Mexico", "New York", "North Carolina", "North Dakota", 
"Ohio", "Oklahoma", "Oregon", "Pennsylvania", "Rhode Island", 
"South Carolina", "South Dakota", "Tennessee", "Texas", "Utah", 
"Vermont", "Virginia", "Washington", "West Virginia", "Wisconsin", 
"Wyoming"), class = "factor"), land = c(50744L, 567400L, 113635L, 
52068L, 155959L, 103718L, 4845L, 1954L, 53927L, 57906L, 6423L, 
82747L, 55584L, 35867L, 55869L, 81815L, 39728L, 43562L, 30862L, 
9774L, 7840L, 58110L, 79610L, 46907L, 68886L, 145552L, 76872L, 
109826L, 8968L, 7417L, 121356L, 47214L, 48711L, 68976L, 40948L, 
68667L, 95997L, 44817L, 1045L, 30109L, 75885L, 41217L, 261797L, 
82144L, 9250L, 39594L, 66544L, 24230L, 54310L, 97105L), farm = c(14062L, 
1375L, 40781L, 21406L, 39688L, 48750L, 625L, 766L, 14453L, 16094L, 
1734L, 17812L, 41719L, 23125L, 48125L, 72188L, 21875L, 12578L, 
2109L, 3203L, 812L, 15625L, 42031L, 17422L, 45469L, 95000L, 71250L, 
9219L, 734L, 1141L, 67500L, 10938L, 13438L, 61875L, 21406L, 55000L, 
25625L, 12109L, 109L, 7656L, 68281L, 17031L, 203750L, 17344L, 
1906L, 12578L, 23125L, 5703L, 23750L, 47188L)), .Names = c("state", 
"land", "farm"), class = "data.frame", row.names = c(NA, -50L
))
4

1 回答 1

2

好的,通过数字:

  1. 在您对您的调用中,传递gr.ascent(...)一个函数MMSE作为第一个参数。在里面gr.ascent(...)你把这个函数称为df(...).
  2. 该函数MMSE(...)有 2 个参数,b0并且b1没有默认值 - 因此必须指定这些参数,否则会出错,但是
  3. 当您在df(...)内部调用函数时gr.ascent(...),在以下行中:X1 <- X0 + alpha * df(X0)您仅传递 1 个参数,即b0.
  4. 所以第二个参数b1丢失了,因此出现了错误。

当您MMSE(...)直接调用时,如:

MMSE(2,1,farmland$farm,farmland$area)

farmland$area你作为第四个参数传递。但是数据框中没有列areafarmland!所以 this 被传递为NA, which, 当用于

total = total + rho.prime(y-b0-b1*x,k)*(y[i]-b0-b1*x[i])

t参数强制为rho.prime(...)to NA,因此出现第二个错误。

我无法提出解决方案,因为我不知道您要在这里完成什么。

编辑(对 OP 评论的回应)。

尽管我完全同意@thelatemail 的评论,但您的新错误相当模糊。

在您的早期版本中,您将functionMSEE(...)to传递给gr.ascent(...)并错误地使用它。这一次,您将一个传递给gr.ascent(...),该值是您调用时的返回值MSEE(1,2)。那么当您尝试将此视为一个函数时会发生什么,如下所示:

X1 <- X0 + alpha * df(X0)

好吧,通常这会抛出一个错误,告诉你这df不是一个函数。df 在这种情况下,只是你的运气不好。它是 F 分布的概率密度函数,它有一个必需的参数df1等(类型?df以查看文档)。这就是您收到错误的原因。

要“修复”这个问题,您需要返回传递函数,如下所示:

gr.ascent(MSEE,...)

然后在里面正确使用它gr.ascent(...),如:

X1 <- X0 + alpha * df(X0, <some other argument>).
于 2014-04-22T04:22:59.303 回答