Here is the head of my data:

dput(head(trucksv[,c(1,5)]))
structure(list(Measur. = c(1L, 2L, 3L, 4L, 5L, 1L), Speed.Mean.Trucks = c(NA, 
NA, 9.5, 4.5, NA, NA)), .Names = c("Measur.", "Speed.Mean.Trucks"
), row.names = c(1L, 2L, 3L, 4L, 5L, 17L), class = "data.frame")

I want to find the cumulative distribution of speed by `Measur.`. To do that, I wrote the following function:

f <- function(x) {
  hi <- hist(x)
  speedmph=round(hi$breaks*0.68,1)
  prob=c(0, round(cumsum(hi$counts)/sum(hi$counts),digits=2))
  cbind(speedmph, prob)
}

But when I try to apply it to my data, R gives me the following error:

tspdistu <- ddply(trucksv, 'Measur.', summarise, trucksspeedmph = f(Speed.Mean.Trucks)) 
Error in hist.default(x) : invalid number of 'breaks'
Called from: top level 
Browse[1]> 

I'm not sure how to find the right number of bins. Please help. Thanks in advance.

1 Answer

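The `NA`s are throwing it off (i.e. it has nothing to do with the number of bins). Here is a slightly modified `f()` that both disables plotting in `hist()` (you are unlikely to want the plot) and handles a column subset that is all `NA`s: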

f <- function(x) {
  # Keep only the non-missing speeds
  y <- x[!is.na(x)]
  if (length(y) > 0) {
    # hist() drops NAs itself, so passing y gives the same bins as passing x
    hi <- hist(y, plot=FALSE)
    speedmph <- round(hi$breaks*0.68,1)
    prob <- c(0, round(cumsum(hi$counts) / sum(hi$counts), digits=2))
    cbind(speedmph, prob)
  } else { # still need to return proper sized values
    cbind(rep(NA, length(x)), rep(NA, length(x)))
  }
}
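
To see why the `NA`s matter, here is a minimal sketch using only the six rows from the question's `dput()` output. The names `trucksv_head`, `some_na`, `msg` and `by_measur` are made up for illustration, and the grouping uses base R `split()`/`lapply()` instead of `ddply()` so that it runs without plyr. It first checks that a vector with a few `NA`s is fine while an all-`NA` vector triggers the error, then applies the modified `f()` per `Measur.` group.

```r
# Rows copied from the question's dput() output (trucksv_head is a made-up name).
trucksv_head <- data.frame(
  Measur. = c(1L, 2L, 3L, 4L, 5L, 1L),
  Speed.Mean.Trucks = c(NA, NA, 9.5, 4.5, NA, NA)
)

# A few NAs are harmless: hist() only bins the non-missing values.
some_na <- hist(c(NA, 9.5, 4.5), plot = FALSE)

# An all-NA numeric vector leaves nothing to bin, which is what raises the error.
msg <- tryCatch(
  { hist(c(NA_real_, NA_real_), plot = FALSE); "no error" },
  error = function(e) conditionMessage(e)
)

# Same logic as the modified f() in this answer.
f <- function(x) {
  y <- x[!is.na(x)]
  if (length(y) > 0) {
    hi <- hist(y, plot = FALSE)
    speedmph <- round(hi$breaks * 0.68, 1)
    prob <- c(0, round(cumsum(hi$counts) / sum(hi$counts), digits = 2))
    cbind(speedmph, prob)
  } else {
    cbind(rep(NA, length(x)), rep(NA, length(x)))
  }
}

# Base R alternative to ddply(): one result matrix per Measur. value.
by_measur <- lapply(
  split(trucksv_head$Speed.Mean.Trucks, trucksv_head$Measur.),
  f
)

print(msg)
print(by_measur[["1"]])  # group 1 holds only NAs, so f() returns an NA matrix
```

With the full `trucksv` data the same `split()`/`lapply()` call should work on the real column. Whether the matrix returned by `f()` fits what `ddply(..., summarise, ...)` expects is a separate question that this sketch does not test.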
answered 2014-03-23T02:22:16.883