r - Scaling when estimating scale and shape parameters with the fitdist function (fitdistrplus package)

Question

As the title says, I have a scaling problem with the fitdist function in R (fitdistrplus package).

Please have a look at the following code:

library(fitdistrplus)

# Initialize vectors for storing results
fit_store_scale <- rep(NA, 3)
fit_store_shape <- rep(NA, 3)

# load data
data1 <- c(7.616593e-05, 5.313253e-05, 1.604328e-04, 6.482365e-05,
           4.217499e-05, 6.759114e-05, 3.531301e-05, 1.934228e-05,
           6.263665e-05, 8.796205e-06)
data2 <- c(7.616593e-06, 5.313253e-06, 1.604328e-05, 6.482365e-06,
           4.217499e-06, 6.759114e-06, 3.531301e-06, 1.934228e-06,
           6.263665e-06, 8.796205e-07)
data3 <- c(7.616593e-07, 5.313253e-07, 1.604328e-06, 6.482365e-07,
           4.217499e-07, 6.759114e-07, 3.531301e-07, 1.934228e-07,
           6.263665e-07, 8.796205e-08)
# form data frame
data <- data.frame(data1, data2, data3)

# set scaling factor
scaling <- 1        #works without warnings and errors at:    
                    #10000 (data1), 100000 (data2) or
                    #1000000 (data3)

# store scale and shape parameter of data1, data2 and data3 in Array
for(i in 1:3)
{
    fit.w1 <- fitdist(data[[i]]*scaling,"weibull", method = "mle")
    fit_store_scale[i] <- fit.w1$estimate[[2]]*1/scaling
    #1/scaling is needed for correcting scale parameter
    fit_store_shape[i] <- fit.w1$estimate[[1]]
}

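I have three data vectors, which are stored in a data frame. I want to use the fitdist function to estimate the scale and shape parameters separately for each column of data (data1, data2 and data3) and store them in fit_store_scale and fit_store_shape, respectively.

The problem is that fitdist does not work without an appropriate scaling factor, and data1, data2 and data3 each need a different factor. I am looking for a way to determine a suitable scaling factor automatically for each column of data so that fitdist works in the end.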

score 2 · Accepted Answer

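If you are not absolutely wedded to fitdist, you could use something slightly more robust. The following fits the Weibull with its parameters on the log scale and uses Nelder-Mead rather than a gradient-based method. It does not seem to have any problems fitting these data.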

dd <- data.frame(data1,data2,data3)
library("bbmle")
fx <- function(x) {
    m1 <- mle2(y~dweibull(shape=exp(logshape),scale=exp(logscale)),
           data=data.frame(y=x),start=list(logshape=0,logscale=0),
           method="Nelder-Mead")
    exp(coef(m1))
}
t(sapply(dd,fx))  ## not quite the output format you asked for,
                  ##  but easy enough to convert.
##       logshape     logscale
## data1 1.565941 6.589057e-05
## data2 1.565941 6.589054e-06
## data3 1.565941 6.589055e-07

This approach should work reasonably well for any distribution for which you have a standard density (d*()) function.
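The log-scale idea above can also be sketched with base R alone, in case bbmle is not available. This is only an illustration under stated assumptions: the helper names weibull_nll and fit_weibull_log are hypothetical, data2 and data3 are rebuilt as data1 / 10 and data1 / 100, and the start value log(mean(x)) for the log-scale parameter is a choice made here (the bbmle code above starts from 0).

```r
# Negative log-likelihood of a Weibull with both parameters on the log scale.
# par[1] = log(shape), par[2] = log(scale); exp() keeps both strictly positive.
weibull_nll <- function(par, x) {
  -sum(dweibull(x, shape = exp(par[1]), scale = exp(par[2]), log = TRUE))
}

# Hypothetical helper: fit by Nelder-Mead and return estimates on the natural scale.
fit_weibull_log <- function(x) {
  # Assumption: starting log(scale) at log(mean(x)) makes the start follow
  # the magnitude of the data, so no manual scaling factor is needed.
  start <- c(0, log(mean(x)))
  opt <- optim(start, weibull_nll, x = x, method = "Nelder-Mead")
  if (opt$convergence != 0) warning("optim did not report convergence")
  setNames(exp(opt$par), c("shape", "scale"))
}

data1 <- c(7.616593e-05, 5.313253e-05, 1.604328e-04, 6.482365e-05,
           4.217499e-05, 6.759114e-05, 3.531301e-05, 1.934228e-05,
           6.263665e-05, 8.796205e-06)
# data2 and data3 in the question are data1 divided by 10 and by 100
dd <- data.frame(data1, data2 = data1 / 10, data3 = data1 / 100)

est <- t(sapply(dd, fit_weibull_log))
print(est)
```

Because the optimizer works on log(scale), a change of units in the data only shifts one parameter by a constant, which is why the same code should handle all three columns without a per-column factor.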

score 1 · Accepted Answer

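One way to address this is to keep trying to fit the distribution with the data scaled by 10^j, increasing j until the fit succeeds: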

for(i in 1:3)
{
  j <- 0
  while(inherits(try(fitdist(data[[i]] * 10^j, "weibull", method = "mle"), silent = TRUE), "try-error"))
  {
    j <- j + 1
  }
  cat("\nFor data[[", i, "]], used j =", j, "\n\n")
  fit.w1 <- fitdist(data[[i]] * 10^j, "weibull", method = "mle")
  fit_store_scale[i] <- fit.w1$estimate[[2]] * 1/10^j
  #1/scaling is needed for correcting scale parameter
  fit_store_shape[i] <- fit.w1$estimate[[1]]
}


# For data[[ 1 ]], used j = 2 
# For data[[ 2 ]], used j = 3 
# For data[[ 3 ]], used j = 4 

# > fit_store_scale
# [1] 6.590503e-05 6.590503e-06 6.590503e-07
# > fit_store_shape
# [1] 1.56613 1.56613 1.56613

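That is, for data[[1]] the fit succeeded with j = 2 (scaling by 10^2 == 100), for data[[2]] with j = 3 (10^3 == 1,000), and for data[[3]] with j = 4 (10^4 == 10,000).

In the end, this finds the smallest power of 10 that scales the data enough for the fit to succeed. For a variation on this theme, see example #14 under ?fitdist.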
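A variation that avoids the trial-and-error loop is to derive the power of 10 directly from the magnitude of the data. The sketch below is an assumption-laden illustration, not something taken from the fitdistrplus documentation: auto_scale_factor is a hypothetical helper that brings the median of a strictly positive vector into [1, 10), and data2 and data3 are rebuilt as data1 / 10 and data1 / 100. The fitdist call only runs if fitdistrplus is installed, and whether this particular factor is always enough for fitdist to converge has not been checked here.

```r
# Hypothetical helper: power of 10 that moves median(x) into [1, 10).
# Assumes x is strictly positive, as Weibull data must be.
auto_scale_factor <- function(x) {
  10^(-floor(log10(median(x))))
}

data1 <- c(7.616593e-05, 5.313253e-05, 1.604328e-04, 6.482365e-05,
           4.217499e-05, 6.759114e-05, 3.531301e-05, 1.934228e-05,
           6.263665e-05, 8.796205e-06)
dd <- data.frame(data1, data2 = data1 / 10, data3 = data1 / 100)

# One factor per column, computed from the data instead of by retrying fits
factors <- sapply(dd, auto_scale_factor)
print(factors)

if (requireNamespace("fitdistrplus", quietly = TRUE)) {
  fits <- lapply(names(dd), function(nm) {
    s <- factors[[nm]]
    fit <- fitdistrplus::fitdist(dd[[nm]] * s, "weibull", method = "mle")
    # The shape is unaffected by scaling; the scale must be divided by s
    c(shape = fit$estimate[["shape"]], scale = fit$estimate[["scale"]] / s)
  })
  print(do.call(rbind, fits))
}
```

Unlike the loop, this picks a factor that puts the data near unit magnitude rather than the smallest factor that happens to work, so the factors it reports are larger than the j values shown above.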
