0
library(VGAM)
library(fitdistrplus)

fitdist(u_NI$k_u, 'truncpareto',
         start = list(lower=1,
                      upper=42016,
                      shape=1)) -> fit.k_u

length(u_NI$k_u) = 637594

我收到了这个错误:

<simpleError in optim(par = vstart, fn = fnobj, fix.arg = fix.arg, obs = data,     gr = gradient, ddistnam = ddistname, hessian = TRUE, method = meth,     lower = lower, upper = upper, ...): function cannot be evaluated at initial parameters>
Error in fitdist(u_NI$k_u, "truncpareto", start = list(lower = 1, upper = 42016,  : 
  the function mle failed to estimate the parameters, 
                with the error code 100
In addition: Warning messages:
1: In fitdist(u_NI$k_u, "truncpareto", start = list(lower = 1, upper = 42016,  :
  The dtruncpareto function should return a zero-length vector when input has length zero
2: In fitdist(u_NI$k_u, "truncpareto", start = list(lower = 1, upper = 42016,  :
  The ptruncpareto function should return a zero-length vector when input has length zero

问题出在数据集过大还是起始参数上?

可重复的例子:

library(VGAM)
library(fitdistrplus)

rtruncpareto(100,1,100,1.5) -> a
fitdist(a, "truncpareto",
        start = list(lower=1,
                     upper=100,
                     shape=1.5))

这是行不通的,我不明白为什么。

这里似乎有问题:

argument 'lower' must be positive

4

2 回答 2

2

您的问题的一部分是一个更普遍的问题,即对于截断分布,边界参数的 MLE 通常等于观察到的数据集的最小值/最大值。因此,您应该始终能够通过将下限/上限的值设置为等于最小值/最大值来做到这一点(根据我的经验,它们必须略低于/高于观察到的界限)。(我还发现我必须设置lower = 0阻止算法尝试形状参数的负值。)

library(VGAM); library(fitdistrplus)
set.seed(101)
rtruncpareto(100,1,100,1.5) -> a
eps <- 1e-8
fitdist(a, "truncpareto",
        start = list(shape=1.5),
        fix.arg = list(lower = min(a) - eps, upper = max(a) + eps),
        lower = 0)
Parameters:
      estimate Std. Error
shape 1.349885  0.1554436
Fixed parameters:
          value
lower  1.006844
upper 25.906577

一个替代方案fitdistbbmle

library(bbmle)
m1 <- mle2(a ~ dtruncpareto(lower = min(a) - eps,
                            upper = max(a) + eps,
                            shape = exp(lshape)),
           start = list(lshape = 0),
           data = data.frame(a),
         method = "BFGS")
exp(coef(m1))   
  lshape 
1.349884 

bbmle更灵活一点,允许您在对数尺度上拟合形状参数,这通常更稳健(并使标准偏差估计更有用)。在这里使用method = "BFGS"是因为默认的 Nelder-Mead 算法对于一维优化效果不佳;也可以使用method = "Brent"(这将更有效和更健壮),但随后需要为lshape参数提供明确的下限和上限。

于 2022-02-05T20:39:03.683 回答
2

我不太确定,但这可能是你的追求。

library(fitdistrplus)
library(VGAM)
library(ggplot2)

u_NI <- data.frame("k_u" = rtruncpareto(10000,
                                        lower = 1,
                                        upper = 100,
                                        shape = 1.5))

fit <- vglm(k_u ~ 1,
            truncpareto(lower=1, upper=max(u_NI$k_u) + 1),
            data = u_NI,
            trace = TRUE)

x <- seq(1, max(u_NI$k_u), length.out = 10000)
y <- dtruncpareto(x, shape = fit@coefficients[["(Intercept)"]],
                  lower=fit@extra$lower,
                  upper=fit@extra$upper)
pareto <- cbind.data.frame(x, y)

ggplot()+
  geom_density(data = u_NI, aes(k_u))+
  geom_line(data = pareto, aes(x = x, y = y, color = "truncpareto"),linetype = 5, size = 1.3)+
  theme(legend.position = c(.95, .95),
        legend.justification = c(1, 1),
        legend.title = element_blank())

########################## 编辑:

# fix lower and upper boundary, estimate may still fail, depending on     the start value of shape
fitdist(u_NI$k_u, 'truncpareto',
        method = "mle",
        start = list(shape=1),
        fix.arg=list(lower=1, upper=max(u_NI$k_u) + 0.01),
        control=list(trace=1, REPORT=1)) -> fit.k_u

# use different method to estimate parameter, mge seems to work
fitdist(u_NI$k_u, 'truncpareto',
        method = "mge",
        start = list(shape=1,
                     lower=1,
                     upper=max(u_NI$k_u) + 1),
        control=list(trace=1, REPORT=1)) -> fit.k_u

剩下的错误是指这个:

> pnorm(numeric(), 1,10)
numeric(0)
> ptruncpareto(numeric(), 1,10,5)
[1] NA

####### Edit2:我想我发现了原来的错误,它有点令人困惑。但是,mle 还有一个较低的参数,它与分布的较低参数分开。它应该将参数估计限制为 >= 0。因此,这应该可以工作,但是需要很长时间,即使只有 10,000 个值:

fitdist(u_NI$k_u, 'truncpareto',
        method = "mle",
        start = list(shape=1,
                     lower=1,
                     upper=max(u_NI$k_u) +1 ),
        control=list(trace=1, REPORT=1),
        lower = 0) -> fit.k_u
于 2022-02-05T17:08:46.780 回答