I want to know whether the following distributions fit the given data well. I used the Kolmogorov–Smirnov (K–S) test for the following two distributions, but I obtained different p-values from those in the original paper.

In other words, my question is why the K–S p-value from my code differs from the one in the published paper. First, note that the first distribution has a parameter "a" with 0 < a = exp(-theta) < 1, where theta > 0. I do not know whether this is the reason for the different p-value. The first distribution has the following log-likelihood function, cumulative distribution function, and K–S test.

The data is

d <- rep(0:5, c(447, 132, 42, 21, 3, 2))

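loglik <- function(param) {
  a <- param[1]  # where 0 < a = exp(-theta) < 1
  b <- param[2]
  # the log-likelihood is undefined outside the parameter space
  if (a <= 0 || a >= 1 || b <= 0) return(NaN)

  # numerators of F(d - 1) and F(d); their difference is proportional to the pmf at d
  first  <- (1 - a^d + ((1 + d) * a^d - 1) * log(a))^b
  second <- (1 - a^(d + 1) + ((2 + d) * a^(d + 1) - 1) * log(a))^b
  -length(d) * b * log(1 - log(a)) + sum(log(second - first))
}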

# maximum likelihood estimation using maxLik function
library(maxLik)

start_param <- c(a=0.05, b=0.02) #used in paper

mle <- maxLik(loglik, start = start_param, control=list(printLevel=2)) 


# the cumulative distribution function

cdf <- function(d, param) {
  a <- param[1]
  b <- param[2]
  # same parameter space as in loglik()
  if (a <= 0 || a >= 1 || b <= 0) return(NaN)
  (1 - a^(d + 1) + ((2 + d) * a^(d + 1) - 1) * log(a))^b / (1 - log(a))^b
}

#one-sample kolmogorov smirnov test
ks <- ks.test(d, "cdf", coef(mle))

My results, first for maximum likelihood then for k-s test:

----- Initial parameters: -----
fcn value: -1548.406 
  parameter initial gradient free
a      0.05         5684.184    1
b      0.02         9952.484    1
Condition number of the (active) hessian: 3.957421 
-----Iteration 1 -----
-----Iteration 2 -----
-----Iteration 3 -----
-----Iteration 4 -----
-----Iteration 5 -----
-----Iteration 6 -----
-----Iteration 7 -----
-----Iteration 8 -----
-----Iteration 9 -----
-----Iteration 10 -----
-----Iteration 11 -----
--------------
successive function values within tolerance limit 
11  iterations
estimate: 0.2632637 0.6928542 
Function value: -591.9145 
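
The reported estimate a = 0.2632637 already lies inside (0, 1), so the constraint on "a" does not appear to be violated at the optimum. One way to rule the constraint out entirely is to optimise over theta = -log(a) > 0, so that a = exp(-theta) stays in (0, 1) by construction. The sketch below is only an illustration under that assumption: it swaps in base R `optim` (Nelder-Mead) for `maxLik`, and the names `negloglik_theta`, `fit`, `a_hat` and `b_hat` are hypothetical.

```r
# Data from the question
d <- rep(0:5, c(447, 132, 42, 21, 3, 2))

# Negative log-likelihood in terms of theta = -log(a),
# so that a = exp(-theta) is always inside (0, 1)
negloglik_theta <- function(param) {
  theta <- param[1]
  b     <- param[2]
  if (theta <= 0 || b <= 0) return(1e10)  # large penalty outside the parameter space
  a <- exp(-theta)
  first  <- (1 - a^d + ((1 + d) * a^d - 1) * log(a))^b
  second <- (1 - a^(d + 1) + ((2 + d) * a^(d + 1) - 1) * log(a))^b
  -(-length(d) * b * log(1 - log(a)) + sum(log(second - first)))
}

# Same starting point as in the question, expressed through theta
fit <- optim(c(theta = -log(0.05), b = 0.02), negloglik_theta)

# Map theta back to a for comparison with the maxLik estimates
a_hat <- exp(-fit$par[["theta"]])
b_hat <- fit$par[["b"]]
print(c(a = a_hat, b = b_hat, loglik = -fit$value))
```

If this lands close to the `maxLik` estimates, the constraint on "a" is unlikely to be what drives the K-S result.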

My results for k-s test

    One-sample Kolmogorov-Smirnov test

data:  d
D = 0.69073, p-value < 2.2e-16
alternative hypothesis: two-sided

My p-value is essentially zero, which means the test rejects the hypothesis that this distribution fits the data.
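
A possible explanation for this result: `ks.test` assumes a continuous distribution, while `d` takes only six distinct values and is therefore heavily tied. With ties, the one-sample statistic compares the fitted CDF at the first sorted observation with an empirical CDF of 0, so D ends up equal to F(0), which is about 0.69 at the reported estimates. The sketch below instead compares the empirical and fitted CDFs as step functions on the support 0, ..., 5. Whether the paper computed its statistic this way is an assumption, and `cdf1`, `D_discrete` and `D_ks` are hypothetical names. Only the statistic is computed, because the usual K-S p-value does not carry over directly to a discrete CDF with estimated parameters.

```r
# Data and the estimates reported in the question
d   <- rep(0:5, c(447, 132, 42, 21, 3, 2))
est <- c(a = 0.2632637, b = 0.6928542)

# Fitted CDF, same formula as cdf() in the question
cdf1 <- function(q, param) {
  a <- param[[1]]
  b <- param[[2]]
  (1 - a^(q + 1) + ((2 + q) * a^(q + 1) - 1) * log(a))^b / (1 - log(a))^b
}

support <- 0:max(d)
F_emp <- ecdf(d)(support)    # empirical CDF at 0, 1, ..., 5
F_fit <- cdf1(support, est)  # fitted CDF at the same points

# Largest gap between the two step functions over the support
D_discrete <- max(abs(F_emp - F_fit))

# For comparison: the statistic ks.test() reports on the tied data
D_ks <- unname(suppressWarnings(ks.test(d, cdf1, est))$statistic)

print(round(cbind(support, F_emp, F_fit), 4))
print(c(D_discrete = D_discrete, D_ks = D_ks, F_fit_at_0 = F_fit[1]))
```

If `D_ks` matches `F_fit[1]` while `D_discrete` is small, the large D comes from the ties rather than from a poor fit.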

[Image: CDF of the first distribution in the published paper]

[Image: the log-likelihood function]

The results in the published paper are as follows, where the p-value indicates a good fit:

[Image: results from the published paper]

The second distribution has the following log-likelihood function, cumulative distribution function, and K–S test.

The data is

d <- c(5, 11, 21, 31, 46, 75, 98, 122, 145, 165, 196, 224, 245, 293, 321, 330, 350, 420)

loglik.d <- function(param) {
  # all four parameters must be positive
  if (any(param <= 0)) return(NaN)
  alp <- param[1]
  th  <- param[2]
  lam <- param[3]
  bet <- param[4]

  # survival-type terms at d and d + 1; the common factor (1 + alp) is added below
  first  <- exp(-th * d^bet) / (1 + alp * exp(lam * d))
  second <- exp(-th * (d + 1)^bet) / (1 + alp * exp(lam * (d + 1)))
  length(d) * log(1 + alp) + sum(log(first - second))
}

# maximum likelihood estimation using maxLik function
library(maxLik)

start_param <- c(alp = 1.072*10^(-5), th = 9.669*10^(-3), lam = 0.0331, bet=0.8571) #used in paper

mle.d <- maxLik(loglik.d, start = start_param, control=list(printLevel=2)) 


# the cumulative distribution function of the discrete distribution

cdf <- function(d, param) {
  if (any(param <= 0)) return(NaN)
  alp <- param[1]
  th  <- param[2]
  lam <- param[3]
  bet <- param[4]

  1 - ((1 + alp) * exp(-th * (d + 1)^bet) / (1 + alp * exp(bet * (d + 1))))
}

#one-sample kolmogorov smirnov test
ks.d <- ks.test(d, "cdf", coef(mle.d)) 

My results

----- Initial parameters: -----
fcn value: -108.1909 
    parameter initial gradient free
alp 1.072e-05     2452.9343660    1
th  9.669e-03      -10.8047669    1
lam 3.310e-02        2.2440943    1
bet 8.571e-01        0.1406428    1
Condition number of the (active) hessian: 402174154 
-----Iteration 1 -----
-----Iteration 2 -----
-----Iteration 3 -----
-----Iteration 4 -----
-----Iteration 5 -----
--------------
successive function values within tolerance limit 
5  iterations
estimate: 2.61131e-05 0.008173223 0.03056733 0.8839012 
Function value: -108.1732 

k-s test result:

One-sample Kolmogorov-Smirnov test

data:  d
D = 0.88877, p-value = 2.22e-16
alternative hypothesis: two-sided

[Image: the cumulative distribution function]

[Image: the log-likelihood function]

The p-value reported in the published paper is 1, i.e., according to the paper this distribution fits the data well, but my result says the opposite: that the distribution does not fit the data. For this distribution, all parameters are greater than zero.
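
One thing worth checking here is that the two functions posted for the second distribution do not agree with each other: `loglik.d` uses `exp(lam*(d+1))` in the denominator, while `cdf` uses `exp(bet*(d+1))`. The log-likelihood implies a survival function S(d) = (1+alp)*exp(-th*d^bet)/(1+alp*exp(lam*d)) and hence F(d) = 1 - S(d+1) with `lam` inside the exponential. The sketch below evaluates both versions at the reported estimates. That `lam` is the intended parameter is an assumption based only on `loglik.d`, not on the paper, and `cdf_posted`, `cdf_lam` and `d2` are hypothetical names.

```r
# Data and the estimates reported in the question (alp, th, lam, bet)
d2  <- c(5, 11, 21, 31, 46, 75, 98, 122, 145, 165, 196, 224,
         245, 293, 321, 330, 350, 420)
est <- c(alp = 2.61131e-05, th = 0.008173223, lam = 0.03056733, bet = 0.8839012)

# CDF as posted: bet appears inside exp() in the denominator
cdf_posted <- function(q, param) {
  alp <- param[[1]]; th <- param[[2]]; lam <- param[[3]]; bet <- param[[4]]
  1 - (1 + alp) * exp(-th * (q + 1)^bet) / (1 + alp * exp(bet * (q + 1)))
}

# Assumed correction: lam inside exp(), matching loglik.d
cdf_lam <- function(q, param) {
  alp <- param[[1]]; th <- param[[2]]; lam <- param[[3]]; bet <- param[[4]]
  1 - (1 + alp) * exp(-th * (q + 1)^bet) / (1 + alp * exp(lam * (q + 1)))
}

ks_posted <- ks.test(d2, cdf_posted, est)
ks_lam    <- ks.test(d2, cdf_lam, est)

print(c(D_posted = unname(ks_posted$statistic), D_lam = unname(ks_lam$statistic)))
print(c(p_posted = ks_posted$p.value, p_lam = ks_lam$p.value))
```

If `D_lam` comes out much smaller than `D_posted`, the mismatch between `bet` and `lam` is the likely source of the discrepancy for this distribution.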

Any help finding where the problem is would be appreciated.
