I want to know whether the following distributions fit the given data well. I applied the Kolmogorov–Smirnov (K–S) test to the following two distributions, but the p-values I obtained differ from those in the original paper.
In other words, my question is why the K–S p-value from my code differs from the one in the published paper. First, note that the first distribution has a parameter "a", where 0 < a = exp(-theta) < 1 and theta > 0. I do not know whether this is the reason for the different p-value. The first distribution has the following log-likelihood function and cumulative distribution function, followed by the K–S test.
The data are:
d<-rep(c(0:5), c(447,132,42,21,3,2))
loglik <- function(param) {
  if (any(param <= 0)) {
    NaN
  } else {
    a <- param[1]  # where 0 < a = exp(-theta) < 1
    b <- param[2]
    first <- (1 - a^d + ((1 + d) * a^d - 1) * log(a))^b
    second <- (1 - a^(d + 1) + ((1 + (d + 1)) * a^(d + 1) - 1) * log(a))^b
    # named ll rather than log, to avoid masking the log() function
    ll <- -length(d) * b * log(1 - log(a)) + sum(log(second - first))
    return(ll)
  }
}
# maximum likelihood estimation using maxLik function
library(maxLik)
start_param <- c(a=0.05, b=0.02) #used in paper
mle <- maxLik(loglik, start = start_param, control=list(printLevel=2))
# the cumulative distribution function
cdf <- function(d, param) {
  if (any(param <= 0)) {
    NaN
  } else {
    a <- param[1]
    b <- param[2]
    ((1 - a^(d + 1) + ((2 + d) * a^(d + 1) - 1) * log(a))^b) / (1 - log(a))^b
  }
}
#one-sample kolmogorov smirnov test
ks <- ks.test(d, "cdf", coef(mle))
My results, first for the maximum likelihood estimation and then for the K–S test:
----- Initial parameters: -----
fcn value: -1548.406
parameter initial gradient free
a 0.05 5684.184 1
b 0.02 9952.484 1
Condition number of the (active) hessian: 3.957421
-----Iteration 1 -----
-----Iteration 2 -----
-----Iteration 3 -----
-----Iteration 4 -----
-----Iteration 5 -----
-----Iteration 6 -----
-----Iteration 7 -----
-----Iteration 8 -----
-----Iteration 9 -----
-----Iteration 10 -----
-----Iteration 11 -----
--------------
successive function values within tolerance limit
11 iterations
estimate: 0.2632637 0.6928542
Function value: -591.9145
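On the constraint 0 < a < 1: the guard any(param <= 0) in loglik() does not enforce a < 1, although the reported estimate a = 0.263 does lie inside the interval. One way to rule the constraint out as a cause is to maximize over unconstrained coordinates that keep a in (0, 1) and b > 0 automatically. The following is a minimal sketch under stated assumptions: it swaps in base R's optim() with Nelder-Mead instead of maxLik, and the name negloglik_u and the log-scale coordinates are illustrative, not taken from the paper.

```r
# Data from the question
d <- rep(0:5, c(447, 132, 42, 21, 3, 2))

# Negative log-likelihood in unconstrained coordinates (hypothetical helper):
#   par[1] = log(theta), so a = exp(-theta) always lies in (0, 1)
#   par[2] = log(b),     so b is always positive
negloglik_u <- function(par) {
  a <- exp(-exp(par[1]))
  b <- exp(par[2])
  first <- (1 - a^d + ((1 + d) * a^d - 1) * log(a))^b
  second <- (1 - a^(d + 1) + ((2 + d) * a^(d + 1) - 1) * log(a))^b
  -(-length(d) * b * log(1 - log(a)) + sum(log(second - first)))
}

# Same starting point as in the question (a = 0.05, b = 0.02), transformed
start <- c(log(-log(0.05)), log(0.02))
fit <- optim(start, negloglik_u, method = "Nelder-Mead")

# Back-transform to the original parameters
a_hat <- exp(-exp(fit$par[1]))
b_hat <- exp(fit$par[2])
print(c(a = a_hat, b = b_hat, loglik = -fit$value))
```

If this lands on essentially the same estimates and log-likelihood as the maxLik run above, the constraint on "a" is unlikely to explain the different p-value.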
My results for the K–S test:
One-sample Kolmogorov-Smirnov test
data: d
D = 0.69073, p-value < 2.2e-16
alternative hypothesis: two-sided
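My result is an extremely small p-value (< 2.2e-16), which means the test rejects this distribution as a fit for these data.
[Image: CDF of the first distribution as given in the published paper]
The results in the published paper are as follows: [Image: results from the paper]. The p-value reported there is one under which this distribution is not rejected.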
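A possible explanation worth checking: ks.test() assumes a continuous distribution, while these data are counts with heavy ties (447 zeros out of 647). For a step-function CDF, the continuous K–S statistic also measures the gap between the model CDF and the empirical CDF just before each jump, and at x = 0 that gap is the whole probability mass at zero. The sketch below reuses the estimates printed in the maxLik output and introduces the illustrative names model_cdf, D_support and D_left. It is a diagnostic only and yields no p-value.

```r
# Data and the estimates reported in the maxLik output
d <- rep(0:5, c(447, 132, 42, 21, 3, 2))
est <- c(a = 0.2632637, b = 0.6928542)

# Model CDF, same formula as cdf() in the question
model_cdf <- function(x, param) {
  a <- param[[1]]
  b <- param[[2]]
  ((1 - a^(x + 1) + ((2 + x) * a^(x + 1) - 1) * log(a))^b) / (1 - log(a))^b
}

support <- sort(unique(d))
F_model <- model_cdf(support, est)
F_emp <- ecdf(d)(support)

# Distance measured only at the observed support points
D_support <- max(abs(F_emp - F_model))

# Distance that also uses the ECDF just before each jump,
# which is what a continuous K-S statistic looks at
F_emp_left <- c(0, head(F_emp, -1))
D_left <- max(abs(F_emp - F_model), abs(F_emp_left - F_model))

print(round(cbind(support, F_emp, F_model), 4))
print(c(D_support = D_support, D_left = D_left))
```

If D_left comes out close to the D = 0.69073 reported by ks.test() while D_support is small, the tiny p-value would reflect the ties rather than a poor fit, and a goodness-of-fit test designed for discrete data (for example a chi-square test on observed and expected counts) would be the more appropriate comparison.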
The second distribution:
The second distribution has the following log-likelihood function and cumulative distribution function, followed by the K–S test.
The data are:
d <- c(5, 11, 21, 31, 46, 75, 98, 122, 145, 165, 196, 224, 245, 293, 321, 330, 350, 420)
loglik.d <- function(param) {
  if (any(param <= 0)) {
    NaN
  } else {
    alp <- param[1]
    th <- param[2]
    lam <- param[3]
    bet <- param[4]
    first <- exp(-th * d^bet) / (1 + alp * exp(lam * d))
    second <- exp(-th * (d + 1)^bet) / (1 + alp * exp(lam * (d + 1)))
    # named ll rather than log, to avoid masking the log() function
    ll <- length(d) * log(1 + alp) + sum(log(first - second))
    return(ll)
  }
}
# maximum likelihood estimation using maxLik function
library(maxLik)
start_param <- c(alp = 1.072*10^(-5), th = 9.669*10^(-3), lam = 0.0331, bet=0.8571) #used in paper
mle.d <- maxLik(loglik.d, start = start_param, control=list(printLevel=2))
# the cumulative distribution function of the discrete distribution
cdf <- function(d, param) {
  if (any(param <= 0)) {
    NaN
  } else {
    alp <- param[1]
    th <- param[2]
    lam <- param[3]
    bet <- param[4]
    1 - ((1 + alp) * exp(-th * (d + 1)^bet) / (1 + alp * exp(bet * (d + 1))))
  }
}
#one-sample kolmogorov smirnov test
ks.d <- ks.test(d, "cdf", coef(mle.d))
My results
----- Initial parameters: -----
fcn value: -108.1909
parameter initial gradient free
alp 1.072e-05 2452.9343660 1
th 9.669e-03 -10.8047669 1
lam 3.310e-02 2.2440943 1
bet 8.571e-01 0.1406428 1
Condition number of the (active) hessian: 402174154
-----Iteration 1 -----
-----Iteration 2 -----
-----Iteration 3 -----
-----Iteration 4 -----
-----Iteration 5 -----
--------------
successive function values within tolerance limit
5 iterations
estimate: 2.61131e-05 0.008173223 0.03056733 0.8839012
Function value: -108.1732
K–S test result:
One-sample Kolmogorov-Smirnov test
data: d
D = 0.88877, p-value = 2.22e-16
alternative hypothesis: two-sided
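One thing that may be worth checking here: loglik.d() uses exp(lam * d) in the denominator, whereas cdf() uses exp(bet * (d + 1)). Assuming the survival function implied by loglik.d() is the intended one (the paper's formula is not reproduced in the text, so this is an assumption), the CDF would use lam in that position. The sketch below evaluates the K–S test for both versions at the estimates printed above; cdf_posted and cdf_lam are illustrative names.

```r
d <- c(5, 11, 21, 31, 46, 75, 98, 122, 145, 165, 196, 224, 245, 293, 321,
       330, 350, 420)
# Estimates copied from the maxLik output
est <- c(alp = 2.61131e-05, th = 0.008173223, lam = 0.03056733, bet = 0.8839012)

# CDF exactly as posted: bet appears inside the denominator's exponential
cdf_posted <- function(x, p) {
  1 - (1 + p[["alp"]]) * exp(-p[["th"]] * (x + 1)^p[["bet"]]) /
    (1 + p[["alp"]] * exp(p[["bet"]] * (x + 1)))
}

# Assumed intended form: lam in the denominator, matching loglik.d()
cdf_lam <- function(x, p) {
  1 - (1 + p[["alp"]]) * exp(-p[["th"]] * (x + 1)^p[["bet"]]) /
    (1 + p[["alp"]] * exp(p[["lam"]] * (x + 1)))
}

# One-sample K-S test for each version; est is passed on to the CDF
ks_posted <- ks.test(d, cdf_posted, est)
ks_lam <- ks.test(d, cdf_lam, est)

print(c(D_posted = unname(ks_posted$statistic), D_lam = unname(ks_lam$statistic)))
print(c(p_posted = ks_posted$p.value, p_lam = ks_lam$p.value))
```

If D_lam is much smaller than D_posted and its p-value is close to the paper's, the discrepancy would come from the bet/lam mismatch in cdf() rather than from the fitted parameters.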
[Images from the paper: the cumulative distribution function and the log-likelihood function]
The p-value reported in the paper is 1 [Image: p-value from the published paper]. That is, according to the paper this distribution fits the data well, whereas my result says it does not. For this distribution, all parameters are greater than zero.
Any help in finding where the problem lies would be appreciated.