How to compare appropriateness of fitted Weibull and lognormal models via the likelihood ratio test

I'm working with some data that has historically been modeled by Weibull and lognormal distributions. I have used maximum likelihood estimation to estimate the parameters of both distributions for my data. Now I wish to determine which model is more appropriate, i.e. has the better fit. I'd like to use the likelihood ratio to do this, and have been looking at the functions negloglik() and lratiotest() (the latter from the Econometrics Toolbox) for this purpose.
Currently I have calculated the negative log-likelihood for both fitted models via the negloglik() function and have also tried to use these values as the unrestricted and restricted model inputs to the lratiotest() function. Due to my naivety in this field and the lack of similar examples, I am unsure if I'm using these tools correctly. The stripped-down code is provided below. For the moment, I'm using randomly generated data to prove out the code. I expect, as the data is generated from a Weibull distribution, that I should reject the null hypothesis (that the restricted Weibull-fitted distribution is more appropriate) and accept the alternative (that the unrestricted lognormal-fitted distribution is more appropriate). I've noticed that, even when the MLE parameters for the estimated Weibull distribution are very close to the population parameters, I still reject the null. I believe the correct degrees of freedom for the test is 2, but I may be wrong there too.
Thanks ahead of time for any assistance!
scl = 125e3; shp = 2.5; n = 100;            % Weibull scale, shape, and sample size
x = wblrnd(scl, shp, n, 1);                 % generate Weibull-distributed data
wblPD = fitdist(x, 'weibull');              % fit Weibull by maximum likelihood
lnPD = fitdist(x, 'lognormal');             % fit lognormal by maximum likelihood
wblPms = mle(x, 'distribution', 'weibull');
lnPms = mle(x, 'distribution', 'lognormal');
rLogL = negloglik(wblPD)                    % negative log-likelihood of Weibull fit (treated as restricted)
uLogL = negloglik(lnPD)                     % negative log-likelihood of lognormal fit (treated as unrestricted)
dof = 2;
[h, pValue] = lratiotest(uLogL, rLogL, dof)

Accepted Answer

Jeff Miller on 28 May 2020
Unfortunately the likelihood ratio test cannot be used this way. It is only for comparing nested models (i.e., one model has to be a logical subset of the other--for example, with some parameter constrained to a particular value in the subset model but free to take on any value in the larger model). The Weibull and lognormal distributions do not have this subset relationship, so they can't be compared with an LRT. For non-nested models, you probably have to make comparisons using AIC or some such model goodness measure.
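For instance, a minimal sketch of that AIC comparison, assuming the wblPD and lnPD objects from the question's code (both distributions have two estimated parameters):
k = 2;                                   % number of estimated parameters in each model
wblAIC = 2*k + 2*negloglik(wblPD)        % AIC = 2k + 2*(negative log-likelihood)
lnAIC  = 2*k + 2*negloglik(lnPD)
% The model with the smaller AIC is preferred; for data generated from a Weibull,
% that should usually turn out to be the Weibull fit.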
I don't really understand what you are saying about rejecting the null. It seems to me that if you generate the data from a Weibull then you wouldn't expect to reject the Weibull in favor of some other distribution like the lognormal. (Well, maybe if the other distribution were much more flexible than the Weibull, but that doesn't apply to the lognormal AFAIK).
By the way, these statements are redundant:
wblPD = fitdist(x, 'weibull');
lnPD = fitdist(x, 'lognormal');
wblPms = mle(x, 'distribution', 'weibull');
lnPms = mle(x, 'distribution', 'lognormal');
The outputs of fitdist are distribution objects containing the very same parameter estimates that you get from mle.
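As a quick illustration (a sketch using the variables from the question's code; the fitted Weibull object stores the scale in property A and the shape in property B):
wblPD = fitdist(x, 'weibull');              % fitted object already holds the MLEs
wblPms = mle(x, 'distribution', 'weibull'); % returns the same [scale, shape] pair
[wblPD.A, wblPD.B]                          % scale and shape from the fitted object
wblPms                                      % identical estimates from mle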
I hope some of this helps you to move forward.
  2 Comments
Jaime Berez on 28 May 2020
Hi Jeff, thanks for the reply!
You are certainly right that mle and fitdist work the same way; looks like I just haven't pared down my code since first writing it.
I've noted what you said about nested models in the documentation, and I see what you mean about how that doesn't apply here. Further research also highlights this nested-model requirement... looks like this isn't the right tool for this situation.
That said, it looks like the test could be applied to comparing distributions from the same family (i.e. two Weibull distributions fit to two different groups of data) to see if there is a statistical difference between the fitted distributions. This is also of interest to me. See example iii) on this page from NIST:
In this case, if I calculated the negative log-likelihood for each fitted distribution and compared them with the likelihood ratio test and a DoF of 2, would you agree that this adheres to the restrictions of the test? I've tried doing that, and comparing it to what I believe is the correct manual calculation of the likelihood ratio test statistic. I'd expect lratiotest and my manual calculation to produce the same p-value, but they do not... do you see any issues?
scl = 125e3; shp = 2.5; n = 100;
x = wblrnd(scl, shp, n, 1);
wblPD = fitdist(x, 'weibull');
x2 = wblrnd(scl + 0.2*scl, shp + 0.2*shp, n, 1);
wblPD2 = fitdist(x2, 'weibull');
rLogL = negloglik(wblPD)
uLogL = negloglik(wblPD2)
dof = 2;
[h, pValue] = lratiotest(uLogL, rLogL, dof)
ratio = 2*(rLogL - uLogL) % 2*(difference in negative log-likelihoods)
chi2 = ratio % Should calculate chi2 stat for this test
p = chi2cdf(ratio, n) % Should match pVal reported by lratiotest
chi2Crit = chi2inv(0.95, n) % Critical chi2 value for significance
Jeff Miller on 29 May 2020
No, sorry, I don't see how this comparison of two distributions fits into the nested models scenario either.
Even if it did, I think this line
p = chi2cdf(ratio, n) % Should match pVal reported by lratiotest
should instead be
p = 1 - chi2cdf(ratio, n) % Should match pVal reported by lratiotest
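Regardless of whether the test itself is appropriate here, a sketch of how the manual calculation could be made consistent with lratiotest (note that lratiotest expects maximized log-likelihoods, whereas negloglik returns their negatives, and the chi-square degrees of freedom should be dof rather than n):
uLL = -uLogL; rLL = -rLogL;              % convert negative log-likelihoods to log-likelihoods
[h, pValue] = lratiotest(uLL, rLL, dof)
ratio = 2*(uLL - rLL);                   % likelihood ratio statistic
p = 1 - chi2cdf(ratio, dof)              % should then agree with pValue above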



Release: R2019b
