Using the p-value for a goodness-of-fit test to choose a distribution or transformation

In This Topic

Interpret a goodness-of-fit test and choose a distribution
Choose between a 3-parameter and a 2-parameter distribution
Why are some p-values given as an approximation rather than an exact value?
Why are some p-values shown as asterisks in the output?

Interpret a goodness-of-fit test and choose a distribution

Choose a significance level, α, before you conduct your test. A p-value (P) less than α indicates that the data do not follow the distribution being tested.

Minitab performs goodness-of-fit tests on your data for a variety of distributions and estimates their parameters. Choose the distribution that best fits your data and is most appropriate for your analysis. If more than one distribution fits your data, select the distribution with the largest p-value. If no distribution fits your data, consider a nonparametric analysis.

Look only at the basic distributions first (do not consider distributions with threshold parameters, such as the 2-parameter exponential or 3-parameter lognormal).
Determine which distributions have the highest p-values. If no distribution has a p-value greater than your α (for example, 0.05), then none of the distributions adequately fits your data.
Consider the 2-parameter and 3-parameter variations of the distributions that seem to be adequate.
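
The selection rule in these steps can be sketched in a few lines of Python. This is only an illustration, not Minitab functionality: the function name `choose_distribution` and the p-values in the example are hypothetical, and the sketch assumes you have already read the p-values from the output.

```python
def choose_distribution(p_values, alpha=0.05):
    """Return (name, p_value) for the adequate fit with the largest p-value.

    p_values maps a distribution name to its goodness-of-fit p-value.
    Returns None when no p-value is greater than alpha, in which case
    none of the candidate distributions adequately fits the data.
    """
    # Keep only the distributions that the test does not reject.
    adequate = {name: p for name, p in p_values.items() if p > alpha}
    if not adequate:
        return None
    # Among the adequate fits, prefer the largest p-value.
    best = max(adequate, key=adequate.get)
    return best, adequate[best]


# Hypothetical p-values, for illustration only.
example = {
    "Normal": 0.012,
    "Lognormal": 0.231,
    "Weibull": 0.087,
    "Exponential": 0.003,
}
result = choose_distribution(example)
print(result)
```

A result of None corresponds to the case in which you should consider a nonparametric analysis. The sketch does not break ties among very close p-values; that decision still requires judgment.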

Among very close p-values, select one of the following:

A distribution that you have used previously for a similar data set.
A distribution based on capability statistics.
The distribution that is most conservative.

Choose between a 3-parameter and a 2-parameter distribution

For every 3-parameter distribution except the Weibull distribution, there is no established method for calculating the p-value, so you must use the likelihood-ratio test (LRT).

First, examine the p-value for the corresponding 2-parameter distribution to evaluate the fit.
Next, examine the LRT p-value for the 3-parameter distribution to determine whether the 3-parameter distribution is significantly better than the 2-parameter distribution. For Individual Distribution Identification, a likelihood-ratio test p-value (LRT P) less than α indicates that, for distributions that have an optional extra parameter, adding this extra parameter significantly improves the distribution's fit. For example, LRT P helps you choose between the exponential distribution (which has 1 parameter) and the 2-parameter exponential distribution, or between the Weibull distribution (which has 2 parameters) and the 3-parameter Weibull distribution.
Also, a visual inspection of the probability plot, combined with the Anderson-Darling value, can help indicate whether the distribution is a good fit. However, it may be better to choose a distribution that has a calculated p-value and a similar Anderson-Darling value.
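
To make the idea of a likelihood-ratio test concrete, the following Python sketch compares the exponential distribution with the 2-parameter exponential distribution, for which the maximum likelihood estimates have simple closed forms. It rests on assumptions that do not come from Minitab: the function name `lrt_exponential_threshold` and the data are hypothetical, and the p-value uses a chi-square reference distribution with 1 degree of freedom, which is a common approximation but may differ from the method Minitab uses to compute LRT P.

```python
import math


def lrt_exponential_threshold(data):
    """Likelihood-ratio test: exponential vs. 2-parameter exponential.

    Returns (statistic, p_value). The p-value assumes a chi-square
    reference distribution with 1 degree of freedom (an approximation).
    """
    n = len(data)
    mean = sum(data) / n
    threshold = min(data)  # MLE of the threshold parameter
    if threshold < 0:
        raise ValueError("the exponential distribution requires data >= 0")
    if mean == threshold:
        raise ValueError("all values are equal; the scale cannot be estimated")

    # Maximized log-likelihoods: the scale MLE is the mean (1 parameter)
    # or the mean minus the threshold (2 parameters).
    loglik_1 = -n * math.log(mean) - n
    loglik_2 = -n * math.log(mean - threshold) - n

    statistic = 2.0 * (loglik_2 - loglik_1)
    # Chi-square (1 df) upper-tail probability: P(X > s) = erfc(sqrt(s / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical data whose smallest value is well above zero.
sample = [12.1, 10.4, 15.8, 11.2, 19.5, 10.9, 13.3, 24.0]
stat, lrt_p = lrt_exponential_threshold(sample)
print(f"LRT statistic = {stat:.2f}, LRT P = {lrt_p:.6f}")
```

In this sketch, an LRT P less than α suggests that the threshold parameter significantly improves the fit, which matches the interpretation described above.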

Why are some p-values given as an approximation rather than an exact value?

For some distributions, a closed-form expression for the p-value exists, so an exact p-value can be obtained. For certain other distributions, no closed-form expression exists, but tables of p-value ranges, obtained through simulation studies, are available. For these distributions, Minitab can report only a lower bound, an upper bound, or both for the p-value.

Why are some p-values shown as asterisks in the output?

An asterisk is displayed instead of a p-value for the 3-parameter lognormal, 3-parameter gamma, and 3-parameter loglogistic distributions. The asterisk indicates that Minitab cannot calculate a p-value for that distribution.