Example of Individual Distribution Identification

A quality engineer for a nutritional supplement company wants to assess the calcium content in vitamin capsules. The engineer collects a random sample of capsules and records their calcium content. To choose an appropriate statistical analysis, the engineer first needs to determine the distribution of the data.

The engineer performs individual distribution identification to determine which distribution best fits the data.

  1. Open the sample data, CalciumContent.MTW.
  2. Choose Stat > Quality Tools > Individual Distribution Identification.
  3. In Data are arranged as, select Single column, then enter Calcium.
  4. In Subgroup size, enter 1.
  5. Click OK.

Interpret the results

Minitab displays a probability plot and a p-value for each distribution and transformation. If a distribution is a good fit for the data (or if a transformation is effective), the points on the plot follow a straight line within the confidence bounds and the p-value is greater than the alpha level. An alpha level of 0.05 is often used. The p-value for the likelihood-ratio test (LRT) indicates whether adding a parameter to a distribution significantly improves its fit. An LRT p-value that is less than 0.05 suggests that the improvement is significant.

For these data, the 3-parameter Weibull distribution (p > 0.500) and the largest extreme value distribution (p > 0.250) are a good fit for the data. Adding a third parameter significantly improves the fit of the lognormal distribution (LRT P = 0.017), the Weibull distribution (LRT P = 0.000), the gamma distribution (LRT P = 0.006), and the loglogistic distribution (LRT P = 0.027).
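
The likelihood-ratio comparison can be sketched as follows. This is a minimal illustration, not Minitab's implementation: it assumes the test statistic is twice the gain in maximized log-likelihood and that it is referred to a chi-square distribution with one degree of freedom (one added parameter). The function name and the log-likelihood values are hypothetical.

```python
import math


def lrt_p_value(loglik_reduced, loglik_full):
    """P-value for adding one parameter, assuming a chi-square(1) reference."""
    # Test statistic: twice the gain in maximized log-likelihood.
    stat = 2.0 * (loglik_full - loglik_reduced)
    stat = max(stat, 0.0)  # guard against tiny negative values from rounding
    # For 1 degree of freedom, P(chi2 > stat) = erfc(sqrt(stat / 2)).
    return math.erfc(math.sqrt(stat / 2.0))


# Hypothetical log-likelihoods, for illustration only
# (not taken from the Minitab output).
p = lrt_p_value(loglik_reduced=-120.0, loglik_full=-117.0)
print(f"LRT p-value = {p:.3f}")
```

A p-value below the chosen alpha level would suggest that the extra parameter improves the fit.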

The Box-Cox transformation (p = 0.324) and the Johnson transformation (p = 0.986) are effective for these data. After the transformation, the normal distribution provides a good fit for the transformed values.
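
The decision rule described above can be sketched in a few lines. This is an illustrative helper, not part of Minitab: the function name and the data structure are hypothetical, while the p-values are copied from the Goodness of Fit Test output. Values reported only as bounds (for example, >0.500) are stored with their bound type, and entries shown as * are treated as unavailable.

```python
ALPHA = 0.05

# Goodness-of-fit p-values as reported in the output.
# Each entry is (bound_type, value); None means the p-value was not reported (*).
reported_p = {
    "Normal": ("=", 0.046),
    "Box-Cox Transformation": ("=", 0.324),
    "Lognormal": ("=", 0.085),
    "3-Parameter Lognormal": None,
    "Exponential": ("<", 0.003),
    "2-Parameter Exponential": ("=", 0.014),
    "Weibull": ("<", 0.010),
    "3-Parameter Weibull": (">", 0.500),
    "Smallest Extreme Value": ("<", 0.010),
    "Largest Extreme Value": (">", 0.250),
    "Gamma": ("=", 0.071),
    "3-Parameter Gamma": None,
    "Logistic": ("=", 0.034),
    "Loglogistic": ("=", 0.050),
    "3-Parameter Loglogistic": None,
    "Johnson Transformation": ("=", 0.986),
}


def exceeds_alpha(entry, alpha=ALPHA):
    """Return True/False when the comparison is conclusive, else None."""
    if entry is None:
        return None
    kind, value = entry
    if kind == ">":  # true p-value lies above `value`
        return True if value >= alpha else None
    if kind == "<":  # true p-value lies below `value`
        return False if value <= alpha else None
    return value > alpha


not_rejected = [name for name, e in reported_p.items() if exceeds_alpha(e) is True]
print(not_rejected)
```

With alpha = 0.05, the lognormal and gamma fits also clear the threshold, though only narrowly, and the loglogistic p-value equals alpha rather than exceeding it. The probability plots remain the other half of the assessment.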

2-Parameter Exponential

* WARNING * Variance/Covariance matrix of estimated parameters does not exist. The threshold
parameter is assumed fixed when calculating confidence intervals.

3-Parameter Gamma

* WARNING * Variance/Covariance matrix of estimated parameters does not exist. The threshold
parameter is assumed fixed when calculating confidence intervals.

Descriptive Statistics

 N  N*    Mean    StDev  Median  Minimum  Maximum  Skewness   Kurtosis
50   0  50.782  2.76477    50.4     46.8     58.1  0.644923  -0.287071

Box-Cox transformation: λ = -4
Johnson transformation function:
0.804604 + 0.893699 × Ln( ( X - 46.2931 ) / ( 59.8636 - X ) )
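
As a minimal sketch, the two reported transformations can be applied to a single measurement as follows. The function names are hypothetical. The Box-Cox form is an assumption: it uses the simple power transformation X^λ for λ ≠ 0 (natural log for λ = 0), which is consistent with the near-zero location and scale reported for the transformed data. The Johnson constants are copied from the transformation function above.

```python
import math

BOX_COX_LAMBDA = -4  # lambda reported in the output


def box_cox(x, lam=BOX_COX_LAMBDA):
    """Power transformation (assumed form): x**lam, or ln(x) when lam == 0."""
    if x <= 0:
        raise ValueError("Box-Cox requires positive data")
    return math.log(x) if lam == 0 else x ** lam


def johnson(x):
    """Johnson transformation with the constants reported in the output."""
    lower, upper = 46.2931, 59.8636
    if not lower < x < upper:
        raise ValueError("x must lie strictly between 46.2931 and 59.8636")
    return 0.804604 + 0.893699 * math.log((x - lower) / (upper - x))


median = 50.4  # sample median from the descriptive statistics
print(box_cox(median), johnson(median))
```

The Johnson function is defined only for values strictly between 46.2931 and 59.8636, a range that contains the observed minimum (46.8) and maximum (58.1).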

Goodness of Fit Test

Distribution                   AD        P    LRT P
Normal                      0.754    0.046
Box-Cox Transformation      0.414    0.324
Lognormal                   0.650    0.085
3-Parameter Lognormal       0.341        *    0.017
Exponential                20.614   <0.003
2-Parameter Exponential     1.684    0.014    0.000
Weibull                     1.442   <0.010
3-Parameter Weibull         0.230   >0.500    0.000
Smallest Extreme Value      1.656   <0.010
Largest Extreme Value       0.394   >0.250
Gamma                       0.702    0.071
3-Parameter Gamma           0.268        *    0.006
Logistic                    0.726    0.034
Loglogistic                 0.659    0.050
3-Parameter Loglogistic     0.432        *    0.027
Johnson Transformation      0.124    0.986
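
For context on the AD values above, the following is a minimal sketch of the standard Anderson-Darling statistic for a fitted normal distribution, using only the Python standard library. It is an illustration under stated assumptions, not Minitab's implementation: Minitab may apply adjustments that this sketch omits, the function name is hypothetical, and the sample values are made-up placeholders rather than the CalciumContent.MTW data.

```python
import math
from statistics import NormalDist, mean, stdev


def anderson_darling_normal(sample):
    """Anderson-Darling statistic against a normal fitted by mean and sample SD."""
    xs = sorted(sample)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    total = 0.0
    for i, x in enumerate(xs, start=1):
        f_low = fitted.cdf(x)           # F at the i-th smallest value
        f_high = fitted.cdf(xs[n - i])  # F at the i-th largest value
        total += (2 * i - 1) * (math.log(f_low) + math.log(1.0 - f_high))
    return -n - total / n


# Placeholder measurements, for illustration only.
placeholder = [47.1, 48.3, 49.0, 49.6, 50.2, 50.9, 51.5, 52.4, 53.8, 56.0]
ad = anderson_darling_normal(placeholder)
print(f"AD = {ad:.3f}")
```

Smaller values indicate that the fitted distribution follows the data more closely, which is why the candidates with the lowest AD values in the table tend to have the largest p-values.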

ML Estimates of Distribution Parameters

Distribution                Location      Shape     Scale  Threshold
Normal*                     50.78200              2.76477
Box-Cox Transformation*      0.00000              0.00000
Lognormal*                   3.92612              0.05368
3-Parameter Lognormal        1.69295              0.46849   44.74011
Exponential                                      50.78200
2-Parameter Exponential                           4.06326   46.71873
Weibull                                17.82470  52.13681
3-Parameter Weibull                     1.47605   4.53647   46.66579
Smallest Extreme Value      52.22257              2.95894
Largest Extreme Value       49.50370              2.16992
Gamma                                 351.04421   0.14466
3-Parameter Gamma                       2.99218   1.63698   45.88376
Logistic                    50.57182              1.59483
Loglogistic                  3.92259              0.03121
3-Parameter Loglogistic      1.54860              0.32763   45.46180
Johnson Transformation*      0.02897              0.97293

* Scale: Adjusted ML estimate