Methods and formulas for Johnson transformed data in Normal Capability Analysis for Multiple Variables

Algorithm for Johnson transformation

The Johnson transformation optimally selects one of three families of distributions to transform the data so that it follows a normal distribution.

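| Johnson family | Transformation function | Range |
|---|---|---|
| SB | $\gamma + \eta \ln\left[(x - \varepsilon) / (\lambda + \varepsilon - x)\right]$ | $\eta, \lambda > 0$, $-\infty < \gamma < \infty$, $-\infty < \varepsilon < \infty$, $\varepsilon < x < \varepsilon + \lambda$ |
| SL | $\gamma + \eta \ln(x - \varepsilon)$ | $\eta > 0$, $-\infty < \gamma < \infty$, $-\infty < \varepsilon < \infty$, $\varepsilon < x$ |
| SU | $\gamma + \eta \sinh^{-1}\left[(x - \varepsilon) / \lambda\right]$, where $\sinh^{-1}(x) = \ln\left[x + \sqrt{1 + x^2}\right]$ | $\eta, \lambda > 0$, $-\infty < \gamma < \infty$, $-\infty < \varepsilon < \infty$, $-\infty < x < \infty$ |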
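
The three transformation functions can be sketched in Python as follows. This is a minimal illustration only, not Minitab's implementation; the function names and the argument names `gamma`, `eta`, `epsilon`, and `lam` (for γ, η, ε, and λ) are assumptions chosen for this example. Each function checks the range conditions listed for its family.

```python
import math


def johnson_sb(x, gamma, eta, epsilon, lam):
    """SB (bounded) family: requires eta, lam > 0 and epsilon < x < epsilon + lam."""
    if eta <= 0 or lam <= 0:
        raise ValueError("eta and lam must be positive")
    if not (epsilon < x < epsilon + lam):
        raise ValueError("x must lie strictly between epsilon and epsilon + lam")
    return gamma + eta * math.log((x - epsilon) / (lam + epsilon - x))


def johnson_sl(x, gamma, eta, epsilon):
    """SL (lognormal) family: requires eta > 0 and x > epsilon."""
    if eta <= 0:
        raise ValueError("eta must be positive")
    if x <= epsilon:
        raise ValueError("x must be greater than epsilon")
    return gamma + eta * math.log(x - epsilon)


def johnson_su(x, gamma, eta, epsilon, lam):
    """SU (unbounded) family: requires eta, lam > 0; x can be any real number."""
    if eta <= 0 or lam <= 0:
        raise ValueError("eta and lam must be positive")
    # math.asinh(u) equals ln(u + sqrt(1 + u**2)), the Sinh^-1 defined above.
    return gamma + eta * math.asinh((x - epsilon) / lam)
```

For example, `johnson_su(x, 0.0, 1.0, 0.0, 1.0)` applies the SU function with γ = 0, η = 1, ε = 0, and λ = 1 to a single observation `x`.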

The algorithm uses the following procedure:

  1. Considers almost all potential transformation functions from the Johnson system.
  2. Estimates the parameters in the function using the method described in Chou, et al. [1]
  3. Transforms the data using the transformation function.
  4. Calculates the Anderson-Darling statistic and the corresponding p-value for the transformed data.
  5. Selects the transformation function that has the largest p-value, provided that this p-value is greater than the p-value criterion (default is 0.10) that you specify in the Transform dialog box. If no function meets the criterion, no transformation is appropriate.
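
Steps 3 through 5 of the procedure can be sketched in Python as follows. This is an illustration under stated assumptions, not Minitab's implementation. The parameter estimation of step 2 is not reproduced, so each candidate is assumed to be an already parameterized function. The p-value uses the D'Agostino and Stephens approximation for the Anderson-Darling normality test with estimated mean and standard deviation, which may differ from the calculation that Minitab uses. The names `anderson_darling_normal` and `select_transformation` are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def anderson_darling_normal(values):
    """Return (A2, approximate p-value) for a test of normality."""
    n = len(values)
    y = sorted(values)
    dist = NormalDist(mean(y), stdev(y))
    # Clamp CDF values away from 0 and 1 so that the logarithms stay finite.
    cdf = [min(max(dist.cdf(v), 1e-12), 1 - 1e-12) for v in y]
    total = sum(
        (2 * i + 1) * (math.log(cdf[i]) + math.log(1 - cdf[n - 1 - i]))
        for i in range(n)
    )
    a2 = -n - total / n
    # Small-sample adjustment, then the piecewise p-value approximation
    # (assumed here; D'Agostino and Stephens).
    a = a2 * (1 + 0.75 / n + 2.25 / n ** 2)
    if a >= 0.6:
        p = math.exp(1.2937 - 5.709 * a + 0.0186 * a ** 2)
    elif a > 0.34:
        p = math.exp(0.9177 - 4.279 * a - 1.38 * a ** 2)
    elif a > 0.2:
        p = 1 - math.exp(-8.318 + 42.796 * a - 59.938 * a ** 2)
    else:
        p = 1 - math.exp(-13.436 + 101.14 * a - 223.73 * a ** 2)
    return a2, p


def select_transformation(data, candidates, p_criterion=0.10):
    """Return (name, p-value) of the best candidate, or (None, None).

    candidates maps a label to a one-argument transformation function.
    """
    best_name, best_p = None, None
    for name, transform in candidates.items():
        try:
            transformed = [transform(x) for x in data]
        except (ValueError, ZeroDivisionError):
            continue  # the data fall outside this function's range
        _, p = anderson_darling_normal(transformed)
        # Keep a candidate only if it beats the criterion and the best so far.
        if p > p_criterion and (best_p is None or p > best_p):
            best_name, best_p = name, p
    return best_name, best_p
```

A call such as `select_transformation(data, {"SL": lambda x: math.log(x), "SU": lambda x: math.asinh(x)})` returns the label and p-value of the candidate with the largest p-value above the criterion, or `(None, None)` when no candidate qualifies, which corresponds to the case where no transformation is appropriate.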

Notation

| Term | Description |
|---|---|
| SB | The Johnson family distribution with the variable bounded (B) |
| SL | The Johnson family distribution with the variable lognormal (L) |
| SU | The Johnson family distribution with the variable unbounded (U) |

For more information on the Johnson transformation, see Chou, et al. [1] Minitab replaces the Shapiro-Wilk normality test used in that text with the Anderson-Darling test.

For information on the probability plot, percentiles, and their confidence intervals, go to Methods and formulas for distributions in Individual Distribution Identification.

[1] Y. Chou, A.M. Polansky, and R.L. Mason (1998). "Transforming Nonnormal Data to Normality in Statistical Process Control", Journal of Quality Technology, 30, April, 133–141.