Methods and formulas for Johnson transformed data in Normal Capability Analysis for Multiple Variables

Algorithm for Johnson transformation

The Johnson transformation optimally selects one of three families of distributions to transform the data so that they follow a normal distribution.

Johnson family	Transformation function	Range
S_B	γ + η ln [(x – ε) / (λ + ε – x)]	η, λ > 0, –∞ < γ < ∞ , –∞ < ε < ∞, ε < x < ε + λ
S_L	γ + η ln (x – ε)	η > 0, –∞ < γ < ∞, –∞ < ε < ∞, ε < x
S_U	γ + η Sinh^–1 [(x – ε) / λ], where Sinh^–1(x) = ln [x + sqrt (1 + x²)]	η, λ > 0, –∞ < γ < ∞, –∞ < ε < ∞, –∞ < x < ∞
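
As an illustration only, the three transformation functions in the table can be written as a minimal Python sketch. The function names (johnson_sb, johnson_sl, johnson_su) and the helper _check_positive are hypothetical and are not part of Minitab. The parameters γ, η, ε, and λ are assumed to be already estimated, and the range conditions from the table are enforced with simple checks.

```python
import math


def _check_positive(**params):
    """Raise ValueError if any named parameter is not strictly positive."""
    for name, value in params.items():
        if value <= 0:
            raise ValueError(f"{name} must be > 0")


def johnson_sb(x, gamma, eta, epsilon, lam):
    """S_B (bounded): gamma + eta * ln[(x - epsilon) / (lam + epsilon - x)]."""
    _check_positive(eta=eta, lam=lam)
    if not (epsilon < x < epsilon + lam):
        raise ValueError("S_B requires epsilon < x < epsilon + lam")
    return gamma + eta * math.log((x - epsilon) / (lam + epsilon - x))


def johnson_sl(x, gamma, eta, epsilon):
    """S_L (lognormal): gamma + eta * ln(x - epsilon)."""
    _check_positive(eta=eta)
    if not (x > epsilon):
        raise ValueError("S_L requires x > epsilon")
    return gamma + eta * math.log(x - epsilon)


def johnson_su(x, gamma, eta, epsilon, lam):
    """S_U (unbounded): gamma + eta * asinh[(x - epsilon) / lam]."""
    _check_positive(eta=eta, lam=lam)
    # math.asinh(u) equals ln[u + sqrt(1 + u^2)], the definition in the table.
    return gamma + eta * math.asinh((x - epsilon) / lam)
```

Each function handles a single observation; applying it to every value in a column gives the transformed data.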

The algorithm uses the following procedure:

Considers almost all potential transformation functions from the Johnson system.
Estimates the parameters in the function using the method described in Chou, et al.¹
Transforms the data using the transformation function.
Calculates Anderson-Darling statistics and the corresponding p-value for the transformed data.
Selects the transformation function that has the largest p-value, provided that this p-value is greater than the p-value criterion (default is 0.10) that you specify in the Transform dialog box. If no function has a p-value greater than the criterion, no transformation is appropriate.
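
The last three steps of the procedure can be sketched in Python as follows. This is a hedged illustration, not Minitab's implementation. It assumes that the candidate functions already carry their estimated parameters (the estimation method of Chou, et al. is not reproduced here), and it assumes that the Anderson-Darling p-value is approximated with the small-sample adjustment and piecewise formulas of D'Agostino and Stephens (1986), which the text above does not specify. The names anderson_darling_p and select_transformation are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def anderson_darling_p(values):
    """Approximate p-value of the Anderson-Darling normality test.

    Assumption: mean and standard deviation are estimated from the data and
    the p-value uses the D'Agostino-Stephens (1986) approximation.
    """
    n = len(values)
    m, s = mean(values), stdev(values)
    z = sorted((v - m) / s for v in values)
    cdf = NormalDist().cdf
    tiny = 1e-300  # guards against log(0) for extreme observations
    total = 0.0
    for i in range(n):
        lower = max(cdf(z[i]), tiny)
        upper = max(1.0 - cdf(z[n - 1 - i]), tiny)
        total += (2 * i + 1) * (math.log(lower) + math.log(upper))
    a2 = -n - total / n
    a = a2 * (1 + 0.75 / n + 2.25 / n ** 2)  # small-sample adjustment
    if a >= 0.6:
        a = min(a, 10.0)  # cap so the approximation stays decreasing
        return math.exp(1.2937 - 5.709 * a + 0.0186 * a ** 2)
    if a > 0.34:
        return math.exp(0.9177 - 4.279 * a - 1.38 * a ** 2)
    if a > 0.2:
        return 1 - math.exp(-8.318 + 42.796 * a - 59.938 * a ** 2)
    return 1 - math.exp(-13.436 + 101.14 * a - 223.73 * a ** 2)


def select_transformation(data, candidates, p_criterion=0.10):
    """Return (name, p_value) of the best candidate, or (None, None).

    candidates maps a label to a one-argument function with its parameters
    already fixed. A candidate is skipped if the data fall outside its range.
    """
    best_name, best_p = None, None
    for name, func in candidates.items():
        try:
            transformed = [func(x) for x in data]
        except ValueError:
            continue  # data outside the function's allowed range
        p = anderson_darling_p(transformed)
        if p > p_criterion and (best_p is None or p > best_p):
            best_name, best_p = name, p
    return best_name, best_p
```

A result of (None, None) corresponds to the case in which no transformation is appropriate.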

Notation

Term	Description
S_B	The Johnson family distribution with the variable bounded (B)
S_L	The Johnson family distribution with the variable lognormal (L)
S_U	The Johnson family distribution with the variable unbounded (U)

For more information on the Johnson transformation, see Chou, et al.¹ Minitab replaces the Shapiro-Wilk normality test used in that text with the Anderson-Darling test.

For information on the probability plot, percentiles, and their confidence intervals, go to Methods and formulas for distributions in Individual Distribution Identification.

¹ Y. Chou, A.M. Polansky, and R.L. Mason (1998). "Transforming Nonnormal Data to Normality in Statistical Process Control", Journal of Quality Technology, 30, April, 133–141.