Methods and formulas for Johnson Transformation

Algorithm for Johnson transformation

The Johnson transformation selects the best transformation function from three families of distributions in the Johnson system to transform the data to follow a normal distribution.

Johnson family	Transformation function	Range
S_B	γ + η ln [(x – ε) / (λ + ε – x)]	η, λ > 0, –∞ < γ < ∞ , –∞ < ε < ∞, ε < x < ε + λ
S_L	γ + η ln (x – ε)	η > 0, –∞ < γ < ∞, –∞ < ε < ∞, ε < x
S_U	γ + η Sinh^–1 [(x – ε) / λ] , where Sinh^–1(u) = ln [u + sqrt (1 + u²)]	η, λ > 0, –∞ < γ < ∞, –∞ < ε < ∞, –∞ < x < ∞
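The three transformation functions in the table can be written out directly. The following is a minimal Python sketch, not Minitab's implementation. The function names and the argument names gamma, eta, epsilon, and lam (for λ) are assumptions chosen for illustration. Each function checks the parameter and range conditions from the table's Range column and raises ValueError when they are violated.

```python
import math


def johnson_sb(x, gamma, eta, epsilon, lam):
    """S_B (bounded): gamma + eta * ln[(x - epsilon) / (lam + epsilon - x)]."""
    if eta <= 0 or lam <= 0:
        raise ValueError("S_B requires eta > 0 and lam > 0")
    if not (epsilon < x < epsilon + lam):
        raise ValueError("S_B requires epsilon < x < epsilon + lam")
    return gamma + eta * math.log((x - epsilon) / (lam + epsilon - x))


def johnson_sl(x, gamma, eta, epsilon):
    """S_L (lognormal): gamma + eta * ln(x - epsilon)."""
    if eta <= 0:
        raise ValueError("S_L requires eta > 0")
    if not (x > epsilon):
        raise ValueError("S_L requires x > epsilon")
    return gamma + eta * math.log(x - epsilon)


def johnson_su(x, gamma, eta, epsilon, lam):
    """S_U (unbounded): gamma + eta * asinh[(x - epsilon) / lam]."""
    if eta <= 0 or lam <= 0:
        raise ValueError("S_U requires eta > 0 and lam > 0")
    # math.asinh(u) equals ln[u + sqrt(1 + u^2)], as in the table.
    return gamma + eta * math.asinh((x - epsilon) / lam)
```

For example, johnson_sb(0.5, 0.0, 1.0, 0.0, 1.0) evaluates ln(0.5 / 0.5) and returns 0.0, because x sits at the midpoint of the interval (ε, ε + λ).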

The algorithm uses the following procedure:

Considers almost all potential transformation functions from the Johnson system.
Estimates the parameters in the function using the method described in Chou, et al.¹
Transforms the data using the transformation function.
Calculates the Anderson-Darling statistic and the corresponding p-value for the transformed data.
Selects the transformation function that has the largest p-value, provided that this p-value is greater than the p-value criterion (default is 0.10) that you specify in the Transform dialog box. If no transformation function has a p-value greater than the criterion, no transformation is appropriate.
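The last three steps of the procedure can be sketched in Python as follows. This is an illustration under stated assumptions, not Minitab's implementation. The names anderson_darling_normal and select_transformation are hypothetical. The candidate functions are assumed to already have their parameters estimated (the estimation method of Chou, et al. is not reproduced here). The p-value uses the commonly cited D'Agostino and Stephens approximation for the Anderson-Darling statistic with estimated mean and standard deviation, which may differ from the calculation that Minitab performs.

```python
import math
from statistics import NormalDist, mean, stdev


def anderson_darling_normal(values):
    """Return (A2, p_value) for a normality test with estimated mean and sd."""
    n = len(values)
    xs = sorted(values)
    m, s = mean(xs), stdev(xs)
    std_normal = NormalDist()
    # Clip the fitted CDF away from 0 and 1 so the logarithms stay finite.
    cdf = [min(max(std_normal.cdf((x - m) / s), 1e-12), 1 - 1e-12) for x in xs]
    total = sum(
        (2 * i + 1) * (math.log(cdf[i]) + math.log(1 - cdf[n - 1 - i]))
        for i in range(n)
    )
    a2 = -n - total / n
    # Small-sample adjustment, then the piecewise p-value approximation
    # (assumption: D'Agostino and Stephens coefficients).
    a = a2 * (1 + 0.75 / n + 2.25 / n ** 2)
    if a > 10:
        p = 0.0  # safeguard of this sketch: the approximation is below 1e-20 here
    elif a >= 0.6:
        p = math.exp(1.2937 - 5.709 * a + 0.0186 * a ** 2)
    elif a > 0.34:
        p = math.exp(0.9177 - 4.279 * a - 1.38 * a ** 2)
    elif a > 0.2:
        p = 1 - math.exp(-8.318 + 42.796 * a - 59.938 * a ** 2)
    else:
        p = 1 - math.exp(-13.436 + 101.14 * a - 223.73 * a ** 2)
    return a2, p


def select_transformation(data, candidates, p_criterion=0.10):
    """Pick the candidate with the largest p-value above p_criterion.

    candidates maps a label to a function of one data value (hypothetical
    interface). Returns (label, p_value), or (None, None) if no candidate
    exceeds the criterion.
    """
    best_label, best_p = None, p_criterion
    for label, transform in candidates.items():
        try:
            transformed = [transform(x) for x in data]
            _, p = anderson_darling_normal(transformed)
        except (ValueError, ZeroDivisionError):
            continue  # data outside the function's range, or constant output
        if p > best_p:
            best_label, best_p = label, p
    if best_label is None:
        return None, None
    return best_label, best_p


# Usage: compare an S_L candidate (gamma=0, eta=1, epsilon=0) with no transformation.
n = 50
data = [math.exp(NormalDist().inv_cdf((i + 0.5) / n)) for i in range(n)]
candidates = {
    "none": lambda x: x,
    "S_L": lambda x: 0.0 + 1.0 * math.log(x - 0.0),
}
label, p_value = select_transformation(data, candidates)
print(label, p_value)
```

Because the example data are exponentiated normal quantiles, the S_L candidate maps them back to normal quantiles and is selected, while the untransformed data fall below the 0.10 criterion.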

Notation

Term	Description
S_B	The Johnson family distribution with the variable bounded (B)
S_L	The Johnson family distribution with the variable lognormal (L)
S_U	The Johnson family distribution with the variable unbounded (U)

For more information on the Johnson transformation, see Chou, et al.¹ Minitab replaces the Shapiro-Wilk normality test used in that article with the Anderson-Darling test.

For information on the probability plot, percentiles, and their confidence intervals, go to Methods and formulas for distributions in Individual Distribution Identification.

¹ Y. Chou, A.M. Polansky, and R.L. Mason (1998). "Transforming Nonnormal Data to Normality in Statistical Process Control", Journal of Quality Technology, 30, April, 133–141.