Methods for Partial Least Squares Regression

Model fitting

Minitab uses the nonlinear iterative partial least squares (NIPALS) algorithm developed by Herman Wold [1] to solve problems associated with ill-conditioned data. PLS reduces the number of predictors by extracting uncorrelated components based on the covariance between the predictor and response variables. PLS is similar to principal components regression and ridge regression, but differs in its computational method.

The PLS algorithm produces a sequence of models, where each consecutive model contains one additional component. Components are calculated one at a time, starting with the standardized x- and y-matrices. Subsequent components are calculated from the x- and y-residual matrices. Iterations stop upon reaching the maximum number of components or when the x-residual matrix becomes the zero matrix. If the number of components equals the number of predictors, the PLS model equals the least squares regression model. Cross-validation is used to identify the number of components that minimizes prediction error.
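
The component-by-component procedure described above can be illustrated with a short Python sketch. This is a minimal sketch under stated assumptions, not Minitab's implementation: the function name nipals_pls, the convergence tolerance, the iteration cap, and the use of NumPy are choices made only for this example, and the coefficients are returned on the standardized scale. It also assumes the responses are not fit exactly before the last component.

```python
import numpy as np


def nipals_pls(X, Y, n_components, tol=1e-10, max_iter=500):
    """NIPALS-style PLS sketch (hypothetical helper, not Minitab's code).

    Returns (B, T): coefficients on the standardized scale and the x-scores.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if Y.ndim == 1:
        Y = Y[:, None]

    # Start from the standardized x- and y-matrices.
    E = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    F = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)

    W, P, Q, T = [], [], [], []
    for _ in range(n_components):
        # Stop early if the x-residuals are numerically the zero matrix.
        if np.allclose(E, 0.0, atol=1e-12):
            break
        u = F[:, 0].copy()                  # initial y-score
        for _ in range(max_iter):
            w = E.T @ u                     # x-weights
            w /= np.linalg.norm(w)
            t = E @ w                       # x-scores
            q = F.T @ t / (t @ t)           # y-loadings
            u_new = F @ q / (q @ q)         # updated y-scores
            converged = np.linalg.norm(u_new - u) <= tol * np.linalg.norm(u_new)
            u = u_new
            if converged:
                break
        p = E.T @ t / (t @ t)               # x-loadings
        # Deflate: later components use the residual matrices.
        E = E - np.outer(t, p)
        F = F - np.outer(t, q)
        W.append(w); P.append(p); Q.append(q); T.append(t)

    W, P, Q, T = (np.array(M).T for M in (W, P, Q, T))
    # Coefficients on the standardized scale.
    B = W @ np.linalg.solve(P.T @ W, Q.T)
    return B, T


if __name__ == "__main__":
    rng = np.random.default_rng(0)          # made-up data for illustration
    X_demo = rng.normal(size=(30, 3))
    Y_demo = X_demo @ np.array([[1.0], [-2.0], [0.3]]) + 0.1 * rng.normal(size=(30, 1))
    B_demo, T_demo = nipals_pls(X_demo, Y_demo, n_components=2)
    print(B_demo.shape, T_demo.shape)
```

When the number of components equals the number of predictors and the predictors are not collinear, the coefficients from this sketch should agree with a least squares fit to the same standardized data, which gives a simple way to check an implementation of this kind.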

PLS performs decomposition on both predictors and responses simultaneously. After Minitab determines the number of components and calculates the loadings, it calculates the regression coefficients for each predictor. For more detailed information on PLS and NIPALS, see [2], [3], and [4].
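
In a common textbook notation, the simultaneous decomposition and the resulting coefficients can be written as follows. The symbols are assumptions introduced for illustration and are not taken from Minitab's documentation. X and Y denote the standardized predictor and response matrices.

$$
X = T P^{\mathsf{T}} + E, \qquad
Y = T Q^{\mathsf{T}} + F, \qquad
B = W \left( P^{\mathsf{T}} W \right)^{-1} Q^{\mathsf{T}}
$$

Here T holds the component scores, P the x-loadings, Q the y-loadings, W the x-weights, and E and F the x- and y-residual matrices. B holds the coefficients on the standardized scale, so that the fitted responses are XB. Some presentations keep separate y-scores and an inner regression between the x- and y-scores; this compact form absorbs that step into Q.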

Cross-validation

Cross-validation calculates the predictive ability of potential models to help you determine the appropriate number of components to retain in your model. When the data contain multiple response variables, Minitab validates the components for all responses simultaneously.

Cross-validation procedure

For each potential model, Minitab:

  1. Omits one observation or group of observations, depending on the cross-validation method you use.
  2. Recalculates the model without the omitted observation or group of observations.
  3. Predicts the response (the cross-validated fitted value) for the omitted observation or group of observations using the recalculated model, then calculates the cross-validated residual.
  4. Repeats steps 1-3 until all observations have been omitted and predicted.
  5. Calculates the prediction sum of squares (PRESS) and predicted R² values.

After performing steps 1-5 for each model, Minitab selects the model with the number of components that produces the highest predicted R² and lowest PRESS. With multiple response variables, Minitab selects the model with the highest average predicted R² and lowest average PRESS.
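
The steps above can be sketched in Python for the simplest case: one response and leave-one-out omission. This is an illustrative sketch, not Minitab's implementation. The helper names pls1_fit and loo_cross_validate are hypothetical, the data are standardized again inside each fold (an assumption), and predicted R² is computed as 1 - PRESS / (total sum of squares), which is the usual definition and is assumed here.

```python
import numpy as np


def pls1_fit(X, y, n_components):
    """Single-response NIPALS PLS sketch; returns a prediction function."""
    x_mean, x_std = X.mean(axis=0), X.std(axis=0, ddof=1)
    y_mean, y_std = y.mean(), y.std(ddof=1)
    E = (X - x_mean) / x_std                # standardized predictors
    f = (y - y_mean) / y_std                # standardized response
    W, P, q = [], [], []
    for _ in range(n_components):
        w = E.T @ f                         # x-weights
        w /= np.linalg.norm(w)
        t = E @ w                           # scores
        tt = t @ t
        p = E.T @ t / tt                    # x-loadings
        c = f @ t / tt                      # y-loading (scalar)
        E = E - np.outer(t, p)              # deflate x-residuals
        f = f - c * t                       # deflate y-residuals
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    beta = W @ np.linalg.solve(P.T @ W, q)  # standardized coefficients
    return lambda X_new: y_mean + y_std * (((X_new - x_mean) / x_std) @ beta)


def loo_cross_validate(X, y, max_components):
    """Leave-one-out PRESS and predicted R-squared for 1..max_components."""
    n = len(y)
    sst = np.sum((y - y.mean()) ** 2)       # total sum of squares
    results = {}
    for a in range(1, max_components + 1):
        press = 0.0
        for i in range(n):
            keep = np.arange(n) != i                      # step 1: omit row i
            predict = pls1_fit(X[keep], y[keep], a)       # step 2: refit
            residual = y[i] - predict(X[i:i + 1])[0]      # step 3: CV residual
            press += residual ** 2                        # step 4: accumulate
        results[a] = (press, 1.0 - press / sst)           # step 5
    # Select the number of components with the highest predicted R-squared.
    best = max(results, key=lambda a: results[a][1])
    return results, best


if __name__ == "__main__":
    rng = np.random.default_rng(1)          # made-up data for illustration
    X_demo = rng.normal(size=(25, 4))
    y_demo = X_demo @ np.array([2.0, -1.0, 0.5, 0.0]) + 0.3 * rng.normal(size=25)
    table, chosen = loo_cross_validate(X_demo, y_demo, max_components=4)
    for a, (press, r2_pred) in table.items():
        print(a, round(press, 3), round(r2_pred, 3))
    print("selected components:", chosen)
```

For a single response, the highest predicted R² and the lowest PRESS always identify the same model, because predicted R² is a decreasing function of PRESS. With several responses, the same loop would average the two statistics across responses before selecting.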

[1] H. Wold (1975). "Soft Modeling by Latent Variables; the Nonlinear Iterative Partial Least Squares Approach," in Perspectives in Probability and Statistics, Papers in Honour of M.S. Bartlett, ed. J. Gani, Academic Press.
[2] P. Geladi and B. Kowalski (1986). "Partial Least-Squares Regression: A Tutorial," Analytica Chimica Acta, 185, 1-17.
[3] A. Hoskuldsson (1988). "PLS Regression Methods," Journal of Chemometrics, 2, 211-228.
[4] A. Lorber, L. Wangen, and B. Kowalski (1987). "A Theoretical Foundation for the PLS Algorithm," Journal of Chemometrics, 1, 19-31.