X- and y-statistics for Partial Least Squares Regression

Find definitions and interpretation guidance for every statistic in the X- and y-statistics table.

X-calculated values

The x-calculated values are linear combinations of the x-scores. The x-calculated values contain the variance in the terms explained by the PLS regression model. Observations with relatively small x-calculated values are outliers in the x-space and are not well explained by the model.

The x-calculated matrix, similar to the original x-matrix, is an n x p matrix, where n = number of observations and p = number of terms. The x-calculated values are on the same scale as the predictors.

If the number of components equals the number of terms, then each x-calculated value equals the corresponding original x-value.
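The relationship between x-scores, x-loadings, and x-calculated values can be sketched in NumPy. This is a minimal illustration that assumes a single response and a common NIPALS variant of the algorithm. The helper `pls1_nipals`, the function `x_calculated`, and the simulated data are hypothetical, and the normalization used by any particular software package may differ.

```python
import numpy as np

def pls1_nipals(X, y, m):
    """Hypothetical helper: NIPALS PLS for one response (assumed variant)."""
    E = X - X.mean(axis=0)          # centered predictors, deflated each step
    f = y - y.mean()                # centered response, deflated each step
    n, p = X.shape
    T, P, W, q = np.zeros((n, m)), np.zeros((p, m)), np.zeros((p, m)), np.zeros(m)
    for a in range(m):
        w = E.T @ f
        w /= np.linalg.norm(w)      # x-weight, scaled to unit length
        t = E @ w                   # x-score
        tt = t @ t
        P[:, a] = E.T @ t / tt      # x-loading
        q[a] = f @ t / tt           # y-loading (a scalar for one response)
        T[:, a], W[:, a] = t, w
        E = E - np.outer(t, P[:, a])
        f = f - q[a] * t
    return T, P, W, q

def x_calculated(X, y, m):
    """x-calculated = scores times loadings, returned to the predictor scale."""
    T, P, _, _ = pls1_nipals(X, y, m)
    return T @ P.T + X.mean(axis=0)

# Simulated data (assumption): n = 20 observations, p = 4 terms.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + 0.1 * rng.normal(size=20)

x_calc_2 = x_calculated(X, y, 2)    # fewer components than terms
x_calc_4 = x_calculated(X, y, 4)    # as many components as terms

print(x_calc_2.shape)               # n x p, like the original x-matrix
print(np.allclose(x_calc_4, X))     # compares full-component result with X
```

With fewer components than terms, the x-calculated matrix only approximates the original x-matrix. The gap between the two is the x-residual matrix.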

X-loadings

The x-loadings are the linear coefficients that link the terms to the x-scores. The x-loadings indicate the importance of the corresponding term to the mth component. X-loadings, which are similar to eigenvectors in principal components analysis, form a p x m matrix, where p = number of terms and m = number of components.

X-residuals

The x-residuals contain the variance in the predictors not explained by the PLS regression model. Observations with relatively large x-residuals are outliers in the x-space, indicating that they are not well explained by the model.

The x-residuals are the differences between the actual values of each term and the x-calculated values, and are on the same scale as the original predictors. The x-residual matrix, similar to the original x-matrix, is an n x p matrix, where n = number of observations and p = number of terms.

X-scores

The x-scores are linear combinations of the terms in the model. The x-scores, which are similar to principal component scores, form an n x m matrix of uncorrelated columns, where n = number of observations and m = number of components. The x-scores are projections of the observations on the PLS regression components. PLS regression fits the x-scores, which replace the original terms in the model, using least squares estimation.
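The following sketch checks two of the properties described above: the x-score columns are uncorrelated, and regressing the response on the x-scores by least squares reproduces the coefficients that the algorithm computes along the way. It assumes a single response and a common NIPALS variant. The helper `pls1_nipals` and the simulated data are hypothetical.

```python
import numpy as np

def pls1_nipals(X, y, m):
    """Hypothetical helper: NIPALS PLS for one response (assumed variant)."""
    E = X - X.mean(axis=0)
    f = y - y.mean()
    n, p = X.shape
    T, P, W, q = np.zeros((n, m)), np.zeros((p, m)), np.zeros((p, m)), np.zeros(m)
    for a in range(m):
        w = E.T @ f
        w /= np.linalg.norm(w)
        t = E @ w
        tt = t @ t
        P[:, a] = E.T @ t / tt
        q[a] = f @ t / tt
        T[:, a], W[:, a] = t, w
        E = E - np.outer(t, P[:, a])
        f = f - q[a] * t
    return T, P, W, q

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + 0.1 * rng.normal(size=20)

T, P, W, q = pls1_nipals(X, y, 3)   # n x m score matrix with m = 3

# Off-diagonal entries of T'T measure the overlap between score columns.
gram = T.T @ T
off_diag = gram - np.diag(np.diag(gram))
print(np.abs(off_diag).max())

# Least squares fit of the centered response on the x-scores.
q_lstsq, *_ = np.linalg.lstsq(T, y - y.mean(), rcond=None)
print(np.allclose(q_lstsq, q))
```

Because the score columns do not overlap, each coefficient can be estimated from its own component without regard to the others, which is what makes the score-based regression stable when the original terms are highly correlated.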

X-variance

The x-variance is the amount of variance in the terms that is explained by the model. The x-variance value is between 0 and 1.

The closer the x-variance value is to 1, the better the components represent the original set of terms. If you have more than 1 response, the x-variance value is the same for all responses.
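One plausible way to compute such a proportion is one minus the ratio of the residual sum of squares to the total sum of squares of the centered terms. The sketch below uses that definition as an assumption, since the exact formula behind the table is not stated here. The helper `pls1_nipals`, the function `x_variance`, and the simulated data are hypothetical.

```python
import numpy as np

def pls1_nipals(X, y, m):
    """Hypothetical helper: NIPALS PLS for one response (assumed variant)."""
    E = X - X.mean(axis=0)
    f = y - y.mean()
    n, p = X.shape
    T, P, W, q = np.zeros((n, m)), np.zeros((p, m)), np.zeros((p, m)), np.zeros(m)
    for a in range(m):
        w = E.T @ f
        w /= np.linalg.norm(w)
        t = E @ w
        tt = t @ t
        P[:, a] = E.T @ t / tt
        q[a] = f @ t / tt
        T[:, a], W[:, a] = t, w
        E = E - np.outer(t, P[:, a])
        f = f - q[a] * t
    return T, P, W, q

def x_variance(X, y, m):
    """Assumed definition: share of centered-X sum of squares explained."""
    T, P, _, _ = pls1_nipals(X, y, m)
    Xc = X - X.mean(axis=0)
    resid = Xc - T @ P.T
    return 1.0 - (resid ** 2).sum() / (Xc ** 2).sum()

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + 0.1 * rng.normal(size=20)

# Cumulative x-variance for 1, 2, 3, and 4 components.
xvar = [x_variance(X, y, m) for m in range(1, 5)]
print(xvar)
```

Under this definition the value cannot decrease as components are added, and it reaches 1 when the number of components equals the number of terms.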

X-weights

The x-weights describe the covariance between the predictors and responses. In the algorithm, the x-weights are used to ensure that the x-scores are orthogonal, or unrelated to each other. The x-weights, which are used to calculate the x-scores, form a p x m matrix, where p = number of terms and m = number of components.
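The sketch below illustrates both roles of the x-weights for a single response: the first weight vector points in the direction of the covariance between each term and the response, and the weights, combined with the x-loadings, map the centered predictors to the x-scores. It assumes a common NIPALS variant with unit-length weights. The helper `pls1_nipals`, the matrix name `R`, and the simulated data are hypothetical.

```python
import numpy as np

def pls1_nipals(X, y, m):
    """Hypothetical helper: NIPALS PLS for one response (assumed variant)."""
    E = X - X.mean(axis=0)
    f = y - y.mean()
    n, p = X.shape
    T, P, W, q = np.zeros((n, m)), np.zeros((p, m)), np.zeros((p, m)), np.zeros(m)
    for a in range(m):
        w = E.T @ f
        w /= np.linalg.norm(w)
        t = E @ w
        tt = t @ t
        P[:, a] = E.T @ t / tt
        q[a] = f @ t / tt
        T[:, a], W[:, a] = t, w
        E = E - np.outer(t, P[:, a])
        f = f - q[a] * t
    return T, P, W, q

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + 0.1 * rng.normal(size=20)

T, P, W, q = pls1_nipals(X, y, 3)   # W is p x m
Xc = X - X.mean(axis=0)
yc = y - y.mean()

# Sample covariance of each term with the response, scaled to unit length.
cov_xy = Xc.T @ yc / (len(y) - 1)
w1_from_cov = cov_xy / np.linalg.norm(cov_xy)
print(np.allclose(W[:, 0], w1_from_cov))

# Weights adjusted by the loadings give the scores from the centered terms.
R = W @ np.linalg.inv(P.T @ W)
T_direct = Xc @ R
print(np.allclose(T_direct, T))
```

Later weight vectors are computed from the deflated predictors rather than the original ones, which is why the adjustment by the loadings is needed to express every score in terms of the original terms.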

Y-calculated values

The y-calculated values are linear combinations of the x-scores. The y-calculated values contain the variance in the responses explained by the PLS regression model. Observations with relatively small y-calculated values are outliers in the y-space and are not well explained by the model.

The y-calculated matrix, like the original y-matrix, is an n x r matrix, where n = number of observations and r = number of responses. The y-calculated values are on the same scale as the responses.

Y-loadings

The y-loadings are the linear coefficients that link the responses to the y-scores. The y-loading values denote the importance of the corresponding response to the mth component. Y-loadings form an r x m matrix, where r = number of responses and m = number of components.

Y-residuals

The y-residuals contain the remaining variance in the responses not explained by the PLS regression model. Observations with relatively large y-residuals are outliers in the y-space, indicating that they are not well explained by the model.

The y-residuals are the differences between the actual response values and y-calculated values, and are on the same scale as the original responses. The y-residual matrix, similar to the original y-matrix, is an n x r matrix, where n = number of observations and r = number of responses.
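For a single response, the y-calculated values and y-residuals can be sketched as follows. The example assumes a common NIPALS variant in which the y-calculated values are the x-scores multiplied by the fitted coefficients, shifted back to the response scale. The helper `pls1_nipals` and the simulated data are hypothetical. With one response, the n x r matrices reduce to vectors of length n.

```python
import numpy as np

def pls1_nipals(X, y, m):
    """Hypothetical helper: NIPALS PLS for one response (assumed variant)."""
    E = X - X.mean(axis=0)
    f = y - y.mean()
    n, p = X.shape
    T, P, W, q = np.zeros((n, m)), np.zeros((p, m)), np.zeros((p, m)), np.zeros(m)
    for a in range(m):
        w = E.T @ f
        w /= np.linalg.norm(w)
        t = E @ w
        tt = t @ t
        P[:, a] = E.T @ t / tt
        q[a] = f @ t / tt
        T[:, a], W[:, a] = t, w
        E = E - np.outer(t, P[:, a])
        f = f - q[a] * t
    return T, P, W, q

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + 0.1 * rng.normal(size=20)

T, P, W, q = pls1_nipals(X, y, 2)

y_calc = T @ q + y.mean()           # linear combination of the x-scores
y_resid = y - y_calc                # same scale as the original response

# Share of response variation captured by the two components.
r2 = 1.0 - (y_resid ** 2).sum() / ((y - y.mean()) ** 2).sum()
print(y_calc.shape, y_resid.shape)
print(r2)
```

Sorting the absolute values of `y_resid` in descending order is one simple way to list the observations that the model explains least well in the y-space.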

Y-scores

The y-scores are linear combinations of the response variables. The y-scores form an n x m matrix, where n = number of observations and m = number of components. The y-scores are projections of the observations on the PLS regression components.