A wine producer wants to know how the chemical composition of his wine relates to sensory evaluations. He has 37 Pinot Noir samples, each described by 17 elemental concentrations (Cd, Mo, Mn, Ni, Cu, Al, Ba, Cr, Sr, Pb, B, Mg, Si, Na, Ca, P, K) and by an aroma score from a panel of judges. He wants to predict the aroma score from the 17 elements. The data are from: I.E. Frank and B.R. Kowalski (1984). "Prediction of Wine Quality and Geographic Origin from Chemical Measurements by Partial Least-Squares Regression Modeling," Analytica Chimica Acta, 162, 241-251.
The producer wants the model to include all 17 concentrations and all the 2-way interactions that involve cadmium (Cd). That gives 33 terms (17 main effects plus 16 interactions with Cd) for only 37 samples. Because the ratio of samples to predictors is low, the producer decides to use partial least squares regression.
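The term list described above can be sketched in a few lines of Python. This is only an illustration of how the main effects and the Cd interactions are enumerated; the helper names build_terms and expand_row are hypothetical, and the sketch does not reproduce any scaling or standardization that the software may apply.

```python
# Element names as listed in the example.
ELEMENTS = ["Cd", "Mo", "Mn", "Ni", "Cu", "Al", "Ba", "Cr", "Sr",
            "Pb", "B", "Mg", "Si", "Na", "Ca", "P", "K"]


def build_terms(elements, anchor="Cd"):
    """Return main-effect terms plus every 2-way interaction with `anchor`.

    Hypothetical helper: the name and the "A*B" label format are assumptions.
    """
    interactions = [f"{anchor}*{name}" for name in elements if name != anchor]
    return list(elements) + interactions


def expand_row(row, anchor="Cd"):
    """Add the anchor interaction products to one sample.

    `row` maps element name -> concentration; the result also contains
    "Cd*X" entries equal to the product of the two concentrations.
    """
    values = dict(row)
    for name, conc in row.items():
        if name != anchor:
            values[f"{anchor}*{name}"] = row[anchor] * conc
    return values


terms = build_terms(ELEMENTS)
print(len(terms))  # 17 main effects + 16 interactions
```

With 37 samples and this many terms, an ordinary least squares fit would leave very few degrees of freedom for error, which is the motivation for reducing the predictors to a handful of components.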
The model selection plot identifies the 4-component model as optimal because it has the highest predicted R² value among the models considered. The predicted R² values on the plot are calculated with cross-validation. The model selection and validation table shows that the predicted R² value for the optimal model is approximately 0.56. Minitab uses the optimal model for the analysis of variance calculations. The optimal model is statistically significant at the 0.05 level of significance because its p-value rounds to 0.000, which is less than 0.05.
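The selection rule can be illustrated with a short Python sketch. It assumes the usual definition of predicted R² as 1 - PRESS/SS_total, where PRESS is the sum of squared errors of the cross-validated predictions; the function names and the toy numbers are hypothetical and are not the wine data.

```python
def predicted_r2(y, y_cv):
    """Predicted R-squared = 1 - PRESS / total sum of squares.

    `y_cv` holds cross-validated predictions, i.e. each value was predicted
    by a model fitted without that observation. (Assumed definition.)
    """
    mean_y = sum(y) / len(y)
    press = sum((obs - pred) ** 2 for obs, pred in zip(y, y_cv))
    ss_total = sum((obs - mean_y) ** 2 for obs in y)
    return 1.0 - press / ss_total


def select_components(y, cv_predictions):
    """Pick the component count with the highest predicted R-squared.

    `cv_predictions` maps number of components -> cross-validated predictions.
    """
    scores = {k: predicted_r2(y, preds) for k, preds in cv_predictions.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Toy values for illustration only (not from the wine example).
y_toy = [1.0, 2.0, 3.0, 4.0, 5.0]
cv_toy = {
    1: [2.0, 2.0, 3.0, 4.0, 4.0],
    2: [1.0, 2.0, 3.0, 4.0, 4.0],
    3: [0.0, 2.0, 3.0, 4.0, 6.0],
}
best_k, r2_by_k = select_components(y_toy, cv_toy)
print(best_k, r2_by_k)
```

Because the predictions come from held-out samples, predicted R² can fall as components are added, which is why the maximum marks the point where extra components begin to overfit.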