This command is available with the Predictive Analytics Module.

The analysis builds as many basis functions as you specify. Each basis function makes a small modification to the model. If the analysis includes a validation method, the analysis calculates the value of the model selection criterion on both the training data and the test data for each number of basis functions. The optimal value on the test data determines the number of basis functions in the optimal model.

Optimization criteria, such as the maximum R^{2}, tend to be optimistic when
you calculate them with the same data that you use to fit a model. Model validation
methods leave a portion of the data out of the model fitting process, then calculate
statistics that evaluate the performance of the model on the omitted data. Model
validation techniques provide a better estimate of how well models perform on new
data. Depending on your selection of the loss function for the analysis, the
criterion is the maximum R^{2} or the least Mean Absolute Deviation (MAD).
Minitab offers two validation methods: k-fold cross-validation and validation with a
separate test set.

K-fold cross-validation is the default method in Minitab when the data have 2,000 or fewer cases. Because the model-building process repeats K times, cross-validation is usually slower than validation with test data.

To complete K-fold cross-validation, Minitab Statistical Software uses the
following steps.

1. Partition the data into *K* random subsets that are as close to equal in size as possible. The subsets are called folds.
2. For fold *k*, *k* = 1, ..., *K*, add basis functions using the remaining *K* – 1 folds of data. Calculate the value of the model selection criterion for the model with the data in the *k*th fold.
3. Repeat step 2 for all *K* folds.
4. Average the values of the model selection criterion across the *K* folds for each number of functions. The number of functions with the best average value makes the optimal model.
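
The steps above can be sketched in Python. This is only an illustration under stated assumptions, not Minitab's implementation: the helper `fit_bin_means` is a hypothetical stand-in for "a model with m basis functions" (it predicts the training mean in each of m equal-width bins rather than building MARS basis functions), the criterion is the maximum R^{2}, and the names `make_folds` and `k_fold_select` and the simulated data are invented for the example.

```python
import random


def fit_bin_means(xs, ys, m, lo, hi):
    """Hypothetical stand-in for a model with m basis functions:
    predict the training mean of y within each of m equal-width bins."""
    width = (hi - lo) or 1.0
    overall = sum(ys) / len(ys)
    sums, counts = [0.0] * m, [0] * m

    def bin_of(x):
        return min(m - 1, max(0, int((x - lo) / width * m)))

    for x, y in zip(xs, ys):
        sums[bin_of(x)] += y
        counts[bin_of(x)] += 1
    # Bins with no training cases fall back to the overall training mean.
    means = [sums[b] / counts[b] if counts[b] else overall for b in range(m)]
    return lambda x: means[bin_of(x)]


def r_squared(ys, preds):
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def make_folds(n, k, seed=0):
    """Step 1: shuffle the case indices and deal them into k folds
    whose sizes differ by at most one."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def k_fold_select(xs, ys, max_functions, k, seed=0):
    folds = make_folds(len(xs), k, seed)
    lo, hi = min(xs), max(xs)
    avg_r2 = {}
    for m in range(1, max_functions + 1):
        scores = []
        for fold in folds:  # Steps 2 and 3: each fold is held out once.
            held_out = set(fold)
            train = [i for i in range(len(xs)) if i not in held_out]
            model = fit_bin_means([xs[i] for i in train],
                                  [ys[i] for i in train], m, lo, hi)
            scores.append(r_squared([ys[i] for i in fold],
                                    [model(xs[i]) for i in fold]))
        avg_r2[m] = sum(scores) / k  # Step 4: average across the folds.
    best_m = max(avg_r2, key=avg_r2.get)
    return best_m, avg_r2


# Simulated data, invented for the example.
rng = random.Random(1)
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [abs(x - 5) + rng.gauss(0, 0.5) for x in xs]

best_m, avg_r2 = k_fold_select(xs, ys, max_functions=15, k=5)
print("number of functions with the best average R-squared:", best_m)
```

Every candidate size is evaluated on the same folds, so the averages are directly comparable across sizes.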

In validation with a test set, a portion of the data is set aside for validation. The remaining data is the training set. First, Minitab adds basis functions with the training set. Then, Minitab calculates the values of the model selection criterion for each number of functions using the test set. The number of functions with the best value makes the optimal model.
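
A minimal sketch of validation with a test set follows, this time with the least Mean Absolute Deviation (MAD) as the criterion. It is an assumption-laden illustration, not Minitab's procedure: `fit_bin_means` is a hypothetical stand-in for a model with m basis functions, and `select_with_test_set`, the 30% test fraction, and the simulated data are invented for the example.

```python
import random


def fit_bin_means(xs, ys, m, lo, hi):
    """Hypothetical stand-in for a model with m basis functions:
    predict the training mean of y within each of m equal-width bins."""
    width = (hi - lo) or 1.0
    overall = sum(ys) / len(ys)
    sums, counts = [0.0] * m, [0] * m

    def bin_of(x):
        return min(m - 1, max(0, int((x - lo) / width * m)))

    for x, y in zip(xs, ys):
        sums[bin_of(x)] += y
        counts[bin_of(x)] += 1
    # Bins with no training cases fall back to the overall training mean.
    means = [sums[b] / counts[b] if counts[b] else overall for b in range(m)]
    return lambda x: means[bin_of(x)]


def mad(ys, preds):
    """Mean absolute deviation between observed and predicted values."""
    return sum(abs(y - p) for y, p in zip(ys, preds)) / len(ys)


def select_with_test_set(xs, ys, max_functions, test_fraction=0.3, seed=0):
    # Set aside a random portion of the cases as the test set.
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    n_test = max(1, round(test_fraction * len(idx)))
    test, train = idx[:n_test], idx[n_test:]

    lo, hi = min(xs), max(xs)
    train_mad, test_mad = {}, {}
    for m in range(1, max_functions + 1):
        # Fit with the training set only.
        model = fit_bin_means([xs[i] for i in train],
                              [ys[i] for i in train], m, lo, hi)
        train_mad[m] = mad([ys[i] for i in train], [model(xs[i]) for i in train])
        # Evaluate the criterion with the test set.
        test_mad[m] = mad([ys[i] for i in test], [model(xs[i]) for i in test])

    best_m = min(test_mad, key=test_mad.get)  # least MAD on the test set
    return best_m, train_mad, test_mad


# Simulated data, invented for the example.
rng = random.Random(2)
xs = [rng.uniform(0, 10) for _ in range(300)]
ys = [abs(x - 5) + rng.gauss(0, 0.5) for x in xs]

best_m, train_mad, test_mad = select_with_test_set(xs, ys, max_functions=15)
print("number of functions with the least test MAD:", best_m)
```

Each candidate model is fit only once here, which is why this approach is usually faster than K-fold cross-validation.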

Without any validation, Minitab uses the entire data set to fit the model. The final model usually contains the largest number of basis functions.
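
The following sketch illustrates why a criterion computed on the fitting data tends to favor the largest model. It rests on assumptions that are not part of Minitab: the models are a hypothetical nested family (bin means over 1, 2, 4, ... equal-width bins) standing in for models with more and more basis functions, and the data are simulated. Because each larger model in this family refines the previous one, the training R^{2} cannot decrease as the model grows.

```python
import random


def bin_index(x, m, lo, hi):
    width = (hi - lo) or 1.0
    return min(m - 1, max(0, int((x - lo) / width * m)))


def training_r_squared(xs, ys, m):
    """Fit bin means with m bins and score them on the same data.
    Hypothetical stand-in for a model with m basis functions."""
    lo, hi = min(xs), max(xs)
    sums, counts = [0.0] * m, [0] * m
    for x, y in zip(xs, ys):
        b = bin_index(x, m, lo, hi)
        sums[b] += y
        counts[b] += 1
    preds = [sums[bin_index(x, m, lo, hi)] / counts[bin_index(x, m, lo, hi)]
             for x in xs]
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Simulated data, invented for the example.
rng = random.Random(3)
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [abs(x - 5) + rng.gauss(0, 0.5) for x in xs]

# Doubling the number of bins keeps the models nested.
sizes = [1, 2, 4, 8, 16, 32]
train_r2 = {m: training_r_squared(xs, ys, m) for m in sizes}
chosen = max(train_r2, key=train_r2.get)

for m in sizes:
    print(m, round(train_r2[m], 3))
print("size chosen without validation:", chosen)
```

The printed R^{2} values keep rising with model size even when the extra bins only follow noise, which is the optimism that a validation method is meant to correct.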