When you use Forward selection with
validation as the stepwise procedure, Minitab provides a plot of the
R^{2} statistic for the training data set and either the test R^{2}
statistic or the k-fold stepwise R^{2} statistic for each step in the model
selection procedure. The display of the test R^{2} statistic or the k-fold
stepwise R^{2} statistic depends on whether you use a test data set or k-fold
cross-validation.

Use the
plot to compare the values of the different R^{2} statistics at each step.
Typically, the model performs well when the R^{2} statistics are both large.
Minitab displays regression statistics for the model from the step that maximizes
either the test R^{2} statistic or the k-fold stepwise R ^{2}
statistic. The plot shows whether any simpler models fit well enough that they can
also be good candidates.

In a case where the model is overfit, the test
R^{2} statistic or the k-fold stepwise R^{2} statistic starts to
decrease as terms enter the model. This decrease happens while the corresponding
training R^{2} statistic or R^{2} statistic for all the data
continues to increase. An over-fit model occurs when you add terms for effects that
are not important in the population. An overfit model may not be useful for making
predictions about the population. If a model is overfit, you can consider models
from earlier steps.

The following plot shows test R^{2} as an example.
Initially, the R^{2} statistics are both close to 70%. For the first few
steps, the R^{2} statistics both tend to increase as terms enter the model.
At step 6, the test R^{2} statistic is about 88%. The maximum value of the
test R^{2} statistic is at step 14 and has a value close to 90%. You can
consider whether the improvement in the fit justifies the additional complexity from
adding more terms to the model.

After step 14, while the R^{2}
continues to increase, the test R^{2} does not. The decrease in the test
R^{2} after step 14 indicates that the model is overfit.