Methods and formulas for stepwise in Fit Regression Model


Stepwise method

Performs variable selection by adding or deleting predictors from the existing model based on the F-test. Stepwise is a combination of forward selection and backward elimination procedures. Stepwise selection does not proceed if the initial model uses all of the degrees of freedom.

Variables to remove

Minitab calculates an F-statistic and p-value for each variable in the model. If the model contains j variables, then the F-statistic for any variable, $x_r$, is:

$$F = \frac{SSE(j - x_r) - SSE_j}{MSE_j}$$

Notation

Term              Description
$SSE(j - x_r)$    SS Error for the model that does not contain $x_r$
$SSE_j$           SS Error for the model that contains $x_r$
$MSE_j$           MS Error for the model that contains $x_r$

If the p-value for any variable is greater than the value specified in Alpha to remove, then Minitab removes the variable with the largest p-value from the model, calculates the regression equation, displays the results, and initiates the next step.
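
The removal statistic can be illustrated with a few lines of Python. This is a minimal sketch, not Minitab's implementation: it assumes the usual partial F-test form (the increase in SS Error when the variable is dropped, divided by the MS Error of the model that contains it) with one numerator degree of freedom. The function name and the numbers are hypothetical.

```python
def f_to_remove(sse_without, sse_with, mse_with):
    """Partial F-statistic for dropping one variable x_r from a j-variable model.

    sse_without: SS Error for the model that does not contain x_r
    sse_with:    SS Error for the model that contains x_r
    mse_with:    MS Error for the model that contains x_r
    Assumption: x_r uses a single degree of freedom.
    """
    if mse_with <= 0:
        raise ValueError("MS Error must be positive")
    return (sse_without - sse_with) / mse_with


# Hypothetical numbers: dropping x_r raises SS Error from 40.0 to 52.0,
# and the model that contains x_r has MS Error 2.0.
f_stat = f_to_remove(sse_without=52.0, sse_with=40.0, mse_with=2.0)
print(f_stat)  # 6.0
```

The p-value that is compared with Alpha to remove comes from the F-distribution for this statistic; that lookup is left out of the sketch.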

Variables to add

If Minitab cannot remove a variable, the procedure attempts to add a variable. Minitab calculates an F-statistic and p-value for each variable that is not in the model. If the model contains j variables, then the F-statistic for any variable, $x_a$, is:

$$F = \frac{\left[ SSE_j - SSE(j + x_a) \right] / DF_{x_a}}{MSE(j + x_a)}$$

Notation

Term              Description
$SSE_j$           SS Error before $x_a$ is added to the model
$SSE(j + x_a)$    SS Error after $x_a$ is added to the model
$DF_{x_a}$        Degrees of freedom for variable $x_a$
$MSE(j + x_a)$    MS Error after $x_a$ is added to the model

If the p-value corresponding to the F-statistic for any variable is smaller than the value specified in Alpha to enter, Minitab adds the variable with the smallest p-value to the model, calculates the regression equation, displays the results, then goes to a new step. When no more variables can be entered into or removed from the model, the stepwise procedure ends.
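
The decision made at each step (try to remove first, then try to add, otherwise stop) can be sketched as follows. This is an illustrative outline only: the function names are hypothetical, the p-values are assumed to be supplied by whatever routine fits the models, and the entry statistic assumes the usual partial F-test form built from the quantities in the notation above.

```python
def f_to_enter(sse_before, sse_after, df_term, mse_after):
    """Partial F-statistic for adding x_a (assumed partial F-test form)."""
    return (sse_before - sse_after) / df_term / mse_after


def next_action(p_in_model, p_candidates, alpha_remove, alpha_enter):
    """Choose one stepwise action from p-values that are already computed.

    p_in_model:   {variable: p-value} for variables currently in the model
    p_candidates: {variable: p-value} for variables not in the model
    Returns ("remove", var), ("add", var), or ("stop", None).
    """
    # Removal is tried first: drop the variable with the largest p-value
    # if that p-value is greater than Alpha to remove.
    if p_in_model:
        worst = max(p_in_model, key=p_in_model.get)
        if p_in_model[worst] > alpha_remove:
            return ("remove", worst)
    # Otherwise add the candidate with the smallest p-value
    # if that p-value is smaller than Alpha to enter.
    if p_candidates:
        best = min(p_candidates, key=p_candidates.get)
        if p_candidates[best] < alpha_enter:
            return ("add", best)
    # Nothing can be removed or entered: the procedure ends.
    return ("stop", None)


# Hypothetical p-values and alpha values.
print(f_to_enter(sse_before=40.0, sse_after=30.0, df_term=1, mse_after=2.5))  # 4.0
print(next_action({"x1": 0.01, "x2": 0.30}, {"x3": 0.02}, 0.15, 0.15))
print(next_action({"x1": 0.01}, {"x3": 0.02, "x4": 0.10}, 0.15, 0.15))
```

Calling `next_action` repeatedly, and refitting after each returned action, reproduces the loop described above until it returns `("stop", None)`.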

Forward selection procedure

A method for determining which terms to retain in a model. Forward selection adds variables to the model using the same method as the stepwise procedure. Once added, a variable is never removed. The default forward selection procedure ends when none of the candidate variables have a p-value smaller than the value specified in Alpha to enter.
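
As a rough sketch of the default forward selection loop, assuming a hypothetical callable `p_values_to_enter(model, remaining)` that returns the entry p-value for every remaining candidate given the current model:

```python
def forward_selection(candidates, p_values_to_enter, alpha_enter):
    """Add one variable at a time; a variable that enters is never removed."""
    model = []
    remaining = list(candidates)
    while remaining:
        p = p_values_to_enter(model, remaining)
        best = min(remaining, key=lambda term: p[term])
        # Stop when no candidate has a p-value smaller than Alpha to enter.
        if p[best] >= alpha_enter:
            break
        model.append(best)
        remaining.remove(best)
    return model


# Toy stand-in for a real fitting routine: fixed, hypothetical p-values.
# In practice the p-values change every time the model changes.
fixed_p = {"a": 0.01, "b": 0.20, "c": 0.05}
selected = forward_selection(["a", "b", "c"], lambda model, rest: fixed_p, 0.15)
print(selected)  # ['a', 'c']
```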

Backward elimination procedure

A method for determining which variables to retain in a model. Backward elimination starts with the model that contains all the terms and then removes terms, one at a time, using the same method as the stepwise procedure. No variable can re-enter the model. The default backward elimination procedure ends when none of the variables included in the model have a p-value greater than the value specified in Alpha to remove. Backward elimination does not proceed if the initial model uses all of the degrees of freedom.
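
The backward elimination loop, including the check on degrees of freedom, might be sketched like this. The callable `p_values_in_model(model)` is hypothetical, and the degrees-of-freedom check assumes an intercept plus one degree of freedom per term, which is a simplification.

```python
def backward_elimination(terms, p_values_in_model, alpha_remove, n_obs):
    """Start from the full model and drop one term at a time."""
    # Assumption: an intercept plus 1 degree of freedom per term.
    error_df = n_obs - 1 - len(terms)
    if error_df <= 0:
        raise ValueError("initial model uses all of the degrees of freedom")
    model = list(terms)
    while model:
        p = p_values_in_model(model)
        worst = max(model, key=lambda term: p[term])
        # Stop when no term has a p-value greater than Alpha to remove.
        if p[worst] <= alpha_remove:
            break
        model.remove(worst)  # a removed term never re-enters
    return model


# Toy stand-in for a real fitting routine: fixed, hypothetical p-values.
fixed_p = {"a": 0.01, "b": 0.40, "c": 0.20}
kept = backward_elimination(["a", "b", "c"], lambda model: fixed_p, 0.15, n_obs=20)
print(kept)  # ['a']
```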

Forward information criteria procedure

A method for determining which variables to retain in a model. At each step, the forward information criteria procedure adds the term with the lowest p-value to the model. More than one term can enter the model in a single step if the settings for the analysis allow consideration of non-hierarchical terms but require each model to be hierarchical. Minitab calculates the information criteria for each step and displays the results of the analysis for the model with the minimum value of the selected information criterion, either AICc or BIC. In most cases, the procedure continues until one of the following conditions occurs:
  • The procedure does not find an improvement in the criterion for 8 consecutive steps.
  • The procedure fits the full model.
  • The procedure fits a model that leaves 1 degree of freedom for error.
If you specify settings for the procedure that require a hierarchical model at each step and allow only one term to enter at a time, then the procedure continues until it either fits the full model or fits a model that leaves 1 degree of freedom for error. Minitab displays the results of the analysis for the model with the minimum value of the selected information criterion, either AICc or BIC.
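
The stopping rule and the choice of the displayed model can be illustrated with a short sketch. It assumes the criterion values (AICc or BIC) for each step are already available in a list that ends at the full model or at the model that leaves 1 degree of freedom for error; the function name and the values are hypothetical.

```python
def select_by_criterion(criterion_by_step, patience=8):
    """Return (best_step, last_step_fitted) for a lower-is-better criterion.

    The scan stops early after `patience` consecutive steps
    without an improvement on the best value seen so far.
    """
    if not criterion_by_step:
        raise ValueError("need at least one step")
    best_step = 0
    last_step = 0
    without_improvement = 0
    for step, value in enumerate(criterion_by_step):
        last_step = step
        if step == 0 or value < criterion_by_step[best_step]:
            best_step = step
            without_improvement = 0
        else:
            without_improvement += 1
            if without_improvement >= patience:
                break
    return best_step, last_step


# Hypothetical criterion values: the minimum so far is at step 2, and the
# eighth consecutive step without improvement is step 10, so step 11 is never fit.
values = [100, 90, 85, 86, 87, 88, 89, 90, 91, 92, 93, 50]
print(select_by_criterion(values))  # (2, 10)
```

Setting `patience` to the length of the list mimics the case where the procedure always runs to the full model or to 1 degree of freedom for error.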

Forward selection with validation

The forward selection with validation procedure depends on the validation method.

Test data set

When you use a test data set, the procedure is similar to forward selection. At each step, Minitab adds the term with the smallest p-value to the model. At the end of each step, Minitab calculates the test R² value. At the end of the forward selection procedure, the model with the greatest test R² value is the final model.

The procedure adds terms until one of the following conditions occurs:
  • The procedure does not find an improvement in the criterion for 8 consecutive steps.
  • The procedure fits the full model.
  • The procedure fits a model that leaves 1 degree of freedom for error.
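
A small sketch of how the final model could be picked from the test R² values. The R² definition used here (1 minus the test SS Error over the total sum of squares around the mean of the test responses) is an assumption for illustration and may differ from Minitab's exact formula; the function names and data are hypothetical.

```python
def holdout_r_squared(y_test, y_pred):
    """R-squared on a test set (assumed definition, see lead-in)."""
    mean_y = sum(y_test) / len(y_test)
    sse = sum((y - p) ** 2 for y, p in zip(y_test, y_pred))
    sst = sum((y - mean_y) ** 2 for y in y_test)
    return 1 - sse / sst


def best_step(r2_by_step):
    """Index of the step with the greatest test R-squared (first one on ties)."""
    return max(range(len(r2_by_step)), key=lambda step: r2_by_step[step])


# Hypothetical test responses and predictions.
y_test = [1.0, 2.0, 3.0, 4.0]
print(holdout_r_squared(y_test, [1.0, 2.0, 3.0, 4.0]))  # 1.0 (perfect predictions)
print(holdout_r_squared(y_test, [2.5, 2.5, 2.5, 2.5]))  # 0.0 (predicting the mean)
print(best_step([0.20, 0.60, 0.55]))  # 1
```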

K-fold cross-validation

With cross-validation, the procedure repeats forward selection for each fold. On the first fold, forward selection continues until 16 steps occur without an improvement in the sum of squares for error. For each remaining fold, forward selection continues until the procedure reaches the minimum of the following numbers:
  • The number of steps from a previous fold
  • 16 steps without an improvement in the sum of squares for error
  • The number of steps to fit the full model
  • The number of steps to fit a model that leaves 1 degree of freedom for error

Once the forward selection procedures are complete for each fold, Minitab calculates the overall k-fold stepwise R² value for each step that occurs in the selection procedure for every fold. The step with the maximum k-fold stepwise R² value determines which step of the final forward selection procedure provides the chosen model.

Last, Minitab performs forward selection on the full dataset. Minitab displays regression results for the model at the step with the maximum overall k-fold stepwise R² value from the k-fold stepwise procedures. The table of model selection details and the graph of the k-fold stepwise R² versus model selection step continue for 8 steps past the step for the regression results.
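
The way the per-fold results combine into one chosen step can be sketched as below. The pooling formula (1 minus the summed holdout SS Error over the summed holdout total sum of squares) is an assumption made for illustration, not a statement of Minitab's exact calculation; the function names and numbers are hypothetical.

```python
def kfold_stepwise_r2(fold_sse_by_step, fold_sst):
    """Overall R-squared per step, using only steps present in every fold.

    fold_sse_by_step: one list per fold with the holdout SS Error at each step
    fold_sst:         holdout total sum of squares for each fold
    """
    n_steps = min(len(sse) for sse in fold_sse_by_step)
    total_sst = sum(fold_sst)
    return [
        1 - sum(sse[step] for sse in fold_sse_by_step) / total_sst
        for step in range(n_steps)
    ]


def chosen_step(r2_by_step):
    """Step with the maximum overall k-fold stepwise R-squared."""
    return max(range(len(r2_by_step)), key=lambda step: r2_by_step[step])


# Two hypothetical folds; the second fold stopped one step earlier,
# so only steps 0 to 2 are compared.
r2 = kfold_stepwise_r2([[10.0, 6.0, 4.0, 5.0], [12.0, 6.0, 5.0]], [20.0, 20.0])
print(len(r2), chosen_step(r2))  # 3 2
```

The chosen step index is then applied to the forward selection run on the full dataset to identify the model whose regression results are displayed.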