Perform stepwise regression for Fit Regression Model and Linear Regression

Stat > Regression > Regression > Fit Regression Model > Stepwise

Predictive Analytics Module > Linear Regression > Stepwise

In This Topic

Method
Potential terms
Alpha to enter and remove
Criterion
Specify validation for Forward selection with validation
Hierarchy
Display the table of model selection details
Display the graph of R-squared vs step

Method

Stepwise procedures add and remove terms to identify a useful subset of the terms. If you choose a stepwise procedure, the terms that you specify in the Model dialog box are candidates for the final model. For more information, go to Using stepwise regression and best subsets regression.

Specify the method that Minitab uses to fit the model.

None: Fit the model with all of the terms that you specify in the Model dialog box.
Stepwise: This method starts with an empty model or with the terms that you specify to include in the initial model or in every model. At each step, Minitab adds or removes a term. Minitab stops when all variables not in the model have p-values that are greater than the specified Alpha to enter value and when all variables in the model have p-values that are less than or equal to the specified Alpha to remove value.
Forward selection: This method starts with an empty model or with the terms that you specify to include in the initial model or in every model. At each step, Minitab adds the most significant term. Minitab stops when all variables not in the model have p-values that are greater than the specified Alpha to enter value.
Backward elimination: This method starts with all potential terms in the model and removes the least significant term for each step. Minitab stops when all variables in the model have p-values that are less than or equal to the specified Alpha to remove value.
Forward information criteria: The forward information criteria procedure adds the term with the lowest p-value to the model at each step. Additional terms can enter the model in 1 step if the settings for the analysis allow consideration of non-hierarchical terms but require each model to be hierarchical. Minitab calculates the information criteria for each step. In most cases, the procedure continues until one of the following conditions occurs:
- The procedure does not find an improvement of the criterion for 8 consecutive steps.
- The procedure fits the full model.
- The procedure fits a model that leaves 1 degree of freedom for error.
If you specify settings for the procedure that require a hierarchical model at each step and allow only one term to enter at a time, then the procedure continues until it either fits the full model or fits a model that leaves 1 degree of freedom for error. Minitab displays the results of the analysis for the model with the minimum value of the selected information criterion, either AICc or BIC.
Forward selection with validation: The forward selection with validation procedure depends on the validation method. When you use a test data set, the procedure is similar to forward selection. At the end of each step, Minitab calculates the test R² statistic. At the end of the forward selection procedure, the model with the greatest test R² value is the final model.
With cross-validation, the procedure repeats forward selection on each fold. The procedure evaluates all the folds at each step and identifies the step with the best k-fold stepwise R² value. The last part of the procedure is to perform forward selection on the full dataset, stopping at the best step from the selections on the folds.

For both types of validation, the procedure stops under the same conditions as the forward information criteria procedure.
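
As a rough illustration of the forward information criteria procedure described above, the following Python sketch adds one term per step, records BIC at each step, and stops under the three listed conditions. It is not Minitab's implementation. The names `rss`, `bic`, and `forward_ic` are hypothetical. The sketch assumes ordinary least squares with single-degree-of-freedom terms (in that case the term with the lowest p-value is the term that reduces the residual sum of squares the most), ignores hierarchy, and counts parameters as coefficients plus the error variance, which is also an assumption.

```python
import math

def rss(columns, y):
    """Residual sum of squares for a least-squares fit with an intercept."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting (assumes no collinearity).
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(p):
            if r != c:
                factor = A[r][c] / A[c][c]
                A[r] = [a - factor * b for a, b in zip(A[r], A[c])]
    beta = [A[r][p] / A[r][r] for r in range(p)]
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

def bic(rss_value, n, n_terms):
    """Gaussian BIC up to an additive constant."""
    k = n_terms + 2  # assumed count: slopes + intercept + error variance
    return n * math.log(rss_value / n) + k * math.log(n)

def forward_ic(candidates, y, patience=8):
    """Return (criterion value, terms) for the step with the minimum BIC."""
    n = len(y)
    selected, remaining = [], sorted(candidates)
    path = [(bic(rss([], y), n, 0), [])]
    best, stalled = path[0][0], 0
    # Stop after `patience` steps without improvement, when the full model
    # has been fit, or when the model leaves 1 degree of freedom for error.
    while remaining and stalled < patience and n - (len(selected) + 1) > 1:
        def rss_with(name):
            return rss([candidates[t] for t in selected + [name]], y)
        term = min(remaining, key=rss_with)
        remaining.remove(term)
        selected.append(term)
        value = bic(rss([candidates[t] for t in selected], y), n, len(selected))
        if value < best:
            best, stalled = value, 0
        else:
            stalled += 1
        path.append((value, list(selected)))
    return min(path, key=lambda step: step[0])

# Made-up data: a 2^3 factorial with noise chosen to be orthogonal to the
# predictors, so the result is easy to check by hand.
X = {
    "x1": [-1, 1, -1, 1, -1, 1, -1, 1],
    "x2": [-1, -1, 1, 1, -1, -1, 1, 1],
    "x3": [-1, -1, -1, -1, 1, 1, 1, 1],
}
noise = [0.1, -0.2, 0.15, -0.05, -0.1, 0.2, -0.15, 0.05]
y = [10 + 4 * a + 2 * b + e for a, b, e in zip(X["x1"], X["x2"], noise)]
best_bic, best_terms = forward_ic(X, y)
print(best_terms)
```

In this example the procedure fits all three candidate terms, and the reported model is the two-term step where BIC is smallest. A forward selection with validation sketch would follow the same loop but score each step with a test or k-fold R² value instead of BIC.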

Note

The terms that are included in the final model can depend on hierarchy restrictions for models. For more information, see the topic on Hierarchy below.

Potential terms

Displays the set of terms that the procedure will assess. Indicators (E or I) next to the term in the list signify how the procedure handles the term. The Method you choose determines the initial settings in this list. You can modify how the procedure handles the terms with the two buttons below. If you don't use these buttons, the procedure can add or remove the term from the model based on its p-value.

E = Include term in every model: Select a term and click this button to force the term into every model regardless of its p-value. Click the button again to remove this condition.
I = Include term in the initial model: Select a term and click this button to include the term in the initial model. The procedure can remove the term if its p-value is too high. Click the button again to remove this condition. This button is available only if you choose Stepwise in Method.

Alpha to enter and remove

Alpha to enter: Enter the alpha value that Minitab uses to determine whether a term can be entered into the model. You can set this value when you choose Stepwise or Forward selection in Method.
Alpha to remove: Enter the alpha value that Minitab uses to determine whether a term is removed from the model. You can set this value when you choose Stepwise or Backward elimination in Method.
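
To show how the two alpha values interact, here is a minimal Python sketch of a single stepwise decision. The function name `next_action`, the alpha values in the usage lines, and the choice to check removal before entry are assumptions for illustration, not a description of Minitab's internals. The `forced` argument plays the role of terms marked E (include in every model).

```python
def next_action(p_in, p_out, alpha_enter, alpha_remove, forced=frozenset()):
    """Decide one step from the current p-values.

    p_in:  p-values of terms currently in the model
    p_out: p-values that candidate terms would have if they entered
    Returns ("remove", term), ("add", term), or ("stop", None).
    """
    # Terms forced into every model are never candidates for removal.
    removable = {t: p for t, p in p_in.items() if t not in forced}
    if removable:
        worst = max(removable, key=removable.get)
        if removable[worst] > alpha_remove:
            return ("remove", worst)
    if p_out:
        best = min(p_out, key=p_out.get)
        if p_out[best] <= alpha_enter:
            return ("add", best)
    # Every term in the model has p <= alpha_remove and every term
    # outside the model has p > alpha_enter.
    return ("stop", None)

# Illustrative p-values and alpha values only.
print(next_action({"A": 0.01, "B": 0.40}, {"C": 0.03}, 0.15, 0.15))
print(next_action({"A": 0.01, "B": 0.40}, {"C": 0.03}, 0.15, 0.15, forced={"B"}))
```

Forward selection corresponds to skipping the removal check, and backward elimination corresponds to skipping the entry check.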

Criterion

Specify which information criterion, AICc or BIC, to use when the Method is Forward information criteria.

Both AICc and BIC assess the likelihood of the model and then apply a penalty for adding terms to the model. The penalty reduces the tendency to overfit the model to the sample data. This reduction can yield a model that performs better in general.

As a general guideline, when the number of parameters is small relative to the sample size, BIC has a larger penalty for the addition of each parameter than AICc. In these cases, the model that minimizes BIC tends to be smaller than the model that minimizes AICc.

In some common cases, such as screening designs, the number of parameters is usually large relative to the sample size. In these cases, the model that minimizes AICc tends to be smaller than the model that minimizes BIC. For example, for a 13-run definitive screening design, the model that minimizes AICc will tend to be smaller than the model that minimizes BIC among the set of models with 6 or more parameters.

For more information on AICc and BIC, see Burnham and Anderson.¹
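
The difference between the two penalties can be checked numerically. The sketch below uses the standard penalty terms from the literature, 2k + 2k(k + 1)/(n - k - 1) for AICc and k ln(n) for BIC, where n is the sample size and k is the number of estimated parameters. The function names are hypothetical, and how Minitab counts k (for example, whether the error variance is included) is not stated here, so the exact crossover point in a given analysis may differ.

```python
import math

def aicc_penalty(n, k):
    """Penalty part of AICc: 2k plus the small-sample correction (needs n > k + 1)."""
    return 2 * k + 2 * k * (k + 1) / (n - k - 1)

def bic_penalty(n, k):
    """Penalty part of BIC."""
    return k * math.log(n)

# Few parameters relative to the sample size: BIC penalizes more.
print(aicc_penalty(1000, 5), bic_penalty(1000, 5))

# Many parameters relative to the sample size (13 runs): AICc penalizes more.
print(aicc_penalty(13, 6), bic_penalty(13, 6))
```

Because both criteria add their penalty to the same likelihood term, the criterion with the larger penalty tends to select the smaller model.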

Specify validation for Forward selection with validation

Note

Validation settings are also in the Validation subdialog box. If you change the settings, Minitab automatically updates the settings in both places.

When you select Forward selection with validation, choose the validation method to test your model. Usually, with smaller samples, the K-fold cross-validation method is appropriate. With larger samples, you can divide the data into a training data set and a test data set.

K-fold cross-validation

Complete the following steps to use K-fold cross-validation.

From the drop-down list, select K-fold cross-validation.
Choose one of the following to specify whether to assign folds randomly or with an ID column.
- Randomly assign rows of each fold: Select this option to have Minitab randomly select rows for each fold. You can specify the number of folds. The default value of 10 works well in most cases. A lower value of K may introduce more bias, whereas a larger value of K may introduce more variability. You can also set a base for the random number generator.
- Assign rows of each fold by ID column: Select this option to choose the rows to include in each fold. In ID column, enter the column that identifies the folds. Each row with the same value in the ID column is in the same fold.
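
The two fold-assignment options can be pictured with a short Python sketch. This is only an illustration under assumptions: the function names are hypothetical, the `seed` argument stands in for the base of the random number generator, and Minitab's actual assignment algorithm may differ.

```python
import random

def assign_folds_randomly(n_rows, k=10, seed=1):
    """Shuffle the row indices, then deal them into k folds of near-equal size."""
    rng = random.Random(seed)
    rows = list(range(n_rows))
    rng.shuffle(rows)
    folds = [0] * n_rows
    for position, row in enumerate(rows):
        folds[row] = position % k
    return folds

def assign_folds_by_id(id_column):
    """Rows that share a value in the ID column go into the same fold."""
    folds = {}
    for row, value in enumerate(id_column):
        folds.setdefault(value, []).append(row)
    return folds

random_folds = assign_folds_randomly(25, k=10, seed=1)
id_folds = assign_folds_by_id(["a", "b", "a", "c"])
print(random_folds)
print(id_folds)
```

With the same seed, the random assignment is reproducible, which is the practical reason to set a base for the random number generator.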

Validation with a test set

Complete the following steps to divide the data into a training data set and a test data set.

From the drop-down list, select Validation with a test set.
Choose one of the following to specify whether to select a fraction of rows randomly or to select a fraction of rows with an ID column.
- Randomly select a fraction of rows as a test set: Select this option to have Minitab randomly select the test data set. You can specify how much data to use in the test data set. The default value of 0.3 works well in most cases. You want to include enough data in the test data set to evaluate the model well. If you are unsure about the form of the model, a larger test data set provides stronger validation. You also want enough data in the training data set to estimate the model well. Typically, models with more predictors require more training data to estimate.
- Define training/test split by ID column: Select this option to select the rows to include in the test data set yourself. In ID column, enter the column that indicates which rows to use for the test sample. The ID column must contain only 2 values. In Level for test set, select which level to use as the test sample.
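
A minimal Python sketch of the two ways to define the split follows. The function names and the `seed` argument are hypothetical, and the rounding of the test-set size is an assumption; the sketch only mirrors the options described above.

```python
import random

def random_test_split(n_rows, test_fraction=0.3, seed=1):
    """Randomly hold out a fraction of the rows as the test set."""
    rng = random.Random(seed)
    n_test = round(n_rows * test_fraction)
    test_rows = set(rng.sample(range(n_rows), n_test))
    train = [row for row in range(n_rows) if row not in test_rows]
    test = sorted(test_rows)
    return train, test

def split_by_id(id_column, test_level):
    """Use an ID column with exactly 2 values; one level marks the test rows."""
    levels = set(id_column)
    if len(levels) != 2 or test_level not in levels:
        raise ValueError("ID column must contain exactly 2 values, one of them the test level")
    train = [row for row, value in enumerate(id_column) if value != test_level]
    test = [row for row, value in enumerate(id_column) if value == test_level]
    return train, test

train_rows, test_rows = random_test_split(100, test_fraction=0.3, seed=1)
print(len(train_rows), len(test_rows))
print(split_by_id(["train", "test", "train", "test", "train"], "test"))
```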

Hierarchy

You can determine how Minitab enforces model hierarchy during a stepwise procedure. The Hierarchy button is disabled if you specify a non-hierarchical model in the Model dialog box.

In a hierarchical model, all lower-order terms that comprise the higher-order terms also appear in the model. For example, a model that includes the interaction term A*B*C is hierarchical if it includes these terms: A, B, C, A*B, A*C, and B*C.
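
The definition can be expressed as a small check. In this Python sketch, terms are written as strings such as "A*B*C", and the helper names are hypothetical; it is an illustration of the definition, not Minitab code.

```python
from itertools import combinations

def parse(term):
    """Represent a term such as "B*A" as a sorted tuple of its factors."""
    return tuple(sorted(term.split("*")))

def lower_order_terms(factors):
    """All lower-order terms that make up a higher-order term."""
    return {combo
            for size in range(1, len(factors))
            for combo in combinations(factors, size)}

def is_hierarchical(model):
    """True if every lower-order term of every term is also in the model."""
    terms = {parse(t) for t in model}
    return all(lower_order_terms(t) <= terms for t in terms)

full = ["A", "B", "C", "A*B", "A*C", "B*C", "A*B*C"]
print(is_hierarchical(full))
print(is_hierarchical([t for t in full if t != "A*C"]))
```

The first call reports a hierarchical model. The second does not, because A*B*C is present without A*C.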

Models can be non-hierarchical. Generally, you can remove lower-order terms if they are insignificant, unless subject area knowledge suggests that you include them. Models that contain too many terms can be relatively imprecise and can reduce the ability to predict the values of new observations.

Consider the following tips:

Fit a hierarchical model first. You can remove insignificant terms later.
If you standardize your continuous predictors, fit a hierarchical model to produce an equation in uncoded (or natural) units.
If your model contains categorical variables, the results are easier to interpret if the categorical terms, at least, are hierarchical.

Hierarchical model

Choose whether the stepwise procedure must produce a hierarchical model.

Require a hierarchical model at each step: Minitab can only add or remove terms that maintain hierarchy.
Add terms at the end to make the model hierarchical: Initially, Minitab follows the standard rules of the stepwise procedure. At the final step, Minitab adds the terms that produce a hierarchical model, even if their p-values are greater than the Alpha to enter value. If you select this option when the Method is Forward information criteria, Minitab displays an error. To get a hierarchical model that minimizes the criterion among the models in the steps, select Require a hierarchical model at each step.
Do not require a hierarchical model: The final model can be non-hierarchical. Minitab will add and remove terms based only on the rules of the stepwise procedure.

Require hierarchy for the following terms

If you require a hierarchical model, choose the types of terms that must be hierarchical.

All terms: Terms that include continuous and/or categorical variables must be hierarchical.
Terms with categorical predictors: Only terms that include categorical variables must be hierarchical.

How many terms can enter at each step

If you require hierarchy at each step, choose the number of terms that Minitab can add at each step in order to maintain hierarchy.

At most one term can enter at each step: A higher-order term can enter the model only if hierarchy is maintained when adding that single term. All lower-order terms that comprise the higher-order term must already be in the model.
Extra terms can enter to maintain hierarchy: A higher-order term can enter the model even if it produces a non-hierarchical model. However, the terms that are necessary to produce a hierarchical model are also added, even if their p-values are greater than the Alpha to enter value.
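
The difference between the two options can be sketched in Python. The function `terms_to_enter` and its `allow_extra_terms` flag are hypothetical names for illustration; the sketch covers only the hierarchy bookkeeping and leaves out the p-value rules that decide which candidate is considered.

```python
from itertools import combinations

def normalize(term):
    """Write a term such as "B*A" with its factors in sorted order."""
    return "*".join(sorted(term.split("*")))

def lower_order_terms(term):
    """All lower-order terms that make up a higher-order term."""
    factors = sorted(term.split("*"))
    return {"*".join(combo)
            for size in range(1, len(factors))
            for combo in combinations(factors, size)}

def terms_to_enter(model, candidate, allow_extra_terms):
    """Return the terms that enter in one step, or [] if the candidate cannot enter."""
    present = {normalize(t) for t in model}
    missing = sorted(lower_order_terms(candidate) - present,
                     key=lambda t: (t.count("*"), t))
    if missing and not allow_extra_terms:
        return []  # "At most one term can enter at each step"
    return missing + [normalize(candidate)]  # "Extra terms can enter to maintain hierarchy"

print(terms_to_enter({"A"}, "A*B", allow_extra_terms=False))
print(terms_to_enter({"A"}, "A*B", allow_extra_terms=True))
print(terms_to_enter({"A", "B"}, "A*B", allow_extra_terms=False))
```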

Display the table of model selection details

Specify the information to display about the stepwise procedure.

Details about the method: Display the type of stepwise procedure and the alpha values to enter and/or remove a predictor from the model.
Include details for each step: Display the coefficients, p-values, and model summary statistics for each step of the procedure.

Display the graph of R-squared vs step

When you choose Forward selection with validation, display a plot of the training and validation R² values for each step in the forward selection. Typically, you use the plot to determine whether simpler models have similar validation values.

¹ Burnham, K. P., & Anderson, D. R. (2004). Multimodel inference: Understanding AIC and BIC in model selection. Sociological Methods & Research, 33(2), 261-304. doi:10.1177/0049124104268644