Comparison of best subsets regression and stepwise regression

Best subsets regression provides information about the fit of several different models, so you can select a model based on up to 9 distinct statistics. (In the simple table, Minitab displays 5 statistics.) Stepwise regression produces a single model based on a single statistic. Because the two methods use different selection criteria, best subsets regression and stepwise regression can lead to different models. General guidelines on which method to use are as follows:
  • For data sets with a small number of predictors, best subsets regression is better than stepwise regression because it provides information about more models.
  • Best subsets regression allows at most 31 free predictors, so for data sets with a large number of predictors, stepwise regression is better than best subsets regression. When using stepwise regression on a data set with a large number of predictors, choose large alpha-to-enter and alpha-to-remove levels (0.25 to 0.50). Large values let you learn more about the effects of each entered predictor on the response and on the predictors already in the model.
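
A rough count shows why comparing many models gets harder as predictors are added. The sketch below assumes that every non-empty subset of k predictors is a candidate model, which gives 2^k - 1 candidates. It only illustrates the growth in the number of possible models; it does not describe how Minitab searches, and the function name is hypothetical.

```python
def candidate_model_count(k: int) -> int:
    """Number of non-empty predictor subsets for k candidate predictors.

    Assumption: every non-empty subset counts as one candidate model.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    return 2 ** k - 1


# Print the count for a few predictor counts, including the 31-predictor limit.
for k in (5, 10, 20, 31):
    print(f"{k} predictors: {candidate_model_count(k)} candidate models")
```

With 5 predictors there are 31 candidate subsets. With 31 predictors there are more than 2 billion.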

Verifying the model

Exercise caution when using variable selection procedures such as best subsets and stepwise regression. These procedures are automatic and, therefore, do not consider the practical importance of any of the predictors. Also, when you fit a model to data, the goodness of the fit comes from two basic sources:
  • The underlying structure of the data (a structure that will apply to other data sets collected in the same way)
  • The peculiarities of the one specific data set you analyze

To ensure that your model doesn't just fit one specific data set, you should verify the model found by the selection procedure on a new set of data. You can also take the original data set, randomly divide it into two parts, use one part to select a model, and then verify the fit on the second part. This procedure helps ensure that the model you select will apply to other data sets.
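
The split-sample check described above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not Minitab's procedure: the data are simulated, the selection step is an exhaustive search that picks the subset with the highest adjusted R-squared, and all helper names (`fit_ols`, `r_squared`, `adjusted_r_squared`) are hypothetical.

```python
import itertools

import numpy as np


def fit_ols(X, y):
    """Least-squares coefficients for y ~ intercept + columns of X."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def r_squared(X, y, coef):
    """R-squared of the fitted coefficients evaluated on (X, y)."""
    A = np.column_stack([np.ones(len(X)), X])
    resid = y - A @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n, p):
    """Adjusted R-squared for n observations and p predictors."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# Simulated data (an assumption): only predictors 0 and 2 affect the response.
rng = np.random.default_rng(0)
n, k = 200, 4
X = rng.normal(size=(n, k))
y = 2.0 + 3.0 * X[:, 0] - 1.5 * X[:, 2] + rng.normal(size=n)

# Randomly divide the data into a selection half and a verification half.
idx = rng.permutation(n)
select_idx, verify_idx = idx[: n // 2], idx[n // 2:]

# Select a model on the first half only.
best_cols, best_adj, best_coef = None, -np.inf, None
for size in range(1, k + 1):
    for cols in itertools.combinations(range(k), size):
        Xs = X[select_idx][:, list(cols)]
        coef = fit_ols(Xs, y[select_idx])
        r2 = r_squared(Xs, y[select_idx], coef)
        adj = adjusted_r_squared(r2, len(select_idx), size)
        if adj > best_adj:
            best_cols, best_adj, best_coef = cols, adj, coef

# Verify the selected model, with its coefficients held fixed, on the second half.
Xv = X[verify_idx][:, list(best_cols)]
verify_r2 = r_squared(Xv, y[verify_idx], best_coef)

print("selected predictors:", best_cols)
print("adjusted R-squared on selection half:", round(best_adj, 3))
print("R-squared on verification half:", round(verify_r2, 3))
```

If the R-squared on the verification half is much lower than the fit on the selection half, the selected model is probably fitting peculiarities of the selection data rather than the underlying structure.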

