In best subsets regression, by default, Minitab Express selects the model with the highest R^{2} value for each number of predictors: one predictor, two predictors, and so on. You can determine which predictors are included in each model based on which columns in the output table are marked with an "X".
Use the goodness-of-fit statistics to determine which model provides the best fit to your data. Before you select a final model, you should examine residual plots and other diagnostic measures to ensure that the model meets the assumptions of the analysis.
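The selection described above can be sketched in a few lines of Python. This is a minimal illustration, not Minitab Express's implementation: it assumes ordinary least squares with a constant term, and the names `r_squared` and `best_subsets` are hypothetical helpers introduced here. It fits every subset, so it is only practical for a small number of candidate predictors.

```python
from itertools import combinations

import numpy as np


def r_squared(X, y):
    """R^2 (as a fraction) for a least-squares fit of y on X plus a constant."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    sse = float(resid @ resid)                   # unexplained variation
    sst = float(np.sum((y - np.mean(y)) ** 2))   # total variation
    return 1.0 - sse / sst


def best_subsets(X, y):
    """For each model size k, return (predictor columns, R^2) of the best subset."""
    n_pred = X.shape[1]
    best = {}
    for k in range(1, n_pred + 1):
        scored = [
            (cols, r_squared(X[:, list(cols)], y))
            for cols in combinations(range(n_pred), k)
        ]
        # Keep only the highest-R^2 model of this size.
        best[k] = max(scored, key=lambda item: item[1])
    return best
```

Each entry of the returned dictionary plays the role of one row of the output table: the tuple of column indices corresponds to the columns marked with an "X".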
R^{2} is the percentage of variation in the response that is explained by the model. The higher the R^{2} value, the better the model fits your data. R^{2} is always between 0% and 100%.
R^{2} never decreases when you add predictors to a model. For example, the best five-predictor model will always have an R^{2} that is at least as high as that of the best four-predictor model. Therefore, R^{2} is most useful when you compare models of the same size.
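To see this behavior on a small scale, the sketch below fits a one-predictor model and then adds a second predictor that is unrelated to the response by construction. The data are simulated and the helper `r_squared_pct` is a hypothetical name; the calculation assumes ordinary least squares with a constant.

```python
import numpy as np


def r_squared_pct(X, y):
    """R^2 in percent: 100 * (1 - SSE / SST), for a fit that includes a constant."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    sse = np.sum((y - A @ coef) ** 2)
    sst = np.sum((y - np.mean(y)) ** 2)
    return 100.0 * (1.0 - sse / sst)


rng = np.random.default_rng(1)
n = 30
x1 = rng.normal(size=n)
noise_col = rng.normal(size=n)        # unrelated to the response by construction
y = 2.0 * x1 + rng.normal(size=n)

r2_one = r_squared_pct(x1.reshape(-1, 1), y)
r2_two = r_squared_pct(np.column_stack([x1, noise_col]), y)
print(f"one predictor:  R^2 = {r2_one:.2f}%")
print(f"two predictors: R^2 = {r2_two:.2f}%")
```

The second value cannot be lower than the first, even though the added column carries no information about the response.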
Use adjusted R^{2} when you want to compare models that have different numbers of predictors. R^{2} increases or stays the same when you add a predictor to the model, even when there is no real improvement to the model. The adjusted R^{2} value incorporates the number of predictors in the model, so it rises only when a new predictor improves the fit by more than would be expected by chance.
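The adjustment can be written out as a short function. This sketch uses the standard textbook definition, 1 - (1 - R^{2})(n - 1)/(n - p - 1), and assumes that it matches the value the software reports; `adjusted_r_squared` is a hypothetical name, R^{2} is passed as a fraction rather than a percentage, and p counts predictors without the constant.

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R^2 from R^2 (a fraction), n observations, and p predictors."""
    if n - p - 1 <= 0:
        raise ValueError("need more observations than estimated coefficients")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# Same R^2 and sample size, different numbers of predictors.
print(round(adjusted_r_squared(0.80, n=25, p=2), 3))
print(round(adjusted_r_squared(0.80, n=25, p=5), 3))
```

With the same R^{2} of 0.80 and 25 observations, the two-predictor model gets an adjusted value of about 0.78 and the five-predictor model about 0.75, which shows how the larger model is penalized.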
Use predicted R^{2} to determine how well your model predicts the response for new observations. Models that have larger predicted R^{2} values have better predictive ability.
A predicted R^{2} that is substantially less than R^{2} may indicate that the model is overfit. A model becomes overfit when you add terms for effects that are not important in the population, although they may appear important in the sample data. The model becomes tailored to the sample data and, therefore, may not be useful for making predictions about the population.
Predicted R^{2} can also be more useful than adjusted R^{2} for comparing models because it is calculated with observations that are not included in the model calculation.
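One common way to compute a predicted R^{2} is through the PRESS statistic: each observation is predicted from a model fit without it, and the squared prediction errors are summed. The sketch below assumes that definition and ordinary least squares with a constant; `predicted_r_squared` is a hypothetical name. It uses the leverage shortcut, which gives the same result as refitting the model n times.

```python
import numpy as np


def predicted_r_squared(X, y):
    """Predicted R^2 (a fraction) computed as 1 - PRESS / SST."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef

    # Leverages are the diagonal of the hat matrix; with A = QR (reduced QR,
    # A of full column rank) they equal the row sums of Q squared.
    Q, _ = np.linalg.qr(A)
    leverage = np.sum(Q ** 2, axis=1)

    # Leave-one-out prediction error for observation i is e_i / (1 - h_ii).
    press = np.sum((resid / (1.0 - leverage)) ** 2)
    sst = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - press / sst
```

Because every squared error comes from an observation that was left out of the fit, terms that only track noise in the sample tend to lower this value rather than raise it.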
Use S to assess how well the model describes the response. Use S instead of the R^{2} statistics to compare the fit of models that have no constant.
S is measured in the units of the response variable and represents the standard deviation of how far the data values fall from the fitted values. The lower the value of S, the better the model describes the response. However, a low S value by itself does not indicate that the model meets the model assumptions. You should check the residual plots to verify the assumptions.
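As an illustration of how S relates to the residuals, the sketch below computes it as the square root of the error sum of squares divided by the error degrees of freedom. This assumes ordinary least squares with a constant term (a model with no constant would drop the column of ones); `standard_error_s` is a hypothetical name.

```python
import numpy as np


def standard_error_s(X, y):
    """S = sqrt(SSE / (n - number of estimated coefficients))."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    dof = len(y) - A.shape[1]          # error degrees of freedom
    return float(np.sqrt(resid @ resid / dof))


# Tiny made-up data set, in the units of the response.
x = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0.0, 1.0, 1.0, 2.0])
print(round(standard_error_s(x, y), 4))
```

For this small data set the residuals are -0.1, 0.3, -0.3, and 0.1, so S is the square root of 0.2 / 2, about 0.32 response units.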
Small samples do not provide a precise estimate of the strength of the relationship between the response and predictors. If you need R^{2} to be more precise, you should use a larger sample (typically, 40 or more observations).
R^{2} is just one measure of how well the model fits the data. Even when a model has a high R^{2}, you should check the residual plots to verify that the model meets the model assumptions.
 

In these results, there are several models to examine further. The model with all 5 predictors has the lowest value of S and the highest value of adjusted R^{2}, approximately 8 and 88%, respectively. A model with 2 predictors has the highest predicted R^{2} value, 81.4%. Before you select the final model, you should examine the models for violations of the regression assumptions using residual plots and other diagnostic measures.