Use Mallows' Cp to help you choose between multiple regression models. It helps you balance the number of predictors in the model: too few predictors can produce biased estimates, while too many can produce imprecise ones. Mallows' Cp compares the precision and bias of the full model to models with a subset of the predictors.
Usually, you should look for models where Mallows' Cp is small and close to the number of predictors in the model plus the constant (p). A small Mallows' Cp value indicates that the model is relatively precise (has small variance) in estimating the true regression coefficients and predicting future responses. A Mallows' Cp value that is close to the number of predictors plus the constant indicates that the model is relatively unbiased in estimating the true regression coefficients and predicting future responses. Models with lack-of-fit and bias have values of Mallows' Cp larger than p.
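The text above does not state the formula, so the following is the standard textbook definition, given here as a reference sketch. The symbols SSE_p, MSE_full, and n are notation assumed for this illustration and do not appear in the original text.

```latex
C_p = \frac{SSE_p}{MSE_{full}} - (n - 2p)
% SSE_p    : residual sum of squares of the subset model
% MSE_full : mean squared error of the model with all candidate predictors
% n        : number of observations
% p        : number of predictors in the subset model plus the constant
```

If the subset model is unbiased, SSE_p is on average about (n - p) times the error variance, so Cp comes out close to p. Lack-of-fit inflates SSE_p and pushes Cp above p.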
Using Mallows' Cp to compare regression models is valid only when you start with the same complete set of predictors.
If any predictor is highly correlated with another predictor, Mallows' Cp is not displayed in the output.
For example, you work for a potato chip company that examines the factors that affect the percentage of crumbled potato chips per container. You include the percentage of potato relative to other ingredients, cooling rate, and cooking temperature as predictors in the regression model.
Step | %Potato | Cooling rate | Cooking temp | Mallows' Cp |
---|---|---|---|---|
1 | X | | | 7.2 |
2 | X | X | | 2.9 |
3 | X | X | X | 5.5 |
The results indicate that the model with the two terms "%Potato" and "Cooling rate" is relatively precise and unbiased because its Mallows' Cp (2.9) is small and is closest to the number of predictors plus the constant (3). You should examine Mallows' Cp in conjunction with other statistics included in the results, such as R², adjusted R², and S.
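To make the comparison concrete, here is a minimal Python sketch that computes Mallows' Cp for every subset of three predictors, using the standard definition Cp = SSE_p / MSE_full - (n - 2p). It is not the statistical software's own implementation. The data are synthetic and will not reproduce the values in the table; the helper names `sse` and `mallows_cp`, the coefficients, and the sample size are all assumptions made for illustration.

```python
import itertools
import numpy as np

def sse(X, y):
    """Residual sum of squares of a least-squares fit that includes a constant."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid)

def mallows_cp(X_full, y, subset):
    """Mallows' Cp of the model that uses only the columns listed in `subset`."""
    n, k = X_full.shape
    mse_full = sse(X_full, y) / (n - k - 1)  # error variance estimated from the full model
    p = len(subset) + 1                      # predictors in the subset plus the constant
    return sse(X_full[:, list(subset)], y) / mse_full - (n - 2 * p)

# Synthetic data (an assumption): only the first two predictors affect the response.
rng = np.random.default_rng(0)
n = 50
X = rng.normal(size=(n, 3))
y = 10 + 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(scale=1.0, size=n)

names = ["%Potato", "Cooling rate", "Cooking temp"]
results = {}
for r in range(1, len(names) + 1):
    for subset in itertools.combinations(range(len(names)), r):
        results[subset] = mallows_cp(X, y, subset)

for subset, cp in results.items():
    label = ", ".join(names[i] for i in subset)
    print(f"{label:45s} p = {len(subset) + 1}  Cp = {cp:.2f}")
```

Every subset is scored against the same `mse_full`, which is why the comparison is valid only when all models start from the same complete set of predictors. By construction, the full model's Cp equals its own p, so it is not informative about the full model itself.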