Model reduction is the elimination of terms from the model, such as the term for a predictor variable or an interaction between predictor variables. Model reduction lets you simplify a model and increase the precision of predictions. You can reduce models in many of Minitab's command groups, including regression, ANOVA, DOE, and reliability.
One criterion for model reduction is the statistical significance of a term. The elimination of statistically insignificant terms increases the precision of predictions from the model. To use the statistical significance criterion, first choose a significance level, such as 0.05 or 0.15. Then, try different terms to find a model that has as many statistically significant terms as possible but no statistically insignificant terms. To use this criterion, the data must provide enough degrees of freedom to estimate statistical significance after you fit the model. You can apply the statistical significance criterion manually, or automatically with an algorithmic procedure such as stepwise regression. The purpose of the criterion is to find a model that meets your goals; however, the statistical significance criterion does not always produce a single best model.
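As an illustration, here is a minimal sketch of this kind of backward elimination in Python, assuming the response and predictors are held in pandas objects and using statsmodels for the fits (the function name and the default alpha are illustrative, not Minitab's implementation):

```python
import pandas as pd
import statsmodels.api as sm

def backward_eliminate(X: pd.DataFrame, y: pd.Series, alpha: float = 0.05):
    """Drop the least significant term and refit until every remaining
    term has a p-value at or below alpha."""
    X = sm.add_constant(X)
    while True:
        results = sm.OLS(y, X).fit()
        pvalues = results.pvalues.drop("const")   # never drop the intercept
        if pvalues.empty or pvalues.max() <= alpha:
            return results                        # all remaining terms significant
        X = X.drop(columns=pvalues.idxmax())      # remove one term, then refit
```

Removing only one term per pass matters because the p-values of the remaining terms change each time the model is refit.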
Besides the statistical significance criterion, other statistical criteria that Minitab calculates for models include S, adjusted R², predicted R², PRESS, Mallows' Cp, and the Akaike Information Criterion (AIC). You can consider one or more of these criteria when you reduce a model.
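Most of these statistics can be read from, or derived from, a fitted model. In a statsmodels sketch, S, adjusted R², and AIC are built-in attributes, while PRESS and predicted R² follow from the residuals and leverages (Mallows' Cp is omitted because it also needs the error variance of the full model):

```python
import numpy as np
import statsmodels.api as sm

def model_criteria(results):
    """Comparison statistics for a fitted OLS model."""
    h = results.get_influence().hat_matrix_diag      # leverages h_ii
    press = np.sum((results.resid / (1.0 - h)) ** 2)
    return {
        "S": np.sqrt(results.mse_resid),             # residual standard error
        "adj_R2": results.rsquared_adj,
        "pred_R2": 1.0 - press / results.centered_tss,
        "PRESS": press,
        "AIC": results.aic,
    }
```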
Like stepwise regression, best subsets regression is an algorithmic procedure that you can use to find a model that meets your goals. Best subsets regression examines all possible models and identifies the models that have the highest R² values. In Minitab, best subsets regression also displays other statistics, such as adjusted R² and predicted R². You can consider these statistics when you compare models. Because best subsets uses R², the models that best subsets regression identifies as best might or might not contain only statistically significant terms. Other considerations as you reduce a model include multicollinearity and hierarchy. These two concepts are discussed in more detail below.
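A brute-force sketch of the idea, fitting every subset of the candidate predictors and keeping the models with the highest R² at each size (Minitab's routine is more efficient, but the enumeration shows what is being searched):

```python
from itertools import combinations

import pandas as pd
import statsmodels.api as sm

def best_subsets(X: pd.DataFrame, y: pd.Series, top: int = 2):
    """For each subset size, keep the `top` models with the highest R²."""
    best = {}
    for k in range(1, len(X.columns) + 1):
        fits = []
        for cols in combinations(X.columns, k):
            results = sm.OLS(y, sm.add_constant(X[list(cols)])).fit()
            fits.append((results.rsquared, results.rsquared_adj, cols))
        fits.sort(reverse=True)       # highest R² first within each size
        best[k] = fits[:top]
    return best                       # 2**p fits in total, so keep p modest
```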
Statistics that measure how well the model fits the data can help you find a useful model. However, you should also use process knowledge and good judgment to decide which terms to eliminate. Some terms might be essential, whereas other terms might be too costly or too difficult to measure.
Technicians measure the total heat flux as part of a solar thermal energy test. An energy engineer wants to determine how total heat flux is predicted by other variables: insolation, the position of the focal points in the east, south, and north directions, and the time of day. Using the full regression model, the engineer estimates the relationship between heat flux and these variables.
The engineer wants to eliminate as many statistically insignificant terms as possible to maximize the precision of predictions. The engineer decides to use 0.05 as the threshold for statistical significance. The p-value for Time of Day (0.194) is the highest p-value greater than 0.05, so the engineer removes this term first. The engineer then repeats the regression, removing one insignificant term each time, until only statistically significant terms remain in the final reduced model.
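A sketch of the first elimination step, assuming the measurements sit in a hypothetical file heat_flux.csv with the column names shown (the names are illustrative):

```python
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("heat_flux.csv")       # hypothetical file of the measurements
y = df["Heat Flux"]
X = sm.add_constant(df[["Insolation", "East", "South", "North", "Time of Day"]])

full = sm.OLS(y, X).fit()
print(full.pvalues)     # Time of Day has the highest p-value above 0.05: 0.194

# Remove the least significant term and refit; repeat one term at a time
# until every remaining p-value is at or below 0.05.
reduced = sm.OLS(y, X.drop(columns="Time of Day")).fit()
```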
Multicollinearity in regression is a condition that occurs when some predictor variables in the model are correlated with other predictor variables. Severe multicollinearity is problematic because it can increase the variance of the regression coefficients, making them unstable. When you remove a term that has high multicollinearity, the values and statistical significance of the coefficients of the remaining, highly correlated terms can change considerably. Thus, in the presence of multicollinearity, examining multiple statistics and changing the model one term at a time are even more important. Usually, you reduce as much multicollinearity as possible before you reduce a model. For more information on ways to reduce multicollinearity, go to Multicollinearity in regression.
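One way to check multicollinearity before reducing is to compute the VIF for each predictor. A sketch using statsmodels, assuming the predictors are in a DataFrame X:

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

def vif_table(X: pd.DataFrame) -> pd.Series:
    """VIF per predictor; values above 5 usually indicate severe
    multicollinearity."""
    exog = sm.add_constant(X)   # include the intercept, as in the fitted model
    return pd.Series(
        [variance_inflation_factor(exog.values, i + 1)   # i + 1 skips "const"
         for i in range(len(X.columns))],
        index=X.columns,
    )
```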
A team at a medical facility develops a model to predict patient satisfaction scores. The model has several variables, including the time patients are with a practitioner and the time patients are in medical tests. With both of these variables in the model, the multicollinearity is high, with VIF (variance inflation factor) values of 8.91. Values greater than 5 usually indicate severe multicollinearity. The p-value for the amount of time that patients are with a practitioner is 0.105, which is not significant at the 0.05 level. The predicted R² value for this model is 22.9%.
When the team removes the time that patients are with a practitioner, the predicted R² value for the model with only test time drops from 22.9% to 10.6%. Although the time patients are with a practitioner is not statistically significant at the 0.05 level, including that variable more than doubles the predicted R² value. The high multicollinearity could be hiding the importance of the predictor.
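The comparison the team makes can be reproduced from the leverages, because predicted R² equals 1 − PRESS / SS(total). A sketch with hypothetical file and column names, simplified to the two time variables:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

def predicted_r2(results):
    """Predicted R² = 1 - PRESS / SS(total), via the leverages h_ii."""
    h = results.get_influence().hat_matrix_diag
    press = np.sum((results.resid / (1.0 - h)) ** 2)
    return 1.0 - press / results.centered_tss

df = pd.read_csv("satisfaction.csv")    # hypothetical file and column names
y = df["Satisfaction"]
both = sm.OLS(y, sm.add_constant(df[["Practitioner Time", "Test Time"]])).fit()
test_only = sm.OLS(y, sm.add_constant(df[["Test Time"]])).fit()
print(predicted_r2(both), predicted_r2(test_only))   # 22.9% vs. 10.6% above
```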
A hierarchical model is a model in which, for each term in the model, all lower-order terms that it contains are also in the model. For example, suppose a model has four factors: A, B, C, and D. If the term A*B*C is in the model, then the terms A, B, C, A*B, A*C, and B*C must also be in the model. Any terms with D do not have to be in the model because D is not part of A*B*C. The hierarchical structure applies to nesting as well: if B(A) is in the model, then A must also be in the model for the model to be hierarchical.
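One way to encode this rule, representing each term as the set of factors it contains (a sketch that covers crossed terms only, not nesting):

```python
from itertools import combinations

def is_hierarchical(terms: set[frozenset[str]]) -> bool:
    """True if, for every term, all of its lower-order terms are present.
    A term such as A*B*C is represented as frozenset({"A", "B", "C"})."""
    for term in terms:
        for size in range(1, len(term)):
            for sub in combinations(sorted(term), size):
                if frozenset(sub) not in terms:
                    return False
    return True

# Single-letter factor names, so frozenset("AB") == {"A", "B"}.
# A*B*C requires A, B, C, A*B, A*C, and B*C; terms with D may be absent.
model = {frozenset(t) for t in ("A", "B", "C", "AB", "AC", "BC", "ABC")}
print(is_hierarchical(model))                      # True
print(is_hierarchical(model - {frozenset("AB")}))  # False: A*B is missing
```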
Hierarchy is desirable because hierarchical models can be translated from standardized to unstandardized units. Standardized units are common when the model includes higher-order terms such as interactions, because the standardization reduces the multicollinearity that these terms cause.
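A small demonstration of why coding helps, using centering on made-up data: the raw interaction term is nearly collinear with its parent terms, while the centered version is not (the numbers are illustrative):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(1)
temp = rng.uniform(150.0, 200.0, 50)    # raw units, centered far from zero
press = rng.uniform(70.0, 90.0, 50)

def max_vif(a, b):
    """Largest VIF among a, b, and their interaction a*b."""
    X = sm.add_constant(pd.DataFrame({"a": a, "b": b, "ab": a * b}))
    return max(variance_inflation_factor(X.values, i) for i in (1, 2, 3))

print(max_vif(temp, press))                               # very large
print(max_vif(temp - temp.mean(), press - press.mean()))  # close to 1
```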
Because hierarchy is desirable, hierarchical model reduction is common. For example, one strategy is to use the p-value criterion in combination with hierarchy to reduce the model. First, you remove the most complex terms that are statistically insignificant. If a statistically insignificant term is part of an interaction or another higher-order term that remains in the model, then that term stays in the model. Minitab's stepwise model selection can use the hierarchy criterion together with the statistical significance criterion.
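A sketch of that strategy with statsmodels formulas, assuming numeric (coded) predictors: a term can leave the model only if no remaining higher-order term contains it, and the most complex insignificant term goes first.

```python
import pandas as pd
import statsmodels.formula.api as smf

def reduce_hierarchically(df: pd.DataFrame, response: str, terms: list[str],
                          alpha: float = 0.05):
    """Backward elimination that preserves hierarchy. Terms use patsy
    syntax with numeric predictors, e.g. "a", "a:b", "a:b:c"."""
    terms = list(terms)
    while True:
        results = smf.ols(f"{response} ~ {' + '.join(terms)}", df).fit()
        factors = {t: set(t.split(":")) for t in terms}
        candidates = [
            (len(factors[t]), results.pvalues[t], t)
            for t in terms
            if results.pvalues[t] > alpha
            # hierarchy: skip terms contained in a remaining higher-order term
            and not any(factors[t] < factors[u] for u in terms)
        ]
        if not candidates:
            return results        # every removable term is significant
        candidates.sort(reverse=True)   # most complex first, then highest p
        terms.remove(candidates[0][2])
```

For a three-factor model, the call might look like reduce_hierarchically(df, "y", ["a", "b", "c", "a:b", "a:c", "b:c", "a:b:c"]), where the names are placeholders for your own columns.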
A materials engineer for a building products manufacturer is developing a new insulation product. The engineer designs a 2-level full factorial experiment to assess several factors that could affect the insulating value of the product. The engineer includes interactions in the model to determine whether the effects of the factors depend on each other. Because interactions create multicollinearity, the engineer codes the predictors to reduce the multicollinearity.
The highest p-value for the first model that the engineer examines is 0.985 for the interaction between injection temperature and material. Below the table of coded coefficients, the engineer can examine the regression equation in uncoded units. The regression equation helps the engineer to understand the size of the effects in the same units as the data.
If the engineer uses only the p-value criterion to reduce the model, then the next model is non-hierarchical, because the removed two-factor interaction is part of a three-factor interaction that remains in the model. Because the model is non-hierarchical, the uncoded coefficients do not exist. Thus, the regression equation for the non-hierarchical model is in coded units. The coded regression equation does not provide any information about the size of the effects in the same units as the data.
Instead of using only the p-value criterion, the engineer decides to remove the most complex terms that have high p-values first. In this model, rather than removing the term that has the highest p-value overall, the engineer removes the 3-way interaction that has the highest p-value among the 3-way interactions: 0.466, for the interaction between injection pressure, injection temperature, and material.