Model reduction lets you simplify a regression model by eliminating insignificant terms. Reducing the number of terms can make the model easier to work with. If left in the model, insignificant terms can reduce the precision of the predictions.

A model can be reduced manually. For many models, you can determine which terms are statistically significant by examining the p-value of each coefficient.

Technicians measure the total heat flux as part of a solar thermal energy test. An energy engineer wants to determine how total heat flux is predicted by other variables: insolation, the position of the focal points in the east, south, and north directions, and the time of day. Using the full regression model, they determine the following relationship between heat flux and the variables.

Regression Equation
Heat Flux = 325.4 + 2.55 East + 3.80 South - 22.95 North + 0.0675 Insolation + 2.42 Time of Day

Coefficients
Term            Coef  SE Coef  T-Value  P-Value   VIF
Constant       325.4     96.1     3.39    0.003
East            2.55     1.25     2.04    0.053  1.36
South           3.80     1.46     2.60    0.016  3.18
North         -22.95     2.70    -8.49    0.000  2.61
Insolation    0.0675   0.0290     2.33    0.029  2.32
Time of Day     2.42     1.81     1.34    0.194  5.37
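
Each T-Value in the table is the coefficient divided by its standard error. The short Python sketch below recomputes that ratio from the displayed values as a sanity check. The dictionary name `full_model` and the helper `t_value` are illustrative and not part of any statistics package. Because the table shows rounded numbers, a recomputed ratio can differ from the reported T-Value by about 0.01 (North, for example).

```python
# Values copied from the full-model coefficients table:
# term -> (Coef, SE Coef, reported T-Value)
full_model = {
    "Constant":    (325.4,  96.1,    3.39),
    "East":        (2.55,   1.25,    2.04),
    "South":       (3.80,   1.46,    2.60),
    "North":       (-22.95, 2.70,   -8.49),
    "Insolation":  (0.0675, 0.0290,  2.33),
    "Time of Day": (2.42,   1.81,    1.34),
}


def t_value(coef, se_coef):
    """t-statistic for a coefficient: the estimate divided by its standard error."""
    return coef / se_coef


for term, (coef, se_coef, reported) in full_model.items():
    recomputed = t_value(coef, se_coef)
    # Inputs are rounded for display, so small differences are expected.
    print(f"{term:<12} recomputed t = {recomputed:7.2f}   reported t = {reported:7.2f}")
```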

Time of Day has the highest p-value (0.194), and that value is greater than the significance level of 0.05, so they reduce the model by removing this term first. They then refit the regression and repeat the process, removing the least significant term each time, until only statistically significant terms remain. The final reduced model is:
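
The removal rule described above can be sketched in Python as follows. This is a minimal illustration and not the software's own procedure: the function names are hypothetical, the intercept is assumed to stay in the model (so Constant is left out of the candidates), and `refit` stands for whatever routine you use to refit the regression and return the p-value of each remaining term.

```python
ALPHA = 0.05  # significance level used in the example


def next_term_to_remove(p_values, alpha=ALPHA):
    """Return the term with the largest p-value above alpha, or None if none exceeds it."""
    candidates = {term: p for term, p in p_values.items() if p > alpha}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


def backward_eliminate(terms, refit, alpha=ALPHA):
    """Drop one insignificant term at a time, refitting after each removal.

    `refit` is a caller-supplied (hypothetical) function that takes a list of
    terms and returns a dict mapping each term to its p-value.
    """
    terms = list(terms)
    while terms:
        worst = next_term_to_remove(refit(terms), alpha)
        if worst is None:
            break  # every remaining term is significant at alpha
        terms.remove(worst)
    return terms


# P-values copied from the full and reduced coefficient tables (Constant omitted).
full_p = {"East": 0.053, "South": 0.016, "North": 0.000,
          "Insolation": 0.029, "Time of Day": 0.194}
reduced_p = {"South": 0.000, "North": 0.000}

print(next_term_to_remove(full_p))     # term to drop from the full model
print(next_term_to_remove(reduced_p))  # None means the reduction is finished
```

Refitting after every removal matters because the p-values of the remaining terms change when a term leaves the model, so the full-model p-values alone do not determine the final set of terms.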

Regression Equation
Heat Flux = 483.7 + 4.796 South - 24.22 North

Coefficients
Term        Coef  SE Coef  T-Value  P-Value   VIF
Constant   483.7     39.6    12.22    0.000
South      4.796    0.951     5.04    0.000  1.09
North     -24.22     1.94   -12.48    0.000  1.09
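
With only two predictors left, the reduced equation is easy to apply directly. The sketch below wraps it in a small Python function. The function name `predict_heat_flux` and the input values are hypothetical and chosen only for illustration. They are not taken from the engineer's data, and the units are whatever the original measurements used.

```python
def predict_heat_flux(south, north):
    """Evaluate the reduced regression equation:
    Heat Flux = 483.7 + 4.796 South - 24.22 North
    """
    return 483.7 + 4.796 * south - 24.22 * north


# Hypothetical focal-point positions, for illustration only.
example = predict_heat_flux(south=35.0, north=17.0)
print(f"Predicted heat flux: {example:.2f}")
```

Predictions are only trustworthy for South and North values within the range covered by the data used to fit the model.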

Although statistical techniques can be helpful for reducing the model, you should also use process knowledge and good judgment to decide which terms to eliminate. Some terms might be essential, while others might be too costly or too difficult to measure. In the previous example, the engineer wants to eliminate as many insignificant terms as they safely can, so that they can estimate total heat flux with as few variables as possible.