Methods and formulas for Best Subsets Regression

Computational routine

In best subsets regression, Minitab uses a procedure called the Hamiltonian Walk, a method that visits every possible subset of predictors, one subset per step. That is, Minitab evaluates all 2^m - 1 nonempty subsets in 2^m - 1 steps, where m is the number of predictors in the model, and fits a different subset regression at each step.

Each subset in the Hamiltonian Walk differs from the preceding subset by the addition or deletion of only one variable. The sweep operator "sweeps" a variable into or out of the regression at each step of the Hamiltonian Walk, and calculates the R² for each subset.
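
The walk described above can be sketched in a few lines of Python. This is an illustrative sketch only, not Minitab's implementation: the function names (corrected_sscp, sweep, walk_all_subsets) are hypothetical, the subset order is assumed to follow a binary Gray code (one way to get a walk in which consecutive subsets differ by a single variable), and the sweep is written in one common form that is its own inverse. The data are assumed to have no exactly collinear predictors, so no pivot is zero.

```python
def corrected_sscp(columns):
    """Cross-product matrix of mean-centered columns (equal-length lists)."""
    n = len(columns[0])
    centered = []
    for col in columns:
        mean = sum(col) / n
        centered.append([v - mean for v in col])
    return [[sum(p * q for p, q in zip(ci, cj)) for cj in centered]
            for ci in centered]


def sweep(a, k):
    """Sweep the square matrix `a` in place on pivot k.

    In this form, sweeping twice on the same pivot restores the matrix,
    so the same function moves a variable into or out of the model.
    """
    d = a[k][k]
    size = len(a)
    row_k = [v / d for v in a[k]]
    for i in range(size):
        if i == k:
            continue
        b = a[i][k]
        a[i] = [a[i][j] - b * row_k[j] for j in range(size)]
        a[i][k] = -b / d
    a[k] = row_k
    a[k][k] = 1.0 / d


def walk_all_subsets(predictors, y):
    """Return {subset of predictor indices: R-squared} for all nonempty subsets."""
    m = len(predictors)
    a = corrected_sscp(list(predictors) + [y])  # response is the last row/column
    ss_total = a[m][m]
    in_model = [False] * m
    r2_by_subset = {}
    for step in range(1, 2 ** m):
        # Gray-code rule: toggle the variable at the lowest set bit of `step`.
        k = (step & -step).bit_length() - 1
        sweep(a, k)
        in_model[k] = not in_model[k]
        subset = tuple(i for i in range(m) if in_model[i])
        # After the sweep, the response diagonal holds SS Error for this subset.
        r2_by_subset[subset] = 1.0 - a[m][m] / ss_total
    return r2_by_subset
```

Each step costs one sweep of an (m + 1) by (m + 1) matrix, which is why changing one variable at a time is cheaper than refitting every subset from scratch.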

Regression equation

For a model with multiple predictors, the equation is:

y = β0 + β1x1 + … + βkxk + ε

The fitted equation is:

ŷ = b0 + b1x1 + … + bkxk

In simple linear regression, which includes only one predictor, the model is:

y = β0 + β1x1 + ε

Using the regression estimates b0 for β0 and b1 for β1, the fitted equation is:

ŷ = b0 + b1x1

Notation

Term    Description
y       response
xk      kth term. Each term can be a single predictor, a polynomial term, or an interaction term.
βk      kth population regression coefficient
ε       error term that follows a normal distribution with a mean of 0
bk      estimate of kth population regression coefficient
ŷ       fitted response

R-sq

R² is also known as the coefficient of determination.

Formula

R² = 1 - Σ(yi - ŷi)² / Σ(yi - ȳ)²

Notation

Term    Description
yi      ith observed response value
ȳ       mean response
ŷi      ith fitted response
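
As a small illustration of the R² calculation, the following Python sketch computes one minus the ratio of the error sum of squares, Σ(yi - ŷi)², to the total sum of squares, Σ(yi - ȳ)². The function name r_squared is hypothetical, and the fitted values are assumed to come from a model that has already been fit.

```python
def r_squared(y, fitted):
    """R-squared from observed responses and fitted responses."""
    mean_y = sum(y) / len(y)
    ss_error = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_error / ss_total
```

A perfect fit gives 1, and a model that predicts the mean response for every observation gives 0.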

R-sq (adj)

R²(adj) = 1 - MS Error / (SS Total / DF Total)

Notation

Term    Description
MS      Mean Square
SS      Sum of Squares
DF      Degrees of Freedom
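
A minimal Python sketch of the adjusted R² calculation, written as one minus MS Error divided by (SS Total / DF Total). The function name is hypothetical, and the degrees of freedom are an assumption that holds for a model with a constant: DF Error = n - p and DF Total = n - 1, where p counts the constant.

```python
def adjusted_r_squared(ss_error, ss_total, n, p):
    """Adjusted R-squared; p is the number of terms including the constant."""
    ms_error = ss_error / (n - p)     # assumed DF Error = n - p
    ms_total = ss_total / (n - 1)     # assumed DF Total = n - 1
    return 1.0 - ms_error / ms_total
```

Unlike R², this value can decrease when a term is added, because the lost error degree of freedom can outweigh the reduction in SS Error.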

PRESS

PRESS, the prediction sum of squares, assesses your model's predictive ability and is calculated as:

PRESS = Σ [ei / (1 - hi)]², summed over i = 1, …, n

Notation

Term    Description
n       number of observations
ei      ith residual
hi      ith diagonal element of X(X'X)⁻¹X'
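
The PRESS calculation, the sum of [ei / (1 - hi)]² over all observations, can be illustrated for the special case of simple linear regression, where the leverage has the closed form hi = 1/n + (xi - x̄)² / Σ(xj - x̄)². The sketch below is an assumption-laden example, not Minitab's code: the function names are hypothetical and only one predictor is handled. The second function refits the line with each observation left out, which shows what the leverage shortcut is equivalent to.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for one predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1


def press_from_leverage(x, y):
    """PRESS from one fit, using residuals e_i and leverages h_i."""
    n = len(x)
    b0, b1 = fit_line(x, y)
    mean_x = sum(x) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    total = 0.0
    for xi, yi in zip(x, y):
        e = yi - (b0 + b1 * xi)
        h = 1.0 / n + (xi - mean_x) ** 2 / sxx
        total += (e / (1.0 - h)) ** 2
    return total


def press_leave_one_out(x, y):
    """PRESS by refitting n times, each time omitting one observation."""
    total = 0.0
    for i in range(len(x)):
        b0, b1 = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        total += (y[i] - (b0 + b1 * x[i])) ** 2
    return total
```

Both functions return the same value up to rounding, so PRESS requires only a single fit of the model.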

R-sq (pred)

R²(pred) = 1 - Σ [ei / (1 - hi)]² / Σ(yi - ȳ)²

While the calculations for R²(pred) can produce negative values, Minitab displays zero for these cases.

Notation

Term    Description
yi      ith observed response value
ȳ       mean response
n       number of observations
ei      ith residual
hi      ith diagonal element of X(X'X)⁻¹X'
X       design matrix
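
A short Python sketch of the predicted R² calculation, taken here as one minus PRESS divided by the total sum of squares Σ(yi - ȳ)², including the rule that negative results are reported as zero. The function name is hypothetical, and the PRESS value is assumed to have been computed already.

```python
def predicted_r_squared(press, y):
    """Predicted R-squared, with negative values reported as zero."""
    mean_y = sum(y) / len(y)
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    value = 1.0 - press / ss_total
    return max(0.0, value)
```

For example, with responses 1, 2, 3, 4, 5 the total sum of squares is 10, so a PRESS of 2 gives 0.8, while a PRESS of 20 would give -1 and is reported as 0.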

Mallows' Cp

Cp = SSEp / MSEm - (n - 2p)

Notation

Term    Description
SSEp    sum of squared errors for the model under consideration
MSEm    mean square error for the model with all predictors
n       number of observations
p       number of terms in the model, including the constant
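
The Mallows' Cp calculation, SSEp / MSEm - (n - 2p), can be written as a one-line Python function. The name mallows_cp is hypothetical; the inputs follow the notation table above.

```python
def mallows_cp(sse_p, mse_m, n, p):
    """Mallows' Cp for a candidate model with p terms (constant included)."""
    return sse_p / mse_m - (n - 2 * p)
```

A useful check: for the model with all predictors, SSEp equals MSEm times (n - p), so Cp reduces to exactly p. Candidate models whose Cp is close to p show little evidence of bias from omitted predictors.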

S

S = √MSE

Notation

Term    Description
MSE     mean square error