Methods and formulas for Best Subsets Regression

In This Topic

Computational routine
Regression equation
R-sq
R-sq (adj)
PRESS
R-sq (pred)
Mallows' Cp
S
Log-likelihood
AICc (Akaike's Corrected Information Criterion)
BIC (Bayesian Information Criterion)
Condition number

Computational routine

In best subsets regression, Minitab uses a procedure called the Hamiltonian Walk, which is a method for calculating all possible subsets of predictors, one subset per step. That is, Minitab calculates all 2^m − 1 subsets in 2^m − 1 steps, where m is the number of predictors in the model. Minitab evaluates a different subset regression at each step.

Each subset in the Hamiltonian Walk differs from the preceding subset by the addition or deletion of only one variable. The sweep operator "sweeps" a variable in or out of the regression on each step of the Hamiltonian Walk, and calculates the R² for each subset.
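The walk and the sweep step can be sketched in a few lines of Python. This is a minimal illustration, not Minitab's implementation: it assumes a binary-reflected Gray-code ordering as one way to change exactly one variable per step, and the function names (centered_crossproducts, sweep, best_subsets_r2) are hypothetical.

```python
def centered_crossproducts(columns):
    """Cross-product matrix of mean-centered columns (list of lists)."""
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    centered = [[v - mu for v in c] for c, mu in zip(columns, means)]
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in centered]
            for ci in centered]


def sweep(a, k, inverse=False):
    """Sweep pivot k into the matrix (or back out when inverse=True)."""
    d = a[k][k]
    size = len(a)
    out = [[a[i][j] - a[i][k] * a[k][j] / d for j in range(size)]
           for i in range(size)]
    sign = -1.0 if inverse else 1.0
    for i in range(size):
        out[i][k] = out[k][i] = sign * a[i][k] / d
    out[k][k] = -1.0 / d
    return out


def best_subsets_r2(predictors, y):
    """Return {subset of predictor indices: R-squared} for all 2^m - 1 subsets."""
    m = len(predictors)
    a = centered_crossproducts(predictors + [y])
    ss_total = a[m][m]
    in_model = [False] * m
    results = {}
    for step in range(1, 2 ** m):
        # Gray-code rule: flip the variable at the lowest set bit of step.
        k = (step & -step).bit_length() - 1
        a = sweep(a, k, inverse=in_model[k])
        in_model[k] = not in_model[k]
        subset = tuple(i for i in range(m) if in_model[i])
        # After sweeping, the (y, y) entry holds the error sum of squares.
        results[subset] = 1.0 - a[m][m] / ss_total
    return results


if __name__ == "__main__":
    x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
    x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
    y = [2.9, 5.2, 6.8, 9.1, 11.0]
    for subset, r2 in best_subsets_r2([x1, x2], y).items():
        print(subset, round(r2, 4))
```

Centering the columns first accounts for the constant term, so each sweep costs one matrix update rather than a full refit.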

Regression equation

For a model with multiple predictors, the equation is:

y = β₀ + β₁x₁ + … + β_kx_k + ε

The fitted equation is:

ŷ = b₀ + b₁x₁ + … + b_kx_k

In simple linear regression, which includes only one predictor, the model is:

y = β₀ + β₁x₁ + ε

Using regression estimates b₀ for β₀, and b₁ for β₁, the fitted equation is:

ŷ = b₀ + b₁x₁

Notation

Term	Description
y	response
x_k	k^th term. Each term can be a single predictor, a polynomial term, or an interaction term.
β_k	k^th population regression coefficient
ε	error term that follows a normal distribution with a mean of 0
b_k	estimate of k^th population regression coefficient
ŷ	fitted response
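For the simple linear case, the estimates b₀ and b₁ can be computed with the usual least-squares closed form. The sketch below assumes ordinary (unweighted) least squares; the function names fit_simple_linear and predict are hypothetical and chosen only for illustration.

```python
def fit_simple_linear(x, y):
    """Least-squares estimates (b0, b1) for y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx             # slope estimate
    b0 = y_bar - b1 * x_bar    # intercept estimate
    return b0, b1


def predict(b0, b1, x):
    """Fitted responses from the fitted equation."""
    return [b0 + b1 * xi for xi in x]


if __name__ == "__main__":
    b0, b1 = fit_simple_linear([1.0, 2.0, 3.0, 4.0], [3.1, 4.9, 7.2, 8.8])
    print(b0, b1, predict(b0, b1, [5.0]))
```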

R-sq

R² is also known as the coefficient of determination.

Formula

R² = 1 − Σ(y_i − ŷ_i)² / Σ(y_i − ȳ)²

Notation

Term	Description
y_i	i^th observed response value
ȳ	mean response
ŷ_i	i^th fitted response
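As a quick check of the quantities in the notation table, R² can be computed directly from observed and fitted responses. This sketch assumes the form R² = 1 − (error sum of squares) / (total sum of squares); the function name r_squared is hypothetical.

```python
def r_squared(observed, fitted):
    """R-sq as 1 - SS Error / SS Total."""
    y_bar = sum(observed) / len(observed)
    ss_error = sum((y - f) ** 2 for y, f in zip(observed, fitted))
    ss_total = sum((y - y_bar) ** 2 for y in observed)
    return 1.0 - ss_error / ss_total


if __name__ == "__main__":
    y = [2.0, 4.0, 6.0, 8.0]
    y_hat = [2.5, 3.5, 6.5, 7.5]
    print(r_squared(y, y_hat))
```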

R-sq (adj)

R²(adj) = 1 − MS Error / MS Total = 1 − (SS Error / DF Error) / (SS Total / DF Total)

Notation

Term	Description
MS	Mean Square
SS	Sum of Squares
DF	Degrees of Freedom
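The adjustment can be illustrated with a small function. It assumes R²(adj) = 1 − MS Error / MS Total, with DF Error = n − p and DF Total = n − 1, where p counts the coefficients including the constant; the function name and argument names are hypothetical.

```python
def r_squared_adj(ss_error, ss_total, n, p):
    """Adjusted R-sq from sums of squares.

    Assumes p includes the constant, so DF Error = n - p
    and DF Total = n - 1.
    """
    ms_error = ss_error / (n - p)
    ms_total = ss_total / (n - 1)
    return 1.0 - ms_error / ms_total


if __name__ == "__main__":
    print(r_squared_adj(ss_error=10.0, ss_total=100.0, n=11, p=2))
```

Unlike R², this value can decrease when a term is added, because the lost error degree of freedom may outweigh the reduction in SS Error.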

PRESS

PRESS, the prediction sum of squares, assesses your model's predictive ability and is calculated as:

PRESS = Σ (e_i / (1 − h_i))², summed over i = 1, …, n

Notation

Term	Description
n	number of observations
e_i	i^th residual
h_i	i^th diagonal element of X(X'X)^-1X'
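The leverage-based calculation can be sketched for the one-predictor case, where the diagonal of X(X'X)^-1X' has the closed form h_i = 1/n + (x_i − x̄)² / Σ(x_j − x̄)². This is an illustrative sketch that assumes PRESS is the sum of (e_i / (1 − h_i))²; the function name press_simple_linear is hypothetical.

```python
def press_simple_linear(x, y):
    """PRESS for y = b0 + b1 * x, using residuals and leverages."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    total = 0.0
    for xi, yi in zip(x, y):
        e_i = yi - (b0 + b1 * xi)                  # ordinary residual
        h_i = 1.0 / n + (xi - x_bar) ** 2 / sxx    # leverage of observation i
        total += (e_i / (1.0 - h_i)) ** 2          # deleted residual, squared
    return total


if __name__ == "__main__":
    print(press_simple_linear([1.0, 2.0, 3.0, 4.0, 5.0],
                              [2.1, 3.9, 6.2, 7.8, 10.1]))
```

Each term e_i / (1 − h_i) equals the prediction error for observation i when the model is refit without that observation, so no refitting is needed.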


R-sq (pred)

R²(pred) = 1 − PRESS / Σ(y_i − ȳ)²

While the calculations for R²(pred) can produce negative values, Minitab displays zero for these cases.

Notation

Term	Description
y_i	i^th observed response value
ȳ	mean response
n	number of observations
e_i	i^th residual
h_i	i^th diagonal element of X(X'X)^-1X'
X	design matrix
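The zero floor described above is easy to show in code. The sketch assumes R²(pred) = 1 − PRESS / Σ(y_i − ȳ)² and takes an already computed PRESS value as input; the function name r_squared_pred is hypothetical.

```python
def r_squared_pred(press, observed):
    """Predicted R-sq from PRESS, reported as zero when negative."""
    y_bar = sum(observed) / len(observed)
    ss_total = sum((y - y_bar) ** 2 for y in observed)
    raw = 1.0 - press / ss_total
    return max(raw, 0.0)   # negative values are displayed as zero


if __name__ == "__main__":
    y = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(r_squared_pred(5.0, y), r_squared_pred(20.0, y))
```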

Mallows' Cp

Cp = SSE_p / MSE_m − (n − 2p)

Notation

Term	Description
SSE_p	sum of squared errors for the model under consideration
MSE_m	mean square error for the model with all candidate terms
n	number of observations
p	number of terms in the model, including the constant
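A short sketch of the statistic, assuming the standard form Cp = SSE_p / MSE_m − (n − 2p). The function and argument names are hypothetical.

```python
def mallows_cp(sse_p, mse_m, n, p):
    """Mallows' Cp for a candidate model with p terms (constant included).

    sse_p: error sum of squares of the candidate model
    mse_m: mean square error of the model with all candidate terms
    """
    return sse_p / mse_m - (n - 2 * p)


if __name__ == "__main__":
    # Full model with 4 terms fitted to 20 observations, SSE = 32.
    n, p_full, sse_full = 20, 4, 32.0
    mse_full = sse_full / (n - p_full)
    print(mallows_cp(sse_full, mse_full, n, p_full))
```

For the model that contains all candidate terms, Cp equals p by construction, which is why subsets with Cp close to p are usually read as having little bias.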

S

S = √MSE

Notation

Term	Description
MSE	mean square error

Log-likelihood

For unweighted analyses, Minitab uses the following equation:

ln(L) = −(n/2) [ln(2π) + ln(R/n) + 1]

For an analysis that has weights for the observations, Minitab uses the following equation:

ln(L) = −(n/2) [ln(2π) + ln(R/n) + 1] + (1/2) Σ ln(w_i)

Observations with weights of 0 are not in the analysis.

Notation

Term	Description
n	the number of observations
R	the sum of squares for error for the model
w_i	the weight of the i^th observation
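The calculation can be sketched as follows. The code assumes the standard maximized Gaussian log-likelihood, −(n/2)[ln(2π) + ln(R/n) + 1], plus (1/2) Σ ln(w_i) when weights are supplied; that form is an assumption of this sketch, and the function name log_likelihood is hypothetical.

```python
import math


def log_likelihood(sse, n, weights=None):
    """Maximized Gaussian log-likelihood of a least-squares fit.

    sse: error sum of squares R (weighted, if weights are used)
    n: number of observations in the analysis (zero weights excluded)
    weights: optional observation weights
    """
    value = -0.5 * n * (math.log(2.0 * math.pi) + math.log(sse / n) + 1.0)
    if weights is not None:
        # Observations with weight 0 are not in the analysis, so skip them.
        value += 0.5 * sum(math.log(w) for w in weights if w > 0)
    return value


if __name__ == "__main__":
    print(log_likelihood(sse=10.0, n=10))
    print(log_likelihood(sse=10.0, n=10, weights=[2.0] * 10))
```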

AICc (Akaike's Corrected Information Criterion)

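AICc = −2 ln(L) + 2p + 2p(p + 1) / (n − p − 1), where ln(L) is the log-likelihood of the model.

AICc is not calculated when n − p − 1 ≤ 0.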

Notation

Term	Description
n	the number of observations
p	the number of coefficients in the model, including the constant
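A minimal sketch of the criterion, assuming the common small-sample form AICc = −2 ln(L) + 2p + 2p(p + 1) / (n − p − 1) and treating n − p − 1 ≤ 0 as the case where the value is not calculated. Both the form and the function name aicc are assumptions made for illustration.

```python
def aicc(log_lik, n, p):
    """Corrected AIC, or None when the correction term is undefined.

    log_lik: log-likelihood of the model
    n: number of observations
    p: number of coefficients, including the constant
    """
    if n - p - 1 <= 0:
        return None   # correction term would divide by zero or go negative
    return -2.0 * log_lik + 2.0 * p + 2.0 * p * (p + 1) / (n - p - 1)


if __name__ == "__main__":
    print(aicc(log_lik=-10.0, n=20, p=3))
    print(aicc(log_lik=-10.0, n=4, p=3))
```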

BIC (Bayesian Information Criterion)

BIC = −2 ln(L) + p ln(n), where ln(L) is the log-likelihood of the model.

Notation

Term	Description
p	the number of coefficients in the model, including the constant
n	the number of observations
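A short sketch, assuming the usual form BIC = −2 ln(L) + p ln(n). The function name bic is hypothetical.

```python
import math


def bic(log_lik, n, p):
    """Bayesian Information Criterion from a log-likelihood.

    p counts the coefficients in the model, including the constant.
    """
    return -2.0 * log_lik + p * math.log(n)


if __name__ == "__main__":
    # Same fit quality, one extra coefficient: the larger model is penalized.
    print(bic(-10.0, 20, 3), bic(-10.0, 20, 4))
```

Because the penalty per coefficient is ln(n) rather than 2, BIC tends to favor smaller models than AIC-type criteria once n exceeds about 7.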

Condition number

C = λ_maximum / λ_minimum

Notation

Term	Description
C	the condition number
λ_maximum	the maximum eigenvalue from the correlation matrix of the terms in the model, not including the intercept
λ_minimum	the minimum eigenvalue from the correlation matrix of the terms in the model, not including the intercept
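For a model with exactly two terms, the correlation matrix is [[1, r], [r, 1]], whose eigenvalues are 1 + |r| and 1 − |r|, so the ratio can be computed without an eigenvalue solver. The sketch below covers only that special case and assumes C is the ratio of the maximum to the minimum eigenvalue; the function name is hypothetical.

```python
import math


def condition_number_two_terms(x1, x2):
    """Condition number of the 2 x 2 correlation matrix of two terms."""
    n = len(x1)
    m1 = sum(x1) / n
    m2 = sum(x2) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    r = s12 / math.sqrt(s11 * s22)     # Pearson correlation of the two terms
    lam_max = 1.0 + abs(r)             # eigenvalues of [[1, r], [r, 1]]
    lam_min = 1.0 - abs(r)
    return lam_max / lam_min           # undefined when the terms are perfectly correlated


if __name__ == "__main__":
    print(condition_number_two_terms([1.0, 2.0, 3.0, 4.0, 5.0],
                                     [2.0, 1.0, 4.0, 3.0, 6.0]))
```

Uncorrelated terms give C = 1, and C grows without bound as |r| approaches 1. Models with more than two terms need a general symmetric eigenvalue routine instead of this closed form.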