In best subsets regression, Minitab uses a procedure called the Hamiltonian Walk, which calculates all possible subsets of predictors, one subset per step. That is, Minitab calculates all 2^{m} − 1 subsets in 2^{m} − 1 steps, where m is the number of predictors in the model, and evaluates a different subset regression at each step.
Each subset in the Hamiltonian Walk differs from the preceding subset by the addition or deletion of only one variable. The sweep operator "sweeps" a variable in or out of the regression on each step of the Hamiltonian Walk, and calculates the R^{2} for each subset.
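The walk and the sweep step can be sketched in a few lines. This is a minimal illustration, not Minitab's implementation. It assumes a Gray code ordering for the walk and a self-inverse form of the sweep operator applied to the corrected sums of squares and cross-products matrix. The function names `centered_sscp`, `sweep`, and `hamiltonian_walk_r2` are hypothetical.

```python
def centered_sscp(columns):
    """Corrected sums of squares and cross-products for a list of columns."""
    n = len(columns[0])
    centered = []
    for col in columns:
        mean = sum(col) / n
        centered.append([v - mean for v in col])
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in centered]
            for ci in centered]


def sweep(a, k):
    """Sweep matrix a in place on pivot k.

    Assumed variant: sweeping twice on the same pivot restores the matrix,
    so one function moves a variable into or out of the regression.
    """
    d = a[k][k]
    a[k] = [v / d for v in a[k]]
    for i in range(len(a)):
        if i != k:
            b = a[i][k]
            a[i] = [v - b * w for v, w in zip(a[i], a[k])]
            a[i][k] = -b / d
    a[k][k] = 1.0 / d


def hamiltonian_walk_r2(predictors, response):
    """Visit all 2^m - 1 non-empty subsets, changing one variable per step."""
    m = len(predictors)
    a = centered_sscp(list(predictors) + [response])
    ss_total = a[m][m]              # total sum of squares of the response
    subset = set()
    results = []
    for step in range(1, 2 ** m):
        # Gray code: the variable to toggle is the lowest set bit of step.
        k = (step & -step).bit_length() - 1
        sweep(a, k)
        subset ^= {k}
        # After sweeping, the response corner holds the error sum of squares.
        results.append((sorted(subset), 1.0 - a[m][m] / ss_total))
    return results


# Hypothetical data: two predictors and a response.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
for subset, r2 in hamiltonian_walk_r2([x1, x2], y):
    print(subset, round(r2, 4))
```

With two predictors, the walk visits the subsets {0}, {0, 1}, and {1} in that order, so each step adds or removes exactly one variable.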
For a model with multiple predictors, the equation is:
y = β_{0} + β_{1}x_{1} + … + β_{k}x_{k} + ε
The fitted equation is:
ŷ = b_{0} + b_{1}x_{1} + … + b_{k}x_{k}
In simple linear regression, which includes only one predictor, the model is:
y = β_{0} + β_{1}x_{1} + ε
Using regression estimates b_{0} for β_{0}, and b_{1} for β_{1}, the fitted equation is:
ŷ = b_{0} + b_{1}x_{1}
Term | Description |
---|---|
y | response |
x_{k} | k^{th} term. Each term can be a single predictor, a polynomial term, or an interaction term. |
β_{k} | k^{th} population regression coefficient |
ε | error term that follows a normal distribution with a mean of 0 |
b_{k} | estimate of k^{th} population regression coefficient |
ŷ | fitted response |
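For the simple linear regression case, the estimates b_{0} and b_{1} can be computed directly. The sketch below assumes ordinary least squares estimates. The function name `fit_simple` and the data are hypothetical.

```python
def fit_simple(x, y):
    """Least squares estimates (b0, b1) for y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx               # slope estimate
    b0 = y_bar - b1 * x_bar      # intercept estimate
    return b0, b1


# Hypothetical data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1 = fit_simple(x, y)
fitted = [b0 + b1 * xi for xi in x]   # fitted responses
print(b0, b1)
```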
R^{2} is also known as the coefficient of determination. It is calculated as:
R^{2} = 1 − Σ(y_{i} − ŷ_{i})^{2} / Σ(y_{i} − ȳ)^{2}
Term | Description |
---|---|
y_{i} | i^{th} observed response value |
ȳ | mean response |
ŷ_{i} | i^{th} fitted response |
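As a small illustration, R^{2} can be computed from the observed and fitted responses. The sketch assumes the usual definition, 1 minus the error sum of squares divided by the total sum of squares. The function name `r_squared` is hypothetical.

```python
def r_squared(y, fitted):
    """R-squared from observed responses y and fitted responses."""
    y_bar = sum(y) / len(y)
    ss_error = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_error / ss_total


# A perfect fit explains all of the variation; fitting only the mean explains none.
print(r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
print(r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]))
```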
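Adjusted R^{2} adjusts R^{2} for the number of terms in the model relative to the number of observations:
R^{2}(adj) = 1 − MS(Error) / MS(Total) = 1 − [SS(Error) / DF(Error)] / [SS(Total) / DF(Total)]
Term | Description |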
---|---|
MS | Mean Square |
SS | Sum of Squares |
DF | Degrees of Freedom |
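The relationship among MS, SS, and DF can be sketched as follows. The sketch assumes a model with a constant, so that DF for error is n − p and DF for total is n − 1, where p counts the coefficients including the constant. The function name `adjusted_r_squared` is hypothetical.

```python
def adjusted_r_squared(ss_error, ss_total, n, p):
    """Adjusted R-squared as 1 - MS(Error) / MS(Total)."""
    ms_error = ss_error / (n - p)    # SS(Error) / DF(Error)
    ms_total = ss_total / (n - 1)    # SS(Total) / DF(Total)
    return 1.0 - ms_error / ms_total


# Hypothetical values: 11 observations, 3 coefficients.
print(adjusted_r_squared(ss_error=2.0, ss_total=10.0, n=11, p=3))
```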
The prediction sum of squares (PRESS) is:
PRESS = Σ_{i=1}^{n} (e_{i} / (1 − h_{i}))^{2}
Term | Description |
---|---|
n | number of observations |
e_{i} | i^{th} residual |
h_{i} | i^{th} diagonal element of X(X'X)^{-1}X' |
R^{2}(pred) = 1 − PRESS / Σ(y_{i} − ȳ)^{2}, where PRESS = Σ(e_{i} / (1 − h_{i}))^{2}
Although the calculation of R^{2}(pred) can produce negative values, Minitab displays zero in these cases.
Term | Description |
---|---|
y_{i} | i^{th} observed response value |
ȳ | mean response |
n | number of observations |
e_{i} | i^{th} residual |
h_{i} | i^{th} diagonal element of X(X'X)^{-1}X' |
X | design matrix |
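For a model with one predictor and a constant, the diagonal elements h_{i} have a closed form, which makes PRESS and R^{2}(pred) easy to illustrate. The sketch below assumes simple linear regression, where h_{i} = 1/n + (x_{i} − x̄)^{2} / Σ(x_{j} − x̄)^{2}, and reports zero when the computed R^{2}(pred) is negative. The function name `press_and_r2_pred` is hypothetical.

```python
def press_and_r2_pred(x, y):
    """PRESS and predicted R-squared for a simple linear regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    press = 0.0
    for xi, yi in zip(x, y):
        e_i = yi - (b0 + b1 * xi)                  # i-th residual
        h_i = 1.0 / n + (xi - x_bar) ** 2 / sxx    # i-th leverage
        press += (e_i / (1.0 - h_i)) ** 2
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    r2_pred = 1.0 - press / ss_total
    return press, max(0.0, r2_pred)                # negative values shown as zero


# Hypothetical data with no real trend, so the raw R-squared(pred) is negative.
print(press_and_r2_pred([1.0, 2.0, 3.0, 4.0], [1.0, -1.0, 1.0, -1.0]))
```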
Mallows' C_{p} is:
C_{p} = SSE_{p} / MSE_{m} − (n − 2p)
Term | Description |
---|---|
SSE_{p} | sum of squared errors for the model under consideration |
MSE_{m} | mean square error for the model with all candidate terms |
n | number of observations |
p | number of terms in the model, including the constant |
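The terms above combine into Mallows' C_{p} in one line. The sketch assumes the standard definition, SSE_{p} / MSE_{m} − (n − 2p). The function name `mallows_cp` and the numbers are hypothetical.

```python
def mallows_cp(sse_p, mse_m, n, p):
    """Mallows' Cp for a candidate model with p terms (constant included)."""
    return sse_p / mse_m - (n - 2 * p)


# For the model with all candidate terms, SSE_p = MSE_m * (n - p), so Cp equals p.
n, p = 12, 4
mse_m = 1.0
print(mallows_cp(sse_p=mse_m * (n - p), mse_m=mse_m, n=n, p=p))
```

A C_{p} value close to p is the usual sign that a candidate model has little bias relative to the full model.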
S is the square root of the mean square error:
S = √MSE
Term | Description |
---|---|
MSE | mean square error |
The log-likelihood is:
ln(L) = 0.5 [Σ ln(w_{i}) − n (ln(2π) + 1 − ln(n) + ln(R))]
Observations with weights of 0 are not in the analysis.
Term | Description |
---|---|
n | the number of observations |
R | the sum of squares for error for the model |
w_{i} | the weight of the i^{th} observation |
AICc = −2 ln(L) + 2p + 2p(p + 1) / (n − p − 1), where ln(L) is the log-likelihood.
AICc is not calculated when n − p − 1 ≤ 0.
Term | Description |
---|---|
n | the number of observations |
p | the number of coefficients in the model, including the constant |
BIC = −2 ln(L) + p ln(n), where ln(L) is the log-likelihood.
Term | Description |
---|---|
p | the number of coefficients in the model, including the constant |
n | the number of observations |
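The log-likelihood, AICc, and BIC can be sketched together. This is a minimal illustration that assumes the standard Gaussian log-likelihood for weighted least squares and the usual definitions of AICc and BIC. The guard that returns `None` when n − p − 1 is not positive is also an assumption, and the function names are hypothetical.

```python
import math


def log_likelihood(sse, weights):
    """Gaussian log-likelihood from the error sum of squares and the weights."""
    w = [wi for wi in weights if wi > 0]   # zero-weight observations are excluded
    n = len(w)
    return 0.5 * (sum(math.log(wi) for wi in w)
                  - n * (math.log(2 * math.pi) + 1 - math.log(n) + math.log(sse)))


def aicc(loglik, n, p):
    """Corrected AIC; None when the correction term is undefined (assumed rule)."""
    if n - p - 1 <= 0:
        return None
    return -2 * loglik + 2 * p + 2 * p * (p + 1) / (n - p - 1)


def bic(loglik, n, p):
    """Bayesian information criterion."""
    return -2 * loglik + p * math.log(n)


# Hypothetical fit: 10 equally weighted observations, 2 coefficients, SSE of 10.
ll = log_likelihood(sse=10.0, weights=[1.0] * 10)
print(ll, aicc(ll, n=10, p=2), bic(ll, n=10, p=2))
```

Smaller values of AICc and BIC indicate a better trade-off between fit and model size when comparing models fit to the same data.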
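The condition number is the ratio of the maximum eigenvalue to the minimum eigenvalue:
C = λ_{maximum} / λ_{minimum}
Term | Description |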
---|---|
C | the condition number |
λ_{maximum} | the maximum eigenvalue from the correlation matrix of the terms in the model, not including the intercept |
λ_{minimum} | the minimum eigenvalue from the correlation matrix of the terms in the model, not including the intercept |
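For a model with exactly two terms, the correlation matrix is [[1, r], [r, 1]], and its eigenvalues are 1 + |r| and 1 − |r|. The sketch below uses that special case and assumes the condition number is the ratio of the maximum eigenvalue to the minimum eigenvalue. The function names are hypothetical, and a general model would need an eigenvalue routine instead.

```python
import math


def correlation(u, v):
    """Pearson correlation between two columns."""
    n = len(u)
    u_bar = sum(u) / n
    v_bar = sum(v) / n
    suv = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    suu = sum((a - u_bar) ** 2 for a in u)
    svv = sum((b - v_bar) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def condition_number_two_terms(x1, x2):
    """Condition number for two terms, using the closed-form eigenvalues."""
    r = correlation(x1, x2)
    lambda_max = 1.0 + abs(r)
    lambda_min = 1.0 - abs(r)   # zero when the two terms are perfectly correlated
    return lambda_max / lambda_min


# Hypothetical predictors with a fairly strong correlation.
print(condition_number_two_terms([1.0, 2.0, 3.0, 4.0, 5.0],
                                 [2.0, 1.0, 4.0, 3.0, 6.0]))
```

Uncorrelated terms give a condition number of 1, and the value grows without bound as the terms approach perfect correlation.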