Select the method or formula of your choice.

The Mean Square of the error (also abbreviated as MS Error or MSE, and denoted as s^{2}) is the variance around the fitted regression line. The formula is:

MS Error = s^{2} = Σ(y_{i} − ŷ_{i})^{2} / (n − p − 1)

Term | Description |
---|---|
y_{i} | i^{th} observed response value |
ŷ_{i} | i^{th} fitted response |
n | number of observations |
p | number of coefficients in the model, not counting the constant |

The formula for the Mean Square (MS) of the regression is:

MS Regression = Σ(ŷ_{i} − ȳ)^{2} / p

Term | Description |
---|---|
ȳ | mean response |
ŷ_{i} | i^{th} fitted response |
p | number of terms in the model |

The formula for the total Mean Square (MS) is:

MS Total = Σ(y_{i} − ȳ)^{2} / (n − 1)

Term | Description |
---|---|
ȳ | mean response |
y_{i} | i^{th} observed response value |
n | number of observations |
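As a sanity check, the three mean squares can be computed directly from their definitions. This is a minimal sketch in Python; the data set and its least-squares line (b0 = 2.2, b1 = 0.6) are illustrative assumptions, not values from this page.

```python
# Illustrative data and its least-squares line (assumed for this sketch)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(y)
p = 1                                   # one term besides the constant
b0, b1 = 2.2, 0.6                       # least-squares coefficients for this data
fits = [b0 + b1 * xi for xi in x]       # fitted responses
ybar = sum(y) / n                       # mean response

ms_error = sum((yi - fi) ** 2 for yi, fi in zip(y, fits)) / (n - p - 1)
ms_regression = sum((fi - ybar) ** 2 for fi in fits) / p
ms_total = sum((yi - ybar) ** 2 for yi in y) / (n - 1)

print(round(ms_error, 6), round(ms_regression, 6), round(ms_total, 6))
# -> 0.8 3.6 1.5
```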

The sums of squares are sums of squared distances. SS Regression is the portion of the variation explained by the model. SS Error is the portion not explained by the model and is attributed to error. SS Total is the total variation in the data.

SS Regression:

Σ(ŷ_{i} − ȳ)^{2}

SS Error:

Σ(y_{i} − ŷ_{i})^{2}

SS Total:

Σ(y_{i} − ȳ)^{2}

Term | Description |
---|---|
y_{i} | i^{th} observed response value |
ŷ_{i} | i^{th} fitted response |
ȳ | mean response |
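The identity SS Total = SS Regression + SS Error can be verified numerically. A short sketch with an assumed illustrative data set and its least-squares line (b0 = 2.2, b1 = 0.6):

```python
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
ybar = sum(y) / len(y)
fits = [2.2 + 0.6 * xi for xi in x]     # fitted values (assumed least-squares line)

ss_regression = sum((fi - ybar) ** 2 for fi in fits)
ss_error = sum((yi - fi) ** 2 for yi, fi in zip(y, fits))
ss_total = sum((yi - ybar) ** 2 for yi in y)

# The decomposition holds for this fit: 3.6 + 2.4 = 6.0
print(round(ss_regression + ss_error, 6), round(ss_total, 6))
```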

The formula for the coefficient or slope (*b*_{1}) in simple linear regression is:

b_{1} = Σ(x_{i} − x̄)(y_{i} − ȳ) / Σ(x_{i} − x̄)^{2}

The formula for the intercept (*b*_{0}) is:

b_{0} = ȳ − b_{1}x̄

In matrix terms, the formula that calculates the vector of coefficients in multiple regression is:

**b** = (**X'X**)^{-1}**X'y**

Term | Description |
---|---|
y_{i} | i^{th} observed response value |
ȳ | mean response |
x_{i} | i^{th} predictor value |
x̄ | mean predictor |
X | design matrix |
y | response vector |
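For simple regression, the scalar slope/intercept formulas and the matrix form **b** = (**X'X**)^{-1}**X'y** give the same coefficients. A sketch with an assumed data set, writing the 2×2 inverse out by hand so no linear-algebra library is needed:

```python
x = [1, 2, 3, 4, 5]        # assumed predictor values
y = [2, 4, 5, 4, 5]        # assumed responses
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

# Scalar formulas for slope and intercept
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) \
    / sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar

# Matrix form: with a constant column, X'X = [[n, Sx], [Sx, Sxx]] and
# X'y = [Sy, Sxy]; the 2x2 inverse is applied directly.
Sx, Sxx = sum(x), sum(xi * xi for xi in x)
Sy, Sxy = sum(y), sum(xi * yi for xi, yi in zip(x, y))
det = n * Sxx - Sx * Sx
mb0 = (Sxx * Sy - Sx * Sxy) / det
mb1 = (n * Sxy - Sx * Sy) / det

print(round(b0, 9), round(b1, 9))      # -> 2.2 0.6 (matches mb0, mb1)
```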

The formula for Mallows' Cp is:

Cp = SSE_{p} / MSE_{m} − (n − 2p)

Term | Description |
---|---|
SSE_{p} | sum of squared errors for the model under consideration |
MSE_{m} | mean square error for the model with all predictors |
n | number of observations |
p | number of terms in the model, including the constant |

The degrees of freedom for each component of the model are:

Sources of variation | DF |
---|---|
Regression | p |
Error | n – p – 1 |
Total | n – 1 |

If your data meet certain criteria and the model includes at least one continuous predictor or more than one categorical predictor, then Minitab uses some degrees of freedom for the lack-of-fit test. The criteria are as follows:

- The data contain multiple observations with the same predictor values.
- The data contain the correct points to estimate additional terms that are not in the model.

Term | Description |
---|---|
n | number of observations |
p | number of coefficients in the model, not counting the constant |

Term | Description |
---|---|
ŷ | fitted value |
x_{k} | k^{th} term. Each term can be a single predictor, a polynomial term, or an interaction term. |
b_{k} | estimate of k^{th} regression coefficient |

The formulas for the F-statistics are as follows:

- F(Regression) = MS Regression / MS Error
- F(Term) = MS Term / MS Error
- F(Lack-of-fit) = MS Lack-of-fit / MS Pure error

Term | Description |
---|---|
MS Regression | A measure of the variation in the response that the current model explains. |
MS Error | A measure of the variation that the model does not explain. |
MS Term | A measure of the amount of variation that a term explains after accounting for the other terms in the model. |
MS Lack-of-fit | A measure of variation in the response that could be modeled by adding more terms to the model. |
MS Pure error | A measure of the variation in replicated response data. |
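A minimal sketch of the first of these ratios, F(Regression) = MS Regression / MS Error; the other two follow the same pattern with their respective mean squares. The data set and fitted line (b0 = 2.2, b1 = 0.6) are illustrative assumptions:

```python
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n, p = len(y), 1                        # one term besides the constant
fits = [2.2 + 0.6 * xi for xi in x]     # assumed least-squares fit
ybar = sum(y) / n

ms_regression = sum((fi - ybar) ** 2 for fi in fits) / p
ms_error = sum((yi - fi) ** 2 for yi, fi in zip(y, fits)) / (n - p - 1)
f_regression = ms_regression / ms_error

print(round(f_regression, 6))           # -> 4.5
```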

The two-sided p-value for the null hypothesis that a regression coefficient equals 0 is:

2 × (1 − P(T ≤ |t_{j}|))

The degrees of freedom are the degrees of freedom for error, as follows:

*n* – *p* – 1

Term | Description |
---|---|
P(T ≤ t) | The cumulative distribution function of the t distribution with degrees of freedom equal to the degrees of freedom for error. |
t_{j} | The t statistic for the j^{th} coefficient. |
n | The number of observations in the data set. |
p | The sum of the degrees of freedom for the terms. The terms do not include the constant. |
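This calculation can be sketched in pure Python by approximating the t cumulative distribution function with numerical integration of the t density (in practice a statistics library would supply the CDF). The t statistic and error degrees of freedom below are assumed values for illustration:

```python
import math

def t_cdf(t, df, steps=20000):
    """P(T <= t) for t >= 0, via trapezoidal integration of the t density."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    dens = lambda u: c * (1.0 + u * u / df) ** (-(df + 1) / 2)
    h = t / steps
    area = sum((dens(i * h) + dens((i + 1) * h)) * h / 2 for i in range(steps))
    return 0.5 + area                   # the density is symmetric about 0

t_j = 2.0          # assumed t statistic for the j-th coefficient
df_error = 10      # assumed error degrees of freedom, n - p - 1
p_value = 2 * (1 - t_cdf(abs(t_j), df_error))
print(round(p_value, 4))                # -> 0.0734
```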

This p-value is for the test of the null hypothesis that all of the coefficients that are in the model equal zero, except for the constant coefficient. The p-value is a probability that is calculated from an F-distribution with the degrees of freedom (DF) as follows:

- Numerator DF: sum of the degrees of freedom for the term or the terms in the test
- Denominator DF: degrees of freedom for error

1 − P(*F* ≤ *f*_{j})

Term | Description |
---|---|
P(F ≤ f_{j}) | cumulative distribution function for the F-distribution |
f_{j} | f-statistic for the test |

For a model with multiple predictors, the equation is:

*y* = *β*_{0} + *β*_{1}*x*_{1} + … + *β*_{k}*x*_{k} + *ε*

The fitted equation is:

ŷ = *b*_{0} + *b*_{1}*x*_{1} + … + *b*_{k}*x*_{k}

In simple linear regression, which includes only one predictor, the model is:

*y* = *β*_{0} + *β*_{1}*x*_{1} + *ε*

Using regression estimates *b*_{0} for *β*_{0}, and *b*_{1} for *β*_{1}, the fitted equation is:

ŷ = *b*_{0} + *b*_{1}*x*_{1}

Term | Description |
---|---|
y | response |
x_{k} | k^{th} term. Each term can be a single predictor, a polynomial term, or an interaction term. |
β_{k} | k^{th} population regression coefficient |
ε | error term that follows a normal distribution with a mean of 0 |
b_{k} | estimate of k^{th} population regression coefficient |
ŷ | fitted response |

The residual is the difference between an observed response value and the corresponding fitted value:

e_{i} = y_{i} − ŷ_{i}

Term | Description |
---|---|
e_{i} | i^{th} residual |
y_{i} | i^{th} observed response value |
ŷ_{i} | i^{th} fitted response |

R^{2} is also known as the coefficient of determination. The formula is:

R^{2} = 1 − [Σ(y_{i} − ŷ_{i})^{2} / Σ(y_{i} − ȳ)^{2}]

Term | Description |
---|---|
y_{i} | i^{th} observed response value |
ȳ | mean response |
ŷ_{i} | i^{th} fitted response |
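A short sketch computing R^{2} as 1 − SS Error / SS Total; the data set and fitted line (b0 = 2.2, b1 = 0.6) are illustrative assumptions:

```python
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
fits = [2.2 + 0.6 * xi for xi in x]     # assumed least-squares fit
ybar = sum(y) / len(y)

ss_error = sum((yi - fi) ** 2 for yi, fi in zip(y, fits))
ss_total = sum((yi - ybar) ** 2 for yi in y)
r_squared = 1 - ss_error / ss_total

print(round(r_squared, 6))              # -> 0.6
```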

While the calculations for adjusted R^{2} can produce negative values, Minitab displays zero for these cases. The formula is:

R^{2}(adj) = 1 − [Σ(y_{i} − ŷ_{i})^{2} / (n − p − 1)] / [Σ(y_{i} − ȳ)^{2} / (n − 1)]

Term | Description |
---|---|
y_{i} | i^{th} observed response value |
ŷ_{i} | i^{th} fitted response |
ȳ | mean response |
n | number of observations |
p | number of terms in the model |
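Adjusted R^{2} penalizes the error sum of squares by its degrees of freedom. A sketch for an assumed fit (n = 5, one term besides the constant), clamping negative results to zero as Minitab's display does:

```python
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n, p = len(y), 1
fits = [2.2 + 0.6 * xi for xi in x]     # assumed least-squares fit
ybar = sum(y) / n

mse = sum((yi - fi) ** 2 for yi, fi in zip(y, fits)) / (n - p - 1)
ms_total = sum((yi - ybar) ** 2 for yi in y) / (n - 1)
r2_adj = 1 - mse / ms_total
displayed = max(0.0, r2_adj)            # Minitab shows 0 for negative values

print(round(r2_adj, 6))                 # -> 0.466667
```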

While the calculations for R^{2}(pred) can produce negative values, Minitab displays zero for these cases. The formula is:

R^{2}(pred) = 1 − [Σ(e_{i} / (1 − h_{i}))^{2} / Σ(y_{i} − ȳ)^{2}]

Term | Description |
---|---|
y_{i} | i^{th} observed response value |
ȳ | mean response |
n | number of observations |
e_{i} | i^{th} residual |
h_{i} | i^{th} diagonal element of X(X'X)^{–1}X' |
X | design matrix |
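For simple regression the leverages have the closed form h_{i} = 1/n + (x_{i} − x̄)^{2} / Σ(x_{j} − x̄)^{2}, so R^{2}(pred) can be sketched without matrix code. With the small assumed data set below, the statistic happens to come out negative, illustrating the zero-display rule:

```python
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(y)
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)

fits = [2.2 + 0.6 * xi for xi in x]                     # assumed fit
resid = [yi - fi for yi, fi in zip(y, fits)]
lev = [1 / n + (xi - xbar) ** 2 / sxx for xi in x]      # leverages h_i

press = sum((ei / (1 - hi)) ** 2 for ei, hi in zip(resid, lev))
ss_total = sum((yi - ybar) ** 2 for yi in y)
r2_pred = 1 - press / ss_total
displayed = max(0.0, r2_pred)

print(round(r2_pred, 4), displayed)     # negative here, so Minitab would show 0
```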

The formula for S is:

S = √MSE

Term | Description |
---|---|
MSE | mean square error |

For simple linear regression, the standard error of the coefficient is:

SE(b_{1}) = √(s^{2} / Σ(x_{i} − x̄)^{2})

The standard errors of the coefficients for multiple regression are the square roots of the diagonal elements of this matrix:

s^{2}(**X'X**)^{-1}

Term | Description |
---|---|
x_{i} | i^{th} predictor value |
x̄ | mean of the predictor |
X | design matrix |
X' | transpose of the design matrix |
s^{2} | mean square error |
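A sketch of the simple-regression case, SE(b_{1}) = √(s² / Σ(x_{i} − x̄)²), with assumed data; the multiple-regression version replaces this with the diagonal of s²(X'X)^{-1}:

```python
import math

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n, p = len(y), 1
xbar = sum(x) / n
fits = [2.2 + 0.6 * xi for xi in x]     # assumed least-squares fit

s2 = sum((yi - fi) ** 2 for yi, fi in zip(y, fits)) / (n - p - 1)   # MSE
se_b1 = math.sqrt(s2 / sum((xi - xbar) ** 2 for xi in x))

print(round(se_b1, 6))                  # -> 0.282843
```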

Standardized residuals are also called "internally Studentized residuals." The formula is:

e_{i} / √(s^{2}(1 − h_{i}))

Term | Description |
---|---|
e_{i} | i^{th} residual |
h_{i} | i^{th} diagonal element of X(X'X)^{–1}X' |
s^{2} | mean square error |
X | design matrix |
X' | transpose of the design matrix |
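A sketch computing standardized residuals for simple regression, where the leverages h_{i} have the closed form 1/n + (x_{i} − x̄)²/Σ(x_{j} − x̄)²; the data and fitted line are illustrative assumptions:

```python
import math

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n, p = len(y), 1
xbar = sum(x) / n
sxx = sum((xi - xbar) ** 2 for xi in x)

fits = [2.2 + 0.6 * xi for xi in x]     # assumed least-squares fit
resid = [yi - fi for yi, fi in zip(y, fits)]
s2 = sum(ei ** 2 for ei in resid) / (n - p - 1)         # MSE
lev = [1 / n + (xi - xbar) ** 2 / sxx for xi in x]      # leverages h_i

std_resid = [ei / math.sqrt(s2 * (1 - hi)) for ei, hi in zip(resid, lev)]
print([round(r, 4) for r in std_resid])
```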

The formula for the t statistic is:

t_{j} = b_{j} / SE(b_{j})

Term | Description |
---|---|
t_{j} | test statistic for the j^{th} coefficient |
b_{j} | j^{th} estimated coefficient |
SE(b_{j}) | standard error of the j^{th} estimated coefficient |

Minitab calculates the VIF by regressing each predictor on the remaining predictors and noting the R^{2} value.

For predictor *x*_{j}, the VIF is:

VIF = 1 / (1 − R^{2}(x_{j}))

Term | Description |
---|---|
R^{2}(x_{j}) | coefficient of determination with x_{j} as the response variable and the other terms in the model as the predictors |
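A sketch of the VIF calculation for a model with two predictors: regress x_{1} on x_{2} (ordinary least squares written out by hand), take that R², and apply VIF = 1/(1 − R²). The data are assumed for illustration:

```python
x1 = [1, 2, 3, 4, 5]       # predictor whose VIF we want (assumed data)
x2 = [1, 3, 2, 5, 4]       # the other predictor in the model (assumed data)
n = len(x1)
m1, m2 = sum(x1) / n, sum(x2) / n

# Simple regression of x1 on x2
slope = sum((a - m2) * (b - m1) for a, b in zip(x2, x1)) \
    / sum((a - m2) ** 2 for a in x2)
fits = [m1 + slope * (a - m2) for a in x2]

ss_error = sum((b - f) ** 2 for b, f in zip(x1, fits))
ss_total = sum((b - m1) ** 2 for b in x1)
r2 = 1 - ss_error / ss_total            # R^2 of x1 on the other predictors

vif = 1 / (1 - r2)
print(round(r2, 6), round(vif, 6))      # -> 0.64 2.777778
```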