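In matrix terms, the formulas for the different sums of squares are:

$$\text{SS Regression} = b'X'Y - \frac{1}{n}\,Y'JY$$

$$\text{SS Error} = Y'Y - b'X'Y$$

$$\text{SS Total} = Y'Y - \frac{1}{n}\,Y'JY$$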
Minitab breaks down the SS Regression or SS Treatments component into the amount of variation explained by each term using both the sequential sum of squares and adjusted sum of squares.
Term | Description |
---|---|
b | vector of coefficients |
X | design matrix |
Y | vector of response values |
n | number of observations |
J | n by n matrix of 1s |
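Using the notation in the table above, the matrix forms of the sums of squares can be checked numerically. The following is a minimal sketch with NumPy, not Minitab's implementation. The data values are hypothetical, and the formulas assumed are the usual ones: SS Regression = b'X'Y − (1/n)Y'JY, SS Error = Y'Y − b'X'Y, and SS Total = Y'Y − (1/n)Y'JY.

```python
import numpy as np

# Hypothetical data: 5 observations, one predictor
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
n = len(Y)

X = np.column_stack([np.ones(n), x])        # design matrix with a constant column
b, *_ = np.linalg.lstsq(X, Y, rcond=None)   # least-squares coefficient vector
J = np.ones((n, n))                         # n by n matrix of 1s

correction = (Y @ J @ Y) / n                # (1/n) Y'JY, the correction for the mean
ss_regression = b @ X.T @ Y - correction
ss_error = Y @ Y - b @ X.T @ Y
ss_total = Y @ Y - correction

print(ss_regression, ss_error, ss_total)
```

For a least-squares fit that includes a constant, SS Regression and SS Error should add up to SS Total, which is a convenient sanity check on the calculation.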
Minitab breaks down the SS Regression or SS Treatments component into sequential sums of squares for each factor. The sequential sums of squares depend on the order in which the factors or predictors are entered into the model. The sequential sum of squares for a factor is the portion of SS Regression that the factor explains, given any previously entered factors.
For example, if you have a model with three factors or predictors, X1, X2, and X3, the sequential sum of squares for X2 shows how much of the remaining variation X2 explains, given that X1 is already in the model. To obtain a different sequence of factors, repeat the analysis and enter the factors in a different order.
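The order dependence can be illustrated by fitting a sequence of nested models and taking the increase in SS Regression at each step. This is a sketch under stated assumptions rather than Minitab's algorithm: the helper names `ss_regression` and `sequential_ss` and the data are hypothetical, and every model is assumed to include a constant.

```python
import numpy as np

def ss_regression(columns, y):
    """SS Regression for a model with a constant plus the given predictor columns."""
    X = np.column_stack([np.ones(len(y))] + list(columns))
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ b
    return float(np.sum((fitted - y.mean()) ** 2))

def sequential_ss(columns, y):
    """Sequential SS for each predictor, in the order the predictors are given."""
    result = []
    previous = 0.0
    for k in range(1, len(columns) + 1):
        current = ss_regression(columns[:k], y)  # model with the first k predictors
        result.append(current - previous)        # increase due to predictor k
        previous = current
    return result

# Hypothetical data with two correlated predictors
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([1.0, 1.5, 3.5, 3.0, 5.5, 5.0])
y = np.array([2.0, 3.1, 6.2, 6.8, 10.1, 10.4])

seq_12 = sequential_ss([x1, x2], y)  # x1 entered first, then x2
seq_21 = sequential_ss([x2, x1], y)  # x2 entered first, then x1
print(seq_12, seq_21)
```

In both orders the sequential sums of squares add up to the same SS Regression for the full model, but the amount credited to each predictor changes with its position in the sequence.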
The degrees of freedom for each component of the model are:
Sources of variation | DF |
---|---|
Regression | p |
Error | n – p – 1 |
Total | n – 1 |
Term | Description |
---|---|
n | number of observations |
p | number of coefficients in the model, not counting the constant |
The formula for the Mean Square (MS) of the regression is:

$$\text{MS Regression} = \frac{\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2}{p}$$

Term | Description |
---|---|
ȳ | mean response |
ŷi | ith fitted response |
p | number of terms in the model |
The Mean Square of the error (also abbreviated as MS Error or MSE, and denoted as s²) is the variance around the fitted regression line. The formula is:

$$\text{MS Error} = \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n - p - 1}$$

Term | Description |
---|---|
yi | ith observed response value |
ŷi | ith fitted response |
n | number of observations |
p | number of coefficients in the model, not counting the constant |
The formula for the total Mean Square (MS) is:

$$\text{MS Total} = \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n - 1}$$

Term | Description |
---|---|
ȳ | mean response |
yi | ith observed response value |
n | number of observations |
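Each mean square is a sum of squares divided by its degrees of freedom (p for regression, n − p − 1 for error, n − 1 for total). The sketch below computes all three for a simple linear fit. The data are hypothetical, and NumPy's `lstsq` stands in for whatever fitting routine is actually used.

```python
import numpy as np

# Hypothetical data: one predictor, so p = 1 (the constant is not counted)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.8, 4.1, 5.9, 8.2, 9.7, 12.3])
n, p = len(y), 1

X = np.column_stack([np.ones(n), x])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ b            # fitted responses
y_bar = y.mean()          # mean response

ms_regression = np.sum((fitted - y_bar) ** 2) / p
ms_error = np.sum((y - fitted) ** 2) / (n - p - 1)
ms_total = np.sum((y - y_bar) ** 2) / (n - 1)

print(ms_regression, ms_error, ms_total)
```

MS Total is the ordinary sample variance of the response, so it should match `np.var(y, ddof=1)`.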
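The formulas for the F-statistics are as follows:

$$F(\text{Regression}) = \frac{\text{MS Regression}}{\text{MS Error}}$$

$$F(\text{Term}) = \frac{\text{MS Term}}{\text{MS Error}}$$

$$F(\text{Lack-of-fit}) = \frac{\text{MS Lack-of-fit}}{\text{MS Pure error}}$$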
Term | Description |
---|---|
MS Regression | A measure of the variation in the response that the current model explains. |
MS Error | A measure of the variation that the model does not explain. |
MS Term | A measure of the amount of variation that a term explains after accounting for the other terms in the model. |
MS Lack-of-fit | A measure of variation in the response that could be modeled by adding more terms to the model. |
MS Pure error | A measure of the variation in replicated response data. |
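The ratios described in the table can be sketched for a small data set with replicated x-levels, which is what makes a pure error estimate possible. The data are hypothetical. The degrees of freedom used here for lack-of-fit (m − p − 1) and pure error (n − m), with m the number of distinct x-levels, are the standard textbook choices and are an assumption, not taken from the text above.

```python
import numpy as np

# Hypothetical data: 4 distinct x-levels, each replicated twice
x = np.array([1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 4.0, 4.0])
y = np.array([2.0, 2.4, 3.9, 4.3, 6.5, 6.1, 7.4, 7.8])
n, p = len(y), 1

X = np.column_stack([np.ones(n), x])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ b

ss_regression = np.sum((fitted - y.mean()) ** 2)
ss_error = np.sum((y - fitted) ** 2)

# Pure error: variation of replicates around the mean of their own x-level
levels = np.unique(x)
m = len(levels)
ss_pure_error = sum(np.sum((y[x == v] - y[x == v].mean()) ** 2) for v in levels)
ss_lack_of_fit = ss_error - ss_pure_error   # what remains of the error SS

ms_regression = ss_regression / p
ms_error = ss_error / (n - p - 1)
ms_lack_of_fit = ss_lack_of_fit / (m - p - 1)   # assumed DF: m - p - 1
ms_pure_error = ss_pure_error / (n - m)         # assumed DF: n - m

f_regression = ms_regression / ms_error
f_lack_of_fit = ms_lack_of_fit / ms_pure_error

print(f_regression, f_lack_of_fit)
```

With a single term in the model, the F-statistic for that term coincides with the F-statistic for the regression, so no separate MS Term is computed here.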
The p-value is a probability that is calculated from an F-distribution with the numerator and denominator degrees of freedom (DF) of the test, as follows:
1 − P(F ≤ fj)
Term | Description |
---|---|
P(F ≤ fj) | cumulative distribution function for the F-distribution |
fj | F-statistic for the test |
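For the lack-of-fit test, the numerator DF is the DF for lack-of-fit and the denominator DF is the DF for pure error, n − m, where n = number of observations and m = number of distinct x-level combinations.
In the lack-of-fit test, large F-values and small p-values suggest that the model is inadequate. The p-value for this test is: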
1 − P(F ≤ fj)
Term | Description |
---|---|
P(F ≤ fj) | cumulative distribution function for the F-distribution |
fj | f-statistic for the test |
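The expression 1 − P(F ≤ fj) is the upper-tail probability of the F-distribution. A minimal sketch with SciPy follows. The helper name `f_test_p_value`, the F-statistic, and the degrees of freedom are hypothetical. `scipy.stats.f.sf` is the survival function, which equals one minus the cumulative distribution function.

```python
from scipy.stats import f as f_dist

def f_test_p_value(f_value, df_num, df_den):
    """Upper-tail probability 1 - P(F <= f_value) for an F(df_num, df_den) distribution."""
    return float(f_dist.sf(f_value, df_num, df_den))

# Hypothetical test: F-statistic of 4.0 with 2 numerator and 10 denominator DF
p_value = f_test_p_value(4.0, 2, 10)
print(p_value)
```

Using `sf` rather than `1 - cdf` avoids loss of precision when the F-statistic is large and the p-value is very close to zero.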