Analysis of variance table for Fit General Linear Model

Find definitions and interpretation guidance for every statistic in the Analysis of Variance table.

In This Topic

DF
Adj SS
Adj MS
Seq SS
Seq MS
Contribution
F-value
P-value – Term
P-value – Lack-of-fit

DF

The total degrees of freedom (DF) are the amount of information in your data. The analysis uses that information to estimate the values of unknown population parameters. The total DF is determined by the number of observations in your sample; for a model that includes a constant, the total DF equals the number of observations minus 1. The DF for a term show how much information that term uses. Increasing your sample size provides more information about the population, which increases the total DF. Increasing the number of terms in your model uses more information, which decreases the DF available to estimate the variability of the parameter estimates.

If two conditions are met, then Minitab partitions the DF for error. The first condition is that there must be terms you can fit with the data that are not included in the current model. For example, if you have a continuous predictor with 3 or more distinct values, you can estimate a quadratic term for that predictor. If the model does not include the quadratic term, then a term that the data can fit is not included in the model and this condition is met.

The second condition is that the data contain replicates. Replicates are observations where each predictor has the same value. For example, if you have 3 observations where pressure is 5 and temperature is 25, then those 3 observations are replicates.

If the two conditions are met, then Minitab splits the DF for error into two parts: lack-of-fit and pure error. The DF for lack-of-fit allow a test of whether the model form is adequate. The more DF for pure error, the greater the power of the lack-of-fit test.
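The partition described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the model includes a constant, `model_df` is the number of DF used by the model terms, and the function name and the pressure/temperature settings are hypothetical rather than taken from Minitab.

```python
def partition_error_df(rows, model_df):
    """Split the error DF into lack-of-fit and pure error.

    rows: one tuple of predictor values per observation.
    model_df: DF used by the model terms (the constant is not counted).
    """
    n = len(rows)
    total_df = n - 1                      # a model with a constant
    error_df = total_df - model_df
    distinct_settings = len(set(rows))    # distinct predictor combinations
    pure_error_df = n - distinct_settings # comes only from replicates
    lack_of_fit_df = error_df - pure_error_df
    return {
        "total": total_df,
        "error": error_df,
        "lack_of_fit": lack_of_fit_df,
        "pure_error": pure_error_df,
    }


# Hypothetical (pressure, temperature) settings; the first three rows are replicates.
rows = [(5, 25), (5, 25), (5, 25), (10, 25), (10, 30), (15, 30), (15, 30), (20, 35)]
df = partition_error_df(rows, model_df=2)  # two linear terms
print(df)
```

With no replicates, `pure_error_df` is 0 and the error DF cannot be partitioned, which matches the second condition above.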

Adj SS

Adjusted sums of squares are measures of variation for different components of the model. The order of the predictors in the model does not affect the calculation of the adjusted sums of squares. In the Analysis of Variance table, Minitab separates the sums of squares into different components that describe the variation due to different sources.

Adj SS Term: The adjusted sum of squares for a term is the increase in the regression sum of squares compared to a model with only the other terms. It quantifies the amount of variation in the response data that is explained by each term in the model.
Adj SS Error: The error sum of squares is the sum of the squared residuals. It quantifies the variation in the data that the predictors do not explain.
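Adj SS Total: The total sum of squares is the sum of the sum of squares explained by the model and the error sum of squares. It quantifies the total variation in the data. The adjusted sums of squares for the individual terms add up to the sum of squares explained by the model only when the terms are orthogonal, as in a balanced design.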
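As a minimal sketch of the definitions above, the adjusted sum of squares for a term can be obtained by comparing the error sum of squares of the full model with that of a model containing only the other terms. The data values and the helper name `sse` are hypothetical, and the small normal-equations solver is chosen for brevity; it is not meant to mirror Minitab's internal computations.

```python
def sse(columns, y):
    """Error sum of squares from least squares of y on a constant plus columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Augmented normal equations [X'X | X'y]
    M = [[sum(X[i][r] * X[i][s] for i in range(n)) for s in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    # Gaussian elimination with partial pivoting
    for k in range(p):
        piv = max(range(k, p), key=lambda r: abs(M[r][k]))
        M[k], M[piv] = M[piv], M[k]
        for r in range(k + 1, p):
            factor = M[r][k] / M[k][k]
            for s in range(k, p + 1):
                M[r][s] -= factor * M[k][s]
    # Back substitution for the coefficients
    b = [0.0] * p
    for k in reversed(range(p)):
        b[k] = (M[k][p] - sum(M[k][s] * b[s] for s in range(k + 1, p))) / M[k][k]
    return sum((y[i] - sum(X[i][j] * b[j] for j in range(p))) ** 2 for i in range(n))


# Hypothetical data with two correlated predictors
x1 = [1, 2, 3, 4, 5, 6]
x2 = [1, 1, 2, 2, 4, 5]
y = [2.1, 2.9, 4.2, 4.8, 7.1, 8.3]

sse_full = sse([x1, x2], y)            # Adj SS Error
adj_ss_x1 = sse([x2], y) - sse_full    # x1 given x2 is in the model
adj_ss_x2 = sse([x1], y) - sse_full    # x2 given x1 is in the model
ss_total = sse([], y)                  # constant-only model: Adj SS Total
ss_model = ss_total - sse_full
print(adj_ss_x1, adj_ss_x2, sse_full, ss_total)
```

Because `x1` and `x2` are correlated in this example, the two adjusted sums of squares do not add up to `ss_model`.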

Interpretation

Minitab uses the adjusted sums of squares to calculate the p-value for a term. Minitab also uses the sums of squares to calculate the R² statistic. Usually, you interpret the p-values and the R² statistic instead of the sums of squares.

Adj MS

Adjusted mean squares measure how much variation a term or a model explains, assuming that all other terms are in the model, regardless of the order in which they were entered. Unlike the adjusted sums of squares, the adjusted mean squares consider the degrees of freedom: each adjusted mean square is the adjusted sum of squares divided by its degrees of freedom.

The adjusted mean square of the error (also called MSE or s²) is the variance around the fitted values.

Interpretation

Minitab uses the adjusted mean squares to calculate the p-value for a term. Minitab also uses the adjusted mean squares to calculate the adjusted R² statistic. Usually, you interpret the p-values and the adjusted R² statistic instead of the adjusted mean squares.
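The relationship between sums of squares, mean squares, and the adjusted R² statistic can be sketched as follows. The numbers are illustrative values, not output from any real analysis, and the function name is hypothetical.

```python
import math


def mean_square(ss, df):
    """A mean square is a sum of squares divided by its degrees of freedom."""
    return ss / df


# Illustrative values only
ss_total, df_total = 100.0, 9
ss_error, df_error = 20.0, 7

mse = mean_square(ss_error, df_error)   # variance around the fitted values
s = math.sqrt(mse)                      # standard deviation around the fitted values
r_sq = 1 - ss_error / ss_total          # uses sums of squares
adj_r_sq = 1 - mse / mean_square(ss_total, df_total)  # uses mean squares
print(round(mse, 4), round(r_sq, 4), round(adj_r_sq, 4))
```

Because the adjusted R² divides each sum of squares by its DF, it is lower than R² whenever the model uses at least one DF and does not fit perfectly.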

Seq SS

Sequential sums of squares are measures of variation for different components of the model. Unlike the adjusted sums of squares, the sequential sums of squares depend on the order in which the terms are entered into the model. In the Analysis of Variance table, Minitab separates the sequential sums of squares into different components that describe the variation due to different sources.

Seq SS Term: The sequential sum of squares for a term is the unique portion of the variation explained by a term that is not explained by the previously entered terms. It quantifies the amount of variation in the response data that is explained by each term as it is sequentially added to the model.
Seq SS Error: The error sum of squares is the sum of the squared residuals. It quantifies the variation in the data that the predictors do not explain.
Seq SS Total: The total sum of squares is the sum of the term sums of squares and the error sum of squares. It quantifies the total variation in the data.
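To illustrate the order dependence described above, the following sketch computes sequential sums of squares for single-DF terms by orthogonalizing each predictor against the constant and the previously entered predictors (a Gram-Schmidt approach chosen here for brevity, not necessarily Minitab's method). The data and function names are hypothetical.

```python
def dot(a, b):
    return sum(u * v for u, v in zip(a, b))


def sequential_ss(columns, y):
    """Sequential SS for each column, in the order given (model includes a constant)."""
    basis = [[1.0] * len(y)]           # start with the constant
    result = []
    for col in columns:
        q = [float(v) for v in col]
        # Remove the part of this column explained by earlier terms
        for b in basis:
            coef = dot(q, b) / dot(b, b)
            q = [qi - coef * bi for qi, bi in zip(q, b)]
        # Variation in y explained by the remaining, unique part
        result.append(dot(q, y) ** 2 / dot(q, q))
        basis.append(q)
    return result


# Hypothetical data with two correlated predictors
x1 = [1, 2, 3, 4, 5, 6]
x2 = [1, 1, 2, 2, 4, 5]
y = [2.1, 2.9, 4.2, 4.8, 7.1, 8.3]

order_a = sequential_ss([x1, x2], y)   # x1 entered first
order_b = sequential_ss([x2, x1], y)   # x2 entered first
print(order_a, order_b)
```

The Seq SS for `x1` is large when it enters first and small when it enters after `x2`, yet the term values sum to the same total in both orders, so the error sum of squares does not change.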

Interpretation

By default, the adjusted sums of squares are used to calculate the p-value for a term. When appropriate, you can calculate the p-value for a term from the sequential sum of squares. Usually, you interpret the p-values instead of the sums of squares.

Seq MS

Sequential mean squares measure how much variation a term or a model explains. The sequential mean squares depend on the order in which the terms are entered into the model. Unlike sequential sums of squares, sequential mean squares consider the degrees of freedom: each sequential mean square is the sequential sum of squares divided by its degrees of freedom.

The sequential mean square error (also called MSE or s²) is the variance around the fitted values.

Interpretation

When the tests are based on sequential sums of squares, Minitab uses the sequential mean squares to calculate the p-value for a term; by default, the p-values come from the adjusted sums of squares. Minitab also uses the mean squares to calculate the adjusted R² statistic. Usually, you interpret the p-values and the adjusted R² statistic instead of the sequential mean squares.

Contribution

Contribution displays the percentage that each source in the Analysis of Variance table contributes to the total sequential sums of squares (Seq SS).

Interpretation

Higher percentages indicate that the source accounts for more of the variation in the response.
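The calculation is a simple percentage of the total sequential sum of squares. In this sketch, the source names and Seq SS values are illustrative only, and the function name is hypothetical.

```python
def contribution(seq_ss):
    """Percentage of the total Seq SS attributed to each source."""
    total = sum(seq_ss.values())
    return {source: 100.0 * ss / total for source, ss in seq_ss.items()}


# Illustrative sequential sums of squares for two terms and the error
seq_ss = {"A": 30.0, "B": 50.0, "Error": 20.0}
pct = contribution(seq_ss)
for source, value in pct.items():
    print(f"{source}: {value:.2f}%")
```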

F-value

An F-value appears for each test in the Analysis of Variance table:

F-value for the model or the terms: The F-value is the test statistic used to determine whether the term is associated with the response.
F-value for the lack-of-fit test: The F-value is the test statistic used to determine whether the model is missing higher-order terms that include the predictors in the current model.

Interpretation

Minitab uses the F-value to calculate the p-value, which you use to make a decision about the statistical significance of the terms and model. The p-value is a probability that measures the evidence against the null hypothesis. Lower probabilities provide stronger evidence against the null hypothesis.

A sufficiently large F-value indicates that the term or model is significant.

If you want to use the F-value to determine whether to reject the null hypothesis, compare the F-value to your critical value. You can calculate the critical value in Minitab or find the critical value from an F-distribution table in most statistics books. For more information on using Minitab to calculate the critical value, go to Using the inverse cumulative distribution function (ICDF) and click "Use the ICDF to calculate critical values".
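The comparison against a critical value can be sketched as follows. The sketch assumes a term that is tested against the error mean square, which is the usual case for fixed-effects terms; the mean squares are illustrative, and the critical value is the tabulated upper 5% point of the F-distribution with 2 and 12 degrees of freedom.

```python
def f_value(ms_term, ms_error):
    """F-statistic: mean square for the term divided by the error mean square."""
    return ms_term / ms_error


# Illustrative values: a term with 2 DF and an error with 12 DF
adj_ms_term = 24.0
adj_ms_error = 4.0

f = f_value(adj_ms_term, adj_ms_error)

# Critical value from an F-distribution table (alpha = 0.05, DF = 2 and 12)
critical_value = 3.89
reject_null = f > critical_value
print(f, reject_null)
```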

P-value – Term

The p-value is a probability that measures the evidence against the null hypothesis. Lower probabilities provide stronger evidence against the null hypothesis.

Interpretation

To determine whether the association between the response and each term in the model is statistically significant, compare the p-value for the term to your significance level to assess the null hypothesis. The null hypothesis is that there is no association between the term and the response. Usually, a significance level (denoted as α or alpha) of 0.05 works well. A significance level of 0.05 indicates a 5% risk of concluding that an association exists when there is no actual association.

P-value ≤ α: The association is statistically significant. If the p-value is less than or equal to the significance level, you can conclude that there is a statistically significant association between the response variable and the term.
P-value > α: The association is not statistically significant. If the p-value is greater than the significance level, you cannot conclude that there is a statistically significant association between the response variable and the term. You may want to refit the model without the term. If there are multiple predictors without a statistically significant association with the response, you can reduce the model by removing terms one at a time. For more information on removing terms from the model, go to Model reduction.
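The decision rule above can be sketched in Python. The p-value is the upper-tail probability of the F-distribution at the observed F-value, computed here with a standard continued-fraction evaluation of the regularized incomplete beta function. The function names are hypothetical, and this is an illustration under those assumptions rather than Minitab's implementation.

```python
import math


def _beta_cf(a, b, x, max_iter=200, eps=3e-14):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    tiny = 1e-300
    c, d = 1.0, 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        for aa in (
            m * (b - m) * x / ((a - 1.0 + 2 * m) * (a + 2 * m)),
            -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 1.0 + 2 * m)),
        ):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def f_p_value(f, df_term, df_error):
    """Upper-tail probability P(F >= f) for an F-distribution."""
    return reg_inc_beta(df_error / 2.0, df_term / 2.0,
                        df_error / (df_error + df_term * f))


def is_significant(p_value, alpha=0.05):
    """True when the p-value is less than or equal to the significance level."""
    return p_value <= alpha


p = f_p_value(6.0, 2, 12)     # illustrative F-value with 2 and 12 DF
print(round(p, 4), is_significant(p))
```

For the illustrative F-value of 6.0 with 2 and 12 DF, the p-value is approximately 0.0156, so the term would be judged statistically significant at α = 0.05.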

If a model term is statistically significant, the interpretation depends on the type of term. The interpretations are as follows:

If a fixed factor is significant, you can conclude that not all the level means are equal.
If a random factor is significant, you can conclude that the factor contributes to the amount of variation in the response.
If an interaction term is significant, the relationship between a factor and the response depends on the other factors in the term. In this case, you should not interpret the main effects without considering the interaction effect.
If a covariate is statistically significant, you can conclude that changes in the value of the covariate are associated with changes in the mean response value.
If a polynomial term is significant, you can conclude that the data contain curvature.

P-value – Lack-of-fit

The p-value is a probability that measures the evidence against the null hypothesis. Lower probabilities provide stronger evidence against the null hypothesis. Minitab automatically performs the pure error lack-of-fit test when your data contain replicates, which are multiple observations with identical predictor values. Replicates represent "pure error" because only random variation can cause differences between the observed response values.

Interpretation

To determine whether the model correctly specifies the relationship between the response and the predictors, compare the p-value for the lack-of-fit test to your significance level to assess the null hypothesis. The null hypothesis for the lack-of-fit test is that the model correctly specifies the relationship between the response and the predictors. Usually, a significance level (denoted as alpha or α) of 0.05 works well. A significance level of 0.05 indicates a 5% risk of concluding that the model does not correctly specify the relationship between the response and the predictors when the model does specify the correct relationship.

P-value ≤ α: The lack-of-fit is statistically significant. If the p-value is less than or equal to the significance level, you conclude that the model does not correctly specify the relationship. To improve the model, you may need to add terms or transform your data.
P-value > α: The lack-of-fit is not statistically significant. If the p-value is larger than the significance level, the test does not detect any lack-of-fit.
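A minimal sketch of the pure error lack-of-fit test for a straight-line model with one predictor follows. The data are hypothetical (the response is roughly quadratic in the predictor), the function name is an assumption, and the p-value uses a closed form that is exact only when the lack-of-fit has 2 DF, as it does in this example.

```python
from collections import defaultdict


def lack_of_fit(x, y):
    """Lack-of-fit F-test pieces for the model y = b0 + b1 * x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ss_error = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

    # Pure error: variation among replicates around their own group mean
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[xi].append(yi)
    ss_pe = sum(sum((yi - sum(g) / len(g)) ** 2 for yi in g) for g in groups.values())
    df_pe = n - len(groups)

    # Lack-of-fit: what remains of the error after removing pure error
    ss_lof = ss_error - ss_pe
    df_lof = (n - 2) - df_pe
    f = (ss_lof / df_lof) / (ss_pe / df_pe)
    return ss_lof, df_lof, ss_pe, df_pe, f


# Hypothetical data: two replicates at each of four predictor values
x = [1, 1, 2, 2, 3, 3, 4, 4]
y = [1.1, 0.9, 3.9, 4.1, 9.2, 8.8, 15.9, 16.1]

ss_lof, df_lof, ss_pe, df_pe, f_lof = lack_of_fit(x, y)

# Closed-form upper-tail F probability, valid only for a numerator DF of 2
p_value = (df_pe / (df_pe + 2 * f_lof)) ** (df_pe / 2)
alpha = 0.05
print(f_lof, p_value, p_value <= alpha)
```

Here the p-value is far below 0.05, so the test indicates that the straight-line model does not correctly specify the relationship, which is consistent with the curvature built into the example data.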