Source | Var | % of Total | SE Var | Z-Value | P-Value |
---|---|---|---|---|---|
Field | 0.077919 | 72.93% | 0.067580 | 1.152996 | 0.124 |
Error | 0.028924 | 27.07% | 0.010562 | 2.738613 | 0.003 |
Total | 0.106843 | | | | |
In these results, field is the random term and the p-value for field is 0.124. Because this value is greater than 0.05, you do not have enough evidence to conclude that different fields contribute to the amount of variation in the yield.
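The "% of Total" column is each variance component divided by the total variance. A quick check, using the values from the Variance Components table above:

```python
# Variance components from the table above
var_field = 0.077919
var_error = 0.028924
var_total = var_field + var_error

pct_field = 100 * var_field / var_total
pct_error = 100 * var_error / var_total

print(round(var_total, 6))  # 0.106843
print(round(pct_field, 2))  # 72.93
print(round(pct_error, 2))  # 27.07
```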
To determine whether a term significantly affects the response, compare the p-value to your significance level. Usually, a significance level (denoted as α or alpha) of 0.05 works well. A significance level of 0.05 indicates a 5% risk of concluding that an effect exists when there is no actual effect.
The interpretation of each p-value depends on whether it is for the coefficient of a fixed factor term or for a covariate term.
If the p-value is less than or equal to the significance level, you can conclude that the fixed factor term does significantly affect the response. The rejection of the null hypothesis indicates that at least one level effect is significantly different from the other level effects of the term.
Term | DF Num | DF Den | F-Value | P-Value |
---|---|---|---|---|
Variety | 5.00 | 15.00 | 26.29 | 0.000 |
Variety is the fixed factor term, and the p-value for the variety term is 0.000. Because this value is less than the significance level of 0.05, you can conclude that the level means are not all equal, meaning that the variety of alfalfa has an effect on the yield.
To obtain a better understanding of the main effects, go to Factorial Plots.
To determine how well the model fits your data, examine the goodness-of-fit statistics in the Model Summary table.
S is the estimated standard deviation of the error term. The lower the value of S, the better the conditional fitted equation describes the response at the selected factor settings. However, an S value by itself doesn't completely describe model adequacy. Also examine the key results from other tables and the residual plots.
R^{2} is the percentage of variation in the response that is explained by the model. It is calculated as 1 minus the ratio of the error sum of squares (which is the variation that is not explained by the model) to the total sum of squares (which is the total variation in the data).
Use adjusted R^{2} when you want to compare models that have the same covariance structure but a different number of fixed factors and covariates. Assuming the models have the same covariance structure, R^{2} increases when you add additional fixed factors or covariates. The adjusted R^{2} value incorporates the number of fixed factors and covariates in the model to help you choose the correct model.
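The two formulas can be sketched as follows. The sums of squares in the first part are hypothetical, and the second part uses the classical fixed-effects adjustment formula with an assumed n = 24 observations and p = 5 fixed-effect parameters beyond the intercept (consistent with a 6-variety, 4-field design); Minitab's mixed-model formula is not stated here, so treat this only as an illustration:

```python
# R-squared: 1 minus (error sum of squares / total sum of squares).
# Hypothetical sums of squares, for illustration only.
ss_error = 2.0
ss_total = 10.0
r_sq = 1 - ss_error / ss_total
print(100 * r_sq)  # 80.0

# Classical adjusted R-squared, with assumed n and p.
r_sq_table = 0.9233  # R-sq from the Model Summary table
n, p = 24, 5
r_sq_adj = 1 - (1 - r_sq_table) * (n - 1) / (n - p - 1)
print(round(100 * r_sq_adj, 2))  # 90.2
```

Under these assumptions the adjustment reproduces the 90.20% shown in the Model Summary table.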
To obtain more precise and less biased estimates for the parameters in a model, the number of rows in the data set should usually be much larger than the number of parameters in the model. To obtain reasonably good estimates for the variance components of the random terms, you should have enough representative levels for each random factor.
Goodness-of-fit statistics are just one measure of how well the model fits the data. Even when a model has a desirable value, you should check the residual plots to verify that the model meets the model assumptions.
S | R-sq | R-sq(adj) | AICc | BIC |
---|---|---|---|---|
0.170071 | 92.33% | 90.20% | 12.54 | 13.52 |
In these results, the estimated standard deviation (S) of the random error term is 0.17. The model explains 92.33% of the variation in the yield of alfalfa plants. After adjusting for the number of fixed factor parameters in the model, the percentage reduces to 90.2%.
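Because S is the estimated standard deviation of the error term, it is the square root of the error variance component reported in the Variance Components table, which you can verify directly:

```python
import math

# Error variance component from the Variance Components table
var_error = 0.028924
s = math.sqrt(var_error)
print(round(s, 6))  # 0.170071, matching S in the Model Summary table
```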
If the p-value indicates that a term is significant, you can examine the coefficients for the term to understand how the term relates to the response. The interpretation of each coefficient depends on whether it is for a fixed factor term or for a covariate term.
The coefficients for a fixed factor term display how the level means for the term differ. You can also perform a multiple comparisons analysis for the term to further classify the level effects into groups that are statistically the same or statistically different.
The coefficient for a covariate term represents the change in the mean response associated with a 1-unit increase in that term, while the other terms in the model are held constant. The sign of the coefficient indicates the direction of the relationship between the term and the response. The size of the coefficient usually provides a good way to assess the practical significance of the term on the response variable.
Term | Coef | SE Coef | DF | T-Value | P-Value |
---|---|---|---|---|---|
Constant | 3.094583 | 0.143822 | 3.00 | 21.516692 | 0.000 |
Variety | |||||
1 | 0.385417 | 0.077626 | 15.00 | 4.965016 | 0.000 |
2 | 0.145417 | 0.077626 | 15.00 | 1.873287 | 0.081 |
3 | 0.107917 | 0.077626 | 15.00 | 1.390205 | 0.185 |
4 | -0.319583 | 0.077626 | 15.00 | -4.116938 | 0.001 |
5 | 0.395417 | 0.077626 | 15.00 | 5.093838 | 0.000 |
Of the six varieties of alfalfa in the experiment, the output displays coefficients for five of them. By default, Minitab omits the coefficient for one factor level to avoid perfect multicollinearity. The coefficients for the main effects represent the difference between each level mean and the overall mean. For example, Variety 1 is associated with an alfalfa yield that is approximately 0.385 units greater than the overall mean.
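Because the coefficients represent differences from the overall mean, they are constrained to sum to zero across all six levels, so the omitted level's coefficient can be recovered from the five displayed ones, and each level mean is the constant plus the level's coefficient. A sketch using the values from the Coefficients table above:

```python
# Coefficients from the table above (sum-to-zero coding)
constant = 3.094583
coefs = {1: 0.385417, 2: 0.145417, 3: 0.107917, 4: -0.319583, 5: 0.395417}

# The omitted level's coefficient is the negative sum of the displayed ones.
coefs[6] = -sum(coefs.values())
print(round(coefs[6], 6))  # -0.714585

# Each level mean is the overall mean (constant) plus the level coefficient.
mean_variety1 = constant + coefs[1]
print(round(mean_variety1, 2))  # 3.48
```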
Use the residual plots to help you determine whether the model is adequate and meets the assumptions of the analysis. If the assumptions are not met, the model may not fit the data well and you should use caution when you interpret the results.
You can plot marginal and conditional residuals. A marginal residual equals the difference between an observed response value and the corresponding estimated mean response without conditioning on the levels of the random factors. In contrast, given the specific levels of the random factors, a conditional residual equals the difference between an observed response value and the corresponding conditional mean response. Use the conditional residuals to check the normality of the error term in the model.
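The distinction between the two residual types can be sketched for a single observation. All of the numbers here are hypothetical (the observed yield, fixed-effects fit, and predicted field effect are invented for illustration, not taken from these results):

```python
# Hypothetical values for one observation, for illustration only
y = 3.62            # observed yield
fixed_fit = 3.48    # estimated mean response from the fixed effects alone
field_effect = 0.05 # predicted random effect for this observation's field

# Marginal residual: does not condition on the random factor levels.
marginal_resid = y - fixed_fit
# Conditional residual: conditions on the specific field.
conditional_resid = y - (fixed_fit + field_effect)

print(round(marginal_resid, 2))     # 0.14
print(round(conditional_resid, 2))  # 0.09
```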
The residuals versus fits graph plots the residuals on the y-axis and the fitted values on the x-axis. Use this graph to identify rows of data with much larger residuals than other rows. Further investigate those rows to verify that the data were collected correctly. You can also use this plot to look for specific patterns in the residuals that may indicate additional variables to consider.
The residuals versus order plot displays the residuals in the order that the data were collected. Use this graph to identify rows of data with much larger residuals than other rows. Further investigate those rows to verify that the data were collected correctly. If the plot shows a pattern in time order, you can try to include a time-dependent term in the model to remove the pattern.