Interpret the key results for Analyze Factorial Design

Complete the following steps to interpret a factorial design. Key output includes the Pareto chart, p-values, the coefficients, model summary statistics, and the residual plots.

Step 1: Determine which terms contribute the most to the variability in the response

Use a Pareto chart of the effects to compare the relative magnitude and the statistical significance of both main and interaction effects. The type of effect that the chart displays depends on whether the model includes an error term:
  • If the model does not include an error term, the chart displays the absolute value of the unstandardized effects.
  • If the model does include an error term, the chart displays the absolute value of the standardized effects.

Minitab plots the effects in decreasing order of their absolute values. The reference line on the chart indicates which effects are significant. By default, Minitab uses a significance level of 0.05 to draw the reference line. If the model does not include an error term, Minitab uses Lenth's method to draw the reference line.
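
Minitab draws this chart for you. As a rough illustration of the same idea outside Minitab, the sketch below (a minimal example, not Minitab's implementation) fits the model for this example with statsmodels and plots the absolute t-statistics of the terms as standardized effects, with a reference line at the two-sided critical t-value for α = 0.05. The file name factorial_data.csv and the column names are hypothetical placeholders for coded (-1/+1) factorial data.

```python
# A minimal sketch, not Minitab's implementation. Assumes a hypothetical file
# "factorial_data.csv" with coded (-1/+1) factor columns Material, InjPress,
# InjTemp, CoolTemp, the covariate MeasTemp, and the response Strength.
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.formula.api as smf
from scipy import stats

df = pd.read_csv("factorial_data.csv")

# Main effects, two-way interactions, and the covariate, matching the example.
model = smf.ols(
    "Strength ~ MeasTemp + (Material + InjPress + InjTemp + CoolTemp)**2",
    data=df,
).fit()

# With an error term in the model, the standardized effect of a term is its
# t-statistic; sort the absolute values so the chart reads like a Pareto chart.
effects = model.tvalues.drop("Intercept").abs().sort_values()
crit = stats.t.ppf(1 - 0.05 / 2, model.df_resid)  # reference line at alpha = 0.05

effects.plot.barh()
plt.axvline(crit, linestyle="--", label=f"Reference line = {crit:.2f}")
plt.xlabel("Absolute standardized effect (t-value)")
plt.legend()
plt.tight_layout()
plt.show()
```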

Key Results: Pareto Chart

In these results, all four main effects are statistically significant (α = 0.05): material type (A), injection pressure (B), injection temperature (C), and cooling temperature (D).

In addition, you can see that the largest effect is injection pressure (B) because it extends the farthest. The effect for the injection pressure by cooling temperature interaction (BD) is the smallest because it extends the least.

Step 2: Determine which terms have statistically significant effects on the response

To determine whether the association between the response and each term in the model is statistically significant, compare the p-value for the term to your significance level to assess the null hypothesis. The null hypothesis is that the term's coefficient is equal to zero, which implies that there is no association between the term and the response. Usually, a significance level (denoted as α or alpha) of 0.05 works well. A significance level of 0.05 indicates a 5% risk of concluding that an association exists when there is no actual association.
P-value ≤ α: The association is statistically significant
If the p-value is less than or equal to the significance level, you can conclude that there is a statistically significant association between the response variable and the term.
P-value > α: The association is not statistically significant
If the p-value is greater than the significance level, you cannot conclude that there is a statistically significant association between the response variable and the term. You may want to refit the model without the term.
If there are multiple predictors without a statistically significant association with the response, you can reduce the model by removing terms one at a time. For more information on removing terms from the model, go to Model reduction.
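
For a quick programmatic check of the same decision rule, the sketch below (continuing from the hypothetical statsmodels fit in Step 1) compares each term's p-value to α and lists the non-significant terms in the order you might consider removing them.

```python
# A minimal sketch, continuing from the hypothetical fit in Step 1.
alpha = 0.05
pvalues = model.pvalues.drop("Intercept")

significant = pvalues[pvalues <= alpha].sort_values()
not_significant = pvalues[pvalues > alpha].sort_values(ascending=False)

print("Statistically significant terms (p <= alpha):")
print(significant.round(3))
print("\nCandidates to remove, one at a time, largest p-value first:")
print(not_significant.round(3))
```
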
If a model term is statistically significant, the interpretation depends on the type of term. The interpretations are as follows:
  • If a coefficient for a factor is significant, you can conclude that not all level means are equal.
  • If a coefficient for a covariate is significant, changes in the value of the variable are associated with changes in the mean response value.
  • If a coefficient for an interaction term is significant, the relationship between a factor and the response depends on the other factors in the term. In this case, you should not interpret the main effects without considering the interaction effect.

Analysis of Variance

Source                   DF   Adj SS   Adj MS  F-Value  P-Value
Model                    11  451.357   41.032    17.99    0.007
  Covariates              1    3.591    3.591     1.58    0.278
    MeasTemp              1    3.591    3.591     1.58    0.278
  Linear                  4  304.587   76.147    33.39    0.002
    Material              1   35.053   35.053    15.37    0.017
    InjPress              1  113.068  113.068    49.59    0.002
    InjTemp               1   75.533   75.533    33.12    0.005
    CoolTemp              1   38.666   38.666    16.96    0.015
  2-Way Interactions      6   20.309    3.385     1.48    0.366
    Material*InjPress     1    1.732    1.732     0.76    0.433
    Material*InjTemp      1    3.045    3.045     1.34    0.312
    Material*CoolTemp     1    0.095    0.095     0.04    0.848
    InjPress*InjTemp      1    1.538    1.538     0.67    0.458
    InjPress*CoolTemp     1    0.012    0.012     0.01    0.947
    InjTemp*CoolTemp      1   14.694   14.694     6.44    0.064
Error                     4    9.121    2.280
Total                    15  460.478

Key Results: P-Value, Coefficients

In these results, the main effects for Material, InjPress, InjTemp, and CoolTemp are statistically significant at the significance level of 0.05. You can conclude that changes in these variables are associated with changes in the response variable.

MeasTemp is a covariate in this model. The coefficient for the covariate represents the change in the mean response for each one-unit increase in the covariate, while the other terms in the model are held constant. For each one-degree increase in the measured temperature, the estimated mean strength decreases by 1.229.
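
As an informal check of this interpretation, the sketch below (continuing from the hypothetical fit in Step 1; the covariate values 40 and 41 and the factor settings are made up) predicts the mean response at two settings of MeasTemp that differ by one unit while the factors are held fixed, and compares the difference to the MeasTemp coefficient.

```python
# A minimal sketch, continuing from the hypothetical fit in Step 1. The
# MeasTemp values 40 and 41 are arbitrary; the factor levels are held constant.
new_data = pd.DataFrame({
    "MeasTemp": [40.0, 41.0],
    "Material": [1, 1],
    "InjPress": [1, 1],
    "InjTemp":  [1, 1],
    "CoolTemp": [1, 1],
})
preds = model.predict(new_data)

print("Change in predicted mean response:", round(preds[1] - preds[0], 3))
print("Coefficient for MeasTemp:         ", round(model.params["MeasTemp"], 3))
```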

The two-way interaction terms are not statistically significant. The relationship between each variable and the response may not depend on the value of the other variable.

Step 3: Determine how well the model fits your data

To determine how well the model fits your data, examine the goodness-of-fit statistics in the Model Summary table.

S

Use S to assess how well the model describes the response. Use S instead of the R2 statistics to compare the fit of models that have no constant.

S is measured in the units of the response variable and represents how far the data values fall from the fitted values. The lower the value of S, the better the model describes the response. However, a low S value by itself does not indicate that the model meets the model assumptions. You should check the residual plots to verify the assumptions.

R-sq

The higher the R2 value, the better the model fits your data. R2 is always between 0% and 100%.

R2 always increases when you add additional predictors to a model. For example, the best five-predictor model will always have an R2 that is at least as high as the best four-predictor model. Therefore, R2 is most useful when you compare models of the same size.

R-sq (adj)

Use adjusted R2 when you want to compare models that have different numbers of predictors. R2 always increases when you add a predictor to the model, even when there is no real improvement to the model. The adjusted R2 value incorporates the number of predictors in the model to help you choose the correct model.

R-sq (pred)

Use predicted R2 to determine how well your model predicts the response for new observations. Models that have larger predicted R2 values have better predictive ability.

A predicted R2 that is substantially less than R2 may indicate that the model is over-fit. An over-fit model occurs when you add terms for effects that are not important in the population. The model becomes tailored to the sample data and, therefore, may not be useful for making predictions about the population.

Predicted R2 can also be more useful than adjusted R2 for comparing models because it is calculated with observations that are not included in the model calculation.

AICc and BIC

When you show the details for each step of a stepwise method or when you show the expanded results of the analysis, Minitab shows two more statistics. These statistics are the corrected Akaike’s Information Criterion (AICc) and the Bayesian Information Criterion (BIC). Use these statistics to compare different models. For each statistic, smaller values are desirable. Minitab does not show these statistics for split-plot designs.

Consider the following points when you interpret the goodness-of-fit statistics:
  • Small samples do not provide a precise estimate of the strength of the relationship between the response and predictors. If you need R2 to be more precise, you should use a larger sample (typically, 40 or more).
  • R2 is just one measure of how well the model fits the data. Even when a model has a high R2, you should check the residual plots to verify that the model meets the model assumptions.
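
The sketch below (continuing from the hypothetical fit in Step 1) computes rough counterparts of these statistics from an ordinary least squares fit: S from the error mean square, R2 and adjusted R2 directly from the fit, predicted R2 from the PRESS statistic, and a common small-sample correction of AIC. Minitab's exact formulas, in particular for AICc, may count parameters differently.

```python
# A minimal sketch, continuing from the hypothetical fit in Step 1.
import numpy as np

n = int(model.nobs)
k = int(model.df_model) + 1     # estimated coefficients, including the constant

s = np.sqrt(model.mse_resid)    # S: square root of the error mean square
r_sq = model.rsquared
r_sq_adj = model.rsquared_adj

# Predicted R-sq from the PRESS statistic (leave-one-out residuals computed
# from the hat-matrix diagonal).
hat = model.get_influence().hat_matrix_diag
press = np.sum((model.resid / (1 - hat)) ** 2)
sst = np.sum((model.model.endog - model.model.endog.mean()) ** 2)
r_sq_pred = 1 - press / sst

# One common small-sample correction of AIC; Minitab's definition may differ.
aicc = model.aic + 2 * k * (k + 1) / (n - k - 1)

print(f"S = {s:.5f}")
print(f"R-sq = {r_sq:.2%}, R-sq(adj) = {r_sq_adj:.2%}, R-sq(pred) = {r_sq_pred:.2%}")
print(f"AICc = {aicc:.2f}, BIC = {model.bic:.2f}")
```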

Model Summary

      S    R-sq  R-sq(adj)  R-sq(pred)
1.51005  98.02%     92.57%      70.86%

Key Results: S, R-sq, R-sq (adj), R-sq (pred)

In these results, the model explains 98.02% of the variation in strength. For these data, the R2 value indicates the model provides a good fit to the data. If additional models are fit with different predictors, use the adjusted R2 values and the predicted R2 values to compare how well the models fit the data.

Step 4: Determine whether your model meets the assumptions of the analysis

Use the residual plots to help you determine whether the model is adequate and meets the assumptions of the analysis. If the assumptions are not met, the model may not fit the data well and you should use caution when you interpret the results.

For more information on how to handle patterns in the residual plots, go to Residual plots for Analyze Factorial Design and click the name of the residual plot in the list at the top of the page.
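
Minitab produces these plots for you. Outside Minitab, a rough equivalent (a minimal sketch, continuing from the hypothetical fit in Step 1) can draw the residuals versus fits plot, the residuals versus order plot, and a normal probability plot of the residuals.

```python
# A minimal sketch, continuing from the hypothetical fit in Step 1. The
# residuals-versus-order panel assumes the data rows are in run (time) order.
import matplotlib.pyplot as plt
from scipy import stats

resid = model.resid
fitted = model.fittedvalues

fig, axes = plt.subplots(1, 3, figsize=(12, 4))

axes[0].scatter(fitted, resid)
axes[0].axhline(0, linestyle="--")
axes[0].set(title="Versus fits", xlabel="Fitted value", ylabel="Residual")

axes[1].plot(range(1, len(resid) + 1), resid.to_numpy(), marker="o")
axes[1].axhline(0, linestyle="--")
axes[1].set(title="Versus order", xlabel="Observation order", ylabel="Residual")

stats.probplot(resid, dist="norm", plot=axes[2])
axes[2].set_title("Normal probability plot")

plt.tight_layout()
plt.show()
```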

Residuals versus fits plot

Use the residuals versus fits plot to verify the assumption that the residuals are randomly distributed and have constant variance. Ideally, the points should fall randomly on both sides of 0, with no recognizable patterns in the points.

The patterns in the following table may indicate that the model does not meet the model assumptions.

Pattern                                                               What the pattern may indicate
Fanning or uneven spreading of residuals across fitted values         Nonconstant variance
Curvilinear                                                           A missing higher-order term
A point that is far away from zero                                    An outlier
A point that is far away from the other points in the x-direction     An influential point

Residuals versus order plot

Use the residuals versus order plot to verify the assumption that the residuals are independent from one another. Independent residuals show no trends or patterns when displayed in time order. Patterns in the points may indicate that residuals near each other may be correlated, and thus, not independent. Ideally, the residuals on the plot should fall randomly around the center line.

If you see a pattern, investigate the cause. The following types of patterns may indicate that the residuals are dependent:
  • Trend
  • Shift
  • Cycle

Normal probability plot of the residuals

Use the normal probability plot of the residuals to verify the assumption that the residuals are normally distributed. The normal probability plot of the residuals should approximately follow a straight line.

The patterns in the following table may indicate that the model does not meet the model assumptions.

Pattern                                     What the pattern may indicate
Not a straight line                         Nonnormality
A point that is far away from the line      An outlier
Changing slope                              An unidentified variable