The histogram of the residuals shows the distribution of the residuals for all observations.
Use the histogram of the residuals to determine whether the data are skewed or include outliers. The patterns in the following table may indicate that the model does not meet the model assumptions.
Pattern: What the pattern may indicate
A long tail in one direction: Skewness
A bar that is far away from the other bars: An outlier
Because the appearance of a histogram depends on the number of intervals used to group the data, don't use a histogram to assess the normality of the residuals. Instead, use a normal probability plot.
A histogram is most effective when you have approximately 20 or more data points. If the sample is too small, then each bar on the histogram does not contain enough data points to reliably show skewness or outliers.
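The dependence on the number of intervals can be seen in a minimal sketch. The residual values and the helper `bin_counts` are hypothetical and are not part of Minitab; the same 20 residuals are grouped into 7 intervals and then into 2.

```python
def bin_counts(values, n_bins):
    """Count how many values fall into each of n_bins equal-width intervals.

    Assumes the values are not all identical, so the interval width is > 0.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The maximum value belongs to the last interval, not to a new one.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts


# Hypothetical residuals: 19 values near zero and one value far away (6.0).
residuals = [-1.0, -0.8, -0.5, -0.4, -0.2, -0.1, 0.0, 0.1, 0.2, 0.3,
             0.4, 0.6, 0.7, 0.9, 1.0, 1.1, 0.5, -0.6, -0.3, 6.0]

counts_7 = bin_counts(residuals, 7)
counts_2 = bin_counts(residuals, 2)
print(counts_7)  # [8, 9, 2, 0, 0, 0, 1]
print(counts_2)  # [19, 1]
```

With 7 intervals the far-away value appears as an isolated bar separated by empty intervals. With 2 intervals the same data look like a single long tail, which is why the shape of a histogram alone is a weak basis for judging normality.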
Normal probability plot of residuals
The normal probability plot of the residuals displays the residuals versus their expected values when the distribution is normal.
Use the normal probability plot of the residuals to verify the assumption that the residuals are normally distributed. The normal probability plot of the residuals should approximately follow a straight line.
The following patterns violate the assumption that the residuals are normally distributed.
If you see a nonnormal pattern, use the other residual plots to check for other problems with the model, such as missing terms or a time order effect. If the residuals do not follow a normal distribution, the confidence intervals and p-values can be inaccurate.
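As a rough numerical companion to the plot, the straightness of the points can be summarized by the correlation between the ordered residuals and their expected normal scores. The sketch below is an illustration under stated assumptions: the plotting positions (i - 0.375) / (n + 0.25) are one common convention and are not necessarily the ones Minitab uses, and both data sets are synthetic.

```python
import math
from statistics import NormalDist, mean


def normal_scores(n):
    """Approximate expected standard-normal values for n ordered observations.

    Assumption: plotting positions (i - 0.375) / (n + 0.25).
    """
    std_normal = NormalDist()
    return [std_normal.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def probability_plot_correlation(residuals):
    """Correlation between sorted residuals and their normal scores."""
    ordered = sorted(residuals)
    return pearson(normal_scores(len(ordered)), ordered)


n = 50
# Synthetic, evenly spaced quantiles stand in for real residuals.
symmetric = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
right_skewed = [-math.log(1 - (i - 0.5) / n) for i in range(1, n + 1)]

r_symmetric = probability_plot_correlation(symmetric)
r_skewed = probability_plot_correlation(right_skewed)
print(round(r_symmetric, 4), round(r_skewed, 4))
```

A correlation very close to 1 corresponds to points that follow a straight line. The right-skewed data give a visibly lower value because the points curve away from the line in the long tail.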
Residuals versus fits
The residuals versus fits graph plots the residuals on the y-axis and the fitted values on the x-axis.
Use the residuals versus fits plot to verify the assumption that the residuals are randomly distributed and have constant variance. Ideally, the points should fall randomly on both sides of 0, with no recognizable patterns in the points.
The patterns in the following table may indicate that the model does not meet the model assumptions.
Pattern: What the pattern may indicate
Fanning or uneven spreading of residuals across fitted values: Nonconstant variance
Curvilinear: A missing higher-order term
A point that is far away from zero: An outlier
A point that is far away from the other points in the x-direction: An influential point
The following graphs show an outlier and a violation of the assumption that the variance of the residuals is constant.
If you identify any patterns or outliers in your residuals versus fits plot, consider the following solutions:
Nonconstant variance
Consider using Fit Regression Model with a Box-Cox transformation or weights.
An outlier or influential point
Verify that the observation is not a measurement error or data-entry error.
Consider performing the analysis without this observation to determine how it impacts your results.
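Fanning in a residuals versus fits plot can also be checked numerically by comparing the spread of the residuals at low and high fitted values. The following is a minimal sketch rather than a formal test: the data are simulated with an error that grows with the mean, and the helpers `fit_line` and `spread_ratio` are hypothetical names introduced for illustration.

```python
import random
from statistics import mean, pstdev


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return my - slope * mx, slope


def spread_ratio(fits, residuals):
    """Std. dev. of residuals in the upper half of the fitted values
    divided by the std. dev. in the lower half."""
    pairs = sorted(zip(fits, residuals))
    half = len(pairs) // 2
    low = [r for _, r in pairs[:half]]
    high = [r for _, r in pairs[half:]]
    return pstdev(high) / pstdev(low)


rng = random.Random(42)  # fixed seed so the sketch is reproducible
xs = list(range(1, 201))
# Simulated response: the error is proportional to the mean (assumption).
ys = [(10 + 0.4 * x) * (1 + 0.2 * rng.gauss(0, 1)) for x in xs]

intercept, slope = fit_line(xs, ys)
fits = [intercept + slope * x for x in xs]
residuals = [y - f for y, f in zip(ys, fits)]

ratio = spread_ratio(fits, residuals)
print(round(ratio, 2))
```

A ratio well above 1 means the residuals spread out as the fitted values increase, which is the fanning pattern. A ratio near 1 is what constant variance would produce.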
Residuals versus order
The residuals versus order plot displays the residuals in the order that the data were collected.
Use the residuals versus order plot to verify the assumption that the residuals are independent of one another. Independent residuals show no trends or patterns when displayed in time order. Patterns in the points may indicate that residuals near each other are correlated, and thus not independent. Ideally, the residuals on the plot should fall randomly around the center line.
If you see a pattern, investigate the cause. The following types of patterns may indicate that the residuals are dependent.
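Dependence between neighboring residuals can also be summarized with a single number. The sketch below uses the Durbin-Watson statistic, which is a complement to the plot and not something this section describes; values near 2 suggest no correlation between consecutive residuals, values near 0 suggest positive correlation, and values near 4 suggest negative correlation. Both residual series are hypothetical.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by sum of squares.

    The residuals must be in the order in which the data were collected.
    """
    num = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    den = sum(e ** 2 for e in residuals)
    return num / den


# Hypothetical residuals with a steady upward trend over time.
drifting = [t - 10.5 for t in range(1, 21)]
# Hypothetical residuals that change sign at every observation.
alternating = [(-1) ** t for t in range(20)]

dw_drifting = durbin_watson(drifting)
dw_alternating = durbin_watson(alternating)
print(round(dw_drifting, 4))  # 0.0286
print(dw_alternating)         # 3.8
```

The trending series gives a value close to 0 and the alternating series gives a value close to 4, so both would be flagged as dependent.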
Residuals versus the variables
The residuals versus variables plot displays the residuals versus another variable. The variable could already be included in your model, or it may not be in the model but you suspect that it affects the response.
If you see a non-random pattern in the residuals, it indicates that the variable affects the response in a systematic way. Consider including this variable in an analysis.
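The idea can be sketched with made-up data in which the response depends on two variables but only one is in the model. All names and values are hypothetical. The residuals are uncorrelated with the predictor that is in the model and strongly correlated with the omitted one.

```python
import math
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return my - slope * mx, slope


# Hypothetical data: y depends on both x1 and x2.
x1 = list(range(22))
x2 = [(7 * i) % 11 for i in range(22)]  # nearly uncorrelated with x1
y = [1 + 2 * a + 3 * b for a, b in zip(x1, x2)]

# Fit a model that contains x1 only, then inspect the residuals.
intercept, slope = fit_line(x1, y)
residuals = [yi - (intercept + slope * a) for yi, a in zip(y, x1)]

r_in_model = pearson(x1, residuals)  # essentially zero by construction
r_omitted = pearson(x2, residuals)   # close to 1: x2 explains the residuals
print(round(r_in_model, 6), round(r_omitted, 3))
```

A correlation only detects a linear relationship, so the plot remains the better tool for spotting curved or other non-random patterns.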
Residual plots for a test data set
Minitab creates separate residual plots for the training data set and the test data set. The residuals for the test data set are independent of the model fitting process.
Because the training and test data sets are typically from the same population, you expect to see the same patterns in the residual plots for each data set. Different patterns in the residual plots could indicate a systematic difference between the observations in the training data set and the test data set.
Although the patterns are typically the same, the residual plots for the test data set can be slightly different from the plots for the training data set. For example, because the test data set is not in the model fitting process, the mean of the residuals can be non-zero.
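The point about the mean of the residuals can be shown with a small sketch. The training and test values are hypothetical. For a least squares fit with an intercept, the training residuals average to zero by construction, while the test residuals are under no such constraint.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return my - slope * mx, slope


# Hypothetical training data, used to fit the model.
train_x = [1, 2, 3, 4, 5]
train_y = [2.1, 3.9, 6.2, 7.8, 10.0]
# Hypothetical test data, not used in the fit.
test_x = [6, 7, 8]
test_y = [12.5, 14.3, 16.6]

intercept, slope = fit_line(train_x, train_y)

train_residuals = [y - (intercept + slope * x) for x, y in zip(train_x, train_y)]
test_residuals = [y - (intercept + slope * x) for x, y in zip(test_x, test_y)]

train_mean = mean(train_residuals)
test_mean = mean(test_residuals)
print(round(train_mean, 6), round(test_mean, 4))
```

Here the test residuals have a mean of roughly 0.59. A small non-zero mean is expected, but a large one would be a reason to check whether the two data sets differ systematically.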