What is an unusual observation?

Unusual observations (also called influential observations) are observations that have a disproportionate impact on a regression or ANOVA model. Unusual observations are important to identify because they can produce misleading results. For example, an unusual observation can cause a significant coefficient to seem insignificant.

Unusual observations can be either, or both, of the following:
  • Leverage points, which are extreme in the x-direction
  • Outliers (large residuals), which are extreme in the y-direction relative to the fitted regression line

Identify unusual observations

To identify unusual observations, examine diagnostic measures including leverage values, residuals, Cook's D, and DFITS. Larger values of these statistics indicate that an observation might be unusual. In the Fits and Diagnostics for Unusual Observations table, Minitab labels observations that have extreme leverage values or extreme residuals (outliers) as follows:
  • An X denotes a point with a large leverage value. Minitab labels leverage values greater than 3 × (number of model terms) / (number of observations) or greater than 0.99, whichever cutoff is smaller.
  • An R denotes an extreme standardized residual. Minitab labels standardized residuals with absolute values greater than 2.
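The two labeling rules above can be sketched with NumPy. This is a minimal illustration, not Minitab's implementation: the function name `unusual_observation_diagnostics` is hypothetical, the design matrix is assumed to already contain an intercept column, the intercept is assumed to count as a model term, and the data are synthetic rather than the heat flux data. Cook's D is included using its usual textbook formula.

```python
import numpy as np

def unusual_observation_diagnostics(X, y):
    """Compute leverage, standardized residuals, Cook's D, and X/R flags.

    X: (n, p) design matrix that already contains the intercept column.
    y: (n,) response vector.
    """
    n, p = X.shape  # assumption: p counts the intercept as a model term

    # Leverage is the diagonal of the hat matrix H = X (X'X)^-1 X'.
    # With a thin QR decomposition X = QR, that diagonal is the row sum of Q**2.
    Q, _ = np.linalg.qr(X)
    leverage = np.sum(Q**2, axis=1)

    # Ordinary least-squares fit and raw residuals.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta

    # Standardized (internally studentized) residuals.
    mse = resid @ resid / (n - p)
    std_resid = resid / np.sqrt(mse * (1.0 - leverage))

    # Cook's distance combines residual size and leverage.
    cooks_d = std_resid**2 / p * leverage / (1.0 - leverage)

    # Labeling rules quoted in the text above.
    leverage_cutoff = min(3.0 * p / n, 0.99)
    return {
        "leverage": leverage,
        "std_resid": std_resid,
        "cooks_d": cooks_d,
        "X": leverage > leverage_cutoff,
        "R": np.abs(std_resid) > 2.0,
    }


# Synthetic data (not the heat flux data): a straight line with one extreme
# x value (last row) and one shifted response (row 10).
x = np.append(np.arange(20.0), 100.0)
y = 2.0 + 3.0 * x
y[10] += 5.0
design = np.column_stack([np.ones_like(x), x])

diag = unusual_observation_diagnostics(design, y)
print("X flags at rows:", np.flatnonzero(diag["X"]))
print("R flags at rows:", np.flatnonzero(diag["R"]))
```

In this synthetic example the row with the extreme x value is the only one that exceeds the leverage cutoff, and the row with the shifted response is the only one with a standardized residual beyond 2 in absolute value.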

The observations that Minitab labels do not follow the proposed regression equation well. However, you should expect some unusual observations even when the model is adequate. For example, because about 5% of values from a normal distribution lie more than 2 standard deviations from the mean, you would expect roughly 5% of your observations to be flagged as having a large residual.
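The 5% figure can be checked with a short calculation. The sketch below assumes the standardized residuals are approximately standard normal, which is only an approximation, and uses a hypothetical sample size of 100.

```python
import math

# P(|Z| > 2) for a standard normal variable Z.
share_flagged = 1.0 - math.erf(2.0 / math.sqrt(2.0))

n_obs = 100  # hypothetical sample size, for illustration only
expected_flags = n_obs * share_flagged

print(f"share beyond 2 standard deviations: {share_flagged:.4f}")
print(f"expected R flags among {n_obs} observations: {expected_flags:.1f}")
```

The exact share is a little under 5%, so a handful of R labels in a moderately sized data set is not, by itself, evidence of a problem.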

Example of the table of unusual observations

Fits and Diagnostics for Unusual Observations

Obs  Heat Flux     Fit  Resid  Std Resid
  1     271.80  274.74  -2.94      -0.40     X
 22     254.50  230.91  23.59       2.74  R

R  Large residual
X  Unusual X

In the previous output, observation 1 is denoted with an X, identifying it as a leverage point. Observation 22, denoted with an R, is an outlier.

Determine how unusual observations affect the model

To determine how much effect an unusual observation has, fit the model with and without the observation and compare the coefficients, p-values, R², and other model statistics. If the model changes substantially when you remove the observation, first determine whether the observation is a data entry or measurement error. If it is not, examine the model further to determine whether you omitted an important term (for example, an interaction term) or variable, or otherwise specified the model incorrectly. You might need to collect more data to reach a resolution.
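The with-and-without comparison can be sketched as follows. This is an illustration under stated assumptions, not a Minitab procedure: the helper names `fit_ols` and `compare_with_and_without` are hypothetical, the design matrix is assumed to include an intercept column (so that R² has its usual meaning), and the data are synthetic. P-values are omitted to keep the example short.

```python
import numpy as np

def fit_ols(X, y):
    """Return least-squares coefficients and R-squared for an intercept model."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    ss_res = resid @ resid
    ss_tot = np.sum((y - y.mean()) ** 2)
    return beta, 1.0 - ss_res / ss_tot

def compare_with_and_without(X, y, row):
    """Fit once on all rows and once with the given row removed."""
    keep = np.ones(len(y), dtype=bool)
    keep[row] = False
    beta_all, r2_all = fit_ols(X, y)
    beta_wo, r2_wo = fit_ols(X[keep], y[keep])
    return {
        "beta_all": beta_all,
        "beta_without": beta_wo,
        "beta_change": beta_wo - beta_all,
        "r2_all": r2_all,
        "r2_without": r2_wo,
    }


# Synthetic data: an exact line y = 2 + 3x, except that row 10 is shifted up.
x = np.append(np.arange(20.0), 100.0)
y = 2.0 + 3.0 * x
y[10] += 5.0
design = np.column_stack([np.ones_like(x), x])

result = compare_with_and_without(design, y, row=10)
for name, value in result.items():
    print(name, value)
```

A large entry in `beta_change`, or a marked difference between the two R² values, suggests that the removed observation is influential and deserves the follow-up checks described above.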