# Correlation

## Summary

Measures the strength and direction of a linear relationship between two variables.

• How strong is the linear relationship between two variables?

| When to Use | Purpose |
| --- | --- |
| Start of project | Helps develop alternative measurement systems in cases when a variable is difficult or expensive to measure; highly correlated and logically linked alternative variables can be used as substitute variables. |
| Mid-project | Assess if an input (X) has a strong linear relationship with an output (Y) to help eliminate noncritical X's from consideration. |
| Mid-project | Evaluate two inputs to identify whether they duplicate the same information. For example, inputs of "Degree Obtained" and "Years of School" are likely to explain the same variation of the output, so one of them may be eliminated. This is used primarily in multiple regression with many variables. |
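
The check for inputs that duplicate the same information can be sketched outside Minitab as a pairwise scan. The following is a minimal illustration in plain Python, not Minitab's procedure: the helper names (`pearson_r`, `redundant_pairs`), the 0.9 cutoff, and the data values are all assumptions made up for this example.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def redundant_pairs(inputs, threshold=0.9):
    """Return (name_a, name_b, r) for every input pair with |r| >= threshold.

    The 0.9 default is an illustrative cutoff, not a Minitab rule.
    """
    flagged = []
    for (name_a, col_a), (name_b, col_b) in combinations(inputs.items(), 2):
        r = pearson_r(col_a, col_b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, r))
    return flagged


# Made-up illustrative data: one list per candidate input (X).
inputs = {
    "degree_level": [1, 1, 2, 2, 3, 3, 4, 4],
    "years_of_school": [12, 13, 14, 16, 16, 18, 20, 21],
    "commute_minutes": [35, 10, 50, 20, 15, 45, 25, 30],
}

for name_a, name_b, r in redundant_pairs(inputs):
    print(f"{name_a} vs {name_b}: r = {r:.3f}")
```

With this data only the `degree_level` and `years_of_school` pair is flagged. Which of the two to drop remains a judgment call based on cost of measurement and logical link to the output.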

### Data

Two or more numeric variables (Y versus X or X1 versus X2, and so on)

## How-To

Collect numeric data and enter it in a Minitab worksheet, one column per variable.
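
To make the worksheet layout concrete, here is a small Python sketch that mirrors it: one list per column, with the Pearson coefficient computed from its definition. The column names and values are invented for illustration. The t statistic shown is the one used in the usual test of zero correlation (compared against a t distribution with n - 2 degrees of freedom); that Minitab derives its p-value exactly this way is an assumption here.

```python
from math import sqrt

# Hypothetical worksheet: one column (list) per variable, rows aligned.
worksheet = {
    "X": [1, 2, 3, 4, 5],
    "Y": [2, 4, 5, 4, 5],
}


def pearson_r(x, y):
    """Pearson r: covariance term divided by the product of the spreads."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing zero correlation, with n - 2 degrees of freedom."""
    return r * sqrt((n - 2) / (1 - r ** 2))


r = pearson_r(worksheet["X"], worksheet["Y"])
t = t_statistic(r, len(worksheet["X"]))
print(f"r = {r:.4f}, t = {t:.4f}, df = {len(worksheet['X']) - 2}")
```

The sign of `r` gives the direction of the relationship and its magnitude gives the strength; `r` always falls between -1 and 1.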

## Guidelines

• Correlation does not imply causation. Shark attacks and ice cream sales in Florida are highly correlated, but one does not cause the other.
• Pearson’s correlation measures the strength and direction of a linear relationship only; it cannot detect quadratic, cubic, or other curved relationships, so a coefficient near zero does not rule out a strong nonlinear one. To evaluate nonlinear models or to identify outliers, use a fitted line plot instead.
• Minitab provides a p-value to determine the statistical significance of the correlation; however, the practical significance depends on the situation.
• Discrete variables should have at least 10 distinct values.
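
Two of the guidelines above lend themselves to a quick demonstration. The sketch below, in plain Python with made-up data, shows a perfectly quadratic relationship producing a Pearson coefficient of zero, and a simple check of the 10-distinct-values guideline. The function names are hypothetical helpers for this example, not part of Minitab.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def has_enough_distinct_values(values, minimum=10):
    """Check the guideline that a discrete variable has at least 10 distinct values."""
    return len(set(values)) >= minimum


x = list(range(-5, 6))              # -5, -4, ..., 5
y_linear = [2 * v + 1 for v in x]   # exact straight line
y_quadratic = [v ** 2 for v in x]   # exact parabola, symmetric about x = 0

r_linear = pearson_r(x, y_linear)
r_quadratic = pearson_r(x, y_quadratic)
print(f"linear:    r = {r_linear:.3f}")
print(f"quadratic: r = {r_quadratic:.3f}")

# A 1-to-5 rating scale has too few distinct values under the guideline.
ratings = [1, 2, 3, 4, 5, 3, 4, 2, 5, 1]
print(has_enough_distinct_values(x))        # 11 distinct values
print(has_enough_distinct_values(ratings))  # 5 distinct values
```

Here `y_quadratic` is completely determined by `x`, yet the coefficient is zero because the rising and falling halves of the parabola cancel. This is why a plot of the data is worth checking alongside the coefficient.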