Measures the strength and direction of a linear relationship between two variables.

Answers the question:

- How strong is the linear relationship between two variables?
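
As a rough illustration of what "strength and direction" mean numerically, the sketch below computes the standard Pearson correlation coefficient r by hand. The function name `pearson_r` and the sample data are hypothetical, chosen only for this example; Minitab's own implementation is not shown in this guide.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical data for illustration only.
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]     # rises exactly in step with x
y_down = [10, 8, 6, 4, 2]   # falls exactly in step with x
y_noisy = [2, 1, 4, 3, 5]   # rises with x, but not perfectly

r_up = pearson_r(x, y_up)
r_down = pearson_r(x, y_down)
r_noisy = pearson_r(x, y_noisy)
print(r_up, r_down, r_noisy)
```

The sign of r gives the direction of the relationship, and its distance from zero gives the strength: values near +1 or -1 indicate a tight linear pattern, values near 0 a weak one.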

When to Use | Purpose |
---|---|
Start of project | Helps develop alternative measurement systems in cases when a variable is difficult or expensive to measure; highly correlated and logically linked alternative variables can be used as substitute variables. |
Mid-project | Assess if an input (X) has a strong linear relationship with an output (Y) to help eliminate noncritical X's from consideration. |
Mid-project | Evaluate two inputs to identify whether they duplicate the same information. For example, inputs of "Degree Obtained" and "Years of School" are likely to explain the same variation of the output, so one of them may be eliminated. This is used primarily in multiple regression with many variables. |
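
The duplicate-information check in the last row can be sketched as follows. Everything here is an assumption made for illustration: the data are invented, "Degree Obtained" is given a hypothetical ordinal coding, and the 0.9 cutoff is an arbitrary choice rather than a Minitab default. Note also that such a coarse coding has far fewer distinct values than one would normally want for a correlation analysis.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical inputs. Degree is coded as an ordinal number:
# 1 = high school, 2 = bachelor's, 3 = master's, 4 = doctorate.
degree_obtained = [1, 1, 2, 2, 2, 3, 3, 4]
years_of_school = [12, 12, 16, 16, 17, 18, 19, 22]

REDUNDANCY_CUTOFF = 0.9  # assumed threshold, not a Minitab default

r = pearson_r(degree_obtained, years_of_school)
# Flag the pair when the two inputs move almost in lockstep.
redundant = abs(r) >= REDUNDANCY_CUTOFF
print(f"r = {r:.3f}; consider dropping one input: {redundant}")
```

A flagged pair is only a candidate for elimination; which input to keep is a judgment about cost, data quality, and interpretability.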

Two or more numeric variables (Y versus X or X1 versus X2, and so on)

Collect numeric data and enter it in a Minitab worksheet, one column per variable.
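
Outside Minitab, the same one-column-per-variable layout can be mirrored with a plain dictionary of columns. The sketch below is a minimal illustration using invented column names and values; it computes the correlation for every pair of columns, roughly what a correlation matrix reports.

```python
from itertools import combinations
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical worksheet: one column per variable, one row per observation.
worksheet = {
    "Y":  [10.0, 12.0, 15.0, 19.0, 24.0],
    "X1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "X2": [5.0, 3.0, 4.0, 1.0, 2.0],
}

# Each row is one observation, so all columns must have the same length.
if len({len(col) for col in worksheet.values()}) != 1:
    raise ValueError("all columns must have the same number of rows")

# Correlation for every unordered pair of columns.
pairwise = {
    (a, b): pearson_r(worksheet[a], worksheet[b])
    for a, b in combinations(worksheet, 2)
}
for (a, b), r in pairwise.items():
    print(f"{a} vs {b}: r = {r:+.3f}")
```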

- Correlation does not imply causation. Shark attacks and ice cream sales in Florida are highly correlated, but one does not cause the other; a more plausible explanation is a shared driver, such as warm weather bringing more people to the beach.
- Pearson’s correlation measures the strength and direction of a linear relationship only; it cannot evaluate quadratic, cubic, or other curved relationships. To evaluate nonlinear models or to identify outliers, use a fitted line plot instead.
- Minitab provides a p-value to determine the statistical significance of the correlation; however, the practical significance depends on the situation.
- Discrete variables should have at least 10 distinct values.
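
The linear-only limitation noted above is easy to demonstrate. In the minimal sketch below (invented data, hypothetical helper name), Y is completely determined by X through a quadratic, yet the Pearson coefficient is essentially zero because the relationship is not linear.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical data, symmetric around zero.
x = [-3, -2, -1, 0, 1, 2, 3]
y_quadratic = [v ** 2 for v in x]   # perfect, but curved, dependence on x
y_linear = [2 * v + 1 for v in x]   # perfect linear dependence on x

r_quadratic = pearson_r(x, y_quadratic)
r_linear = pearson_r(x, y_linear)
print(f"quadratic: r = {r_quadratic:.3f}, linear: r = {r_linear:.3f}")
```

A near-zero r therefore does not rule out a relationship; plotting the data remains the quickest way to spot curvature or outliers.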