The correlation matrix shows the correlation values, which measure the degree of linear relationship between each pair of variables. The correlation values can fall between −1 and +1. If the two variables tend to increase and decrease together, the correlation value is positive. If one variable increases while the other variable decreases, the correlation value is negative.
Use the correlation matrix to assess the strength and direction of the relationship between two variables. A high, positive correlation value suggests that the variables measure the same characteristic. If the items are not highly correlated, then the items may measure different characteristics or may not be clearly defined.
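As an illustration of how a correlation matrix is built, the following minimal sketch computes a correlation value for every pair of columns. The column names and data are hypothetical, and the helper functions `pearson` and `correlation_matrix` are written for this example only; they are not Minitab functions.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(columns):
    """Return a dict that maps each (name_a, name_b) pair to its correlation."""
    names = list(columns)
    return {(a, b): pearson(columns[a], columns[b]) for a in names for b in names}

# Hypothetical data: B tends to rise with A, C falls as A rises.
data = {
    "A": [1, 2, 3, 4, 5],
    "B": [2, 4, 5, 4, 5],
    "C": [5, 4, 3, 2, 1],
}

matrix = correlation_matrix(data)
for a in data:
    print(a, [round(matrix[(a, b)], 3) for b in data])
```

In this sketch, the entry for A and B is positive because the two columns tend to increase together, and the entry for A and C is negative because one column decreases as the other increases.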
Use the Spearman correlation coefficient to examine the strength and direction of the monotonic relationship between two continuous or ordinal variables. In a monotonic relationship, the variables tend to move in the same relative direction, but not necessarily at a constant rate. To calculate the Spearman correlation, Minitab ranks the raw data. Then, Minitab calculates the correlation coefficient on the ranked data.
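The two steps above (rank the raw data, then calculate the correlation coefficient on the ranks) can be sketched as follows. This is a minimal illustration, not Minitab's implementation. It assumes that tied values receive the average of their ranks, and the function names and data are hypothetical.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values that starts at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the ranked data."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical monotonic but nonlinear data: y = x**3.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

print("Spearman:", spearman(x, y))
print("Pearson: ", pearson(x, y))
```

Because y always increases when x increases, the ranks of the two variables are identical and the Spearman correlation reaches its maximum, while the Pearson correlation on the raw data is lower because the relationship is not linear.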
The correlation coefficient can range in value from −1 to +1. The larger the absolute value of the coefficient, the stronger the relationship between the variables.
For the Spearman correlation, an absolute value of 1 indicates that the rank-ordered data are perfectly linear. For example, a Spearman correlation of −1 means that the highest value for Variable A is associated with the lowest value for Variable B, the second highest value for Variable A is associated with the second lowest value for Variable B, and so on.
The sign of the coefficient indicates the direction of the relationship. If both variables tend to increase or decrease together, the coefficient is positive, and the line that represents the correlation slopes upward. If one variable tends to increase as the other decreases, the coefficient is negative, and the line that represents the correlation slopes downward.
The following plots show data with specific Spearman correlation coefficient values to illustrate different patterns in the strength and direction of the relationships between variables.
It is never appropriate to conclude that changes in one variable cause changes in another based on correlation alone. Only properly controlled experiments enable you to determine whether a relationship is causal.
The number of rows used is displayed in the Method table. It is the number of rows of data, including rows that contain missing values.
When you have missing values, the number of rows used is not the same as the actual sample size that is used in the confidence interval calculation.
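The following sketch illustrates the difference between the two counts. It assumes that a row contributes to the calculation for a pair of variables only when both values are present. That handling of missing values is an assumption made for this example, and the data are hypothetical.

```python
# Hypothetical paired data; None marks a missing value.
rows = [
    (1.2, 3.4),
    (2.5, None),
    (3.1, 5.0),
    (None, 6.2),
    (4.8, 7.9),
]

# Number of rows used: every row counts, including rows with missing values.
rows_used = len(rows)

# Sample size for the pair: only rows where both values are present.
complete_pairs = [(a, b) for a, b in rows if a is not None and b is not None]
sample_size = len(complete_pairs)

print("Rows used:  ", rows_used)
print("Sample size:", sample_size)
```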
The confidence interval provides a range of likely values for the correlation coefficient. Because samples are random, two samples from a population are unlikely to yield identical confidence intervals. But, if you repeated the sampling many times, a certain percentage of the resulting confidence intervals or bounds would contain the unknown correlation coefficient. The percentage of these confidence intervals or bounds that contain the correlation coefficient is the confidence level of the interval.
For example, a 95% confidence level indicates that if you take 100 random samples from the population, you could expect approximately 95 of the samples to produce intervals that contain the correlation coefficient.
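The meaning of the confidence level can be checked with a small simulation. The sketch below draws many samples from a population with a known correlation and counts how often the interval contains that value. It is an illustration under stated assumptions: it uses a Pearson correlation with an interval based on Fisher's z transformation on bivariate normal data, which is not necessarily the method that Minitab uses, and all function names and parameter values are hypothetical.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fisher_interval(r, n, z_crit=1.959964):
    """Approximate 95% interval for a correlation using Fisher's z transformation."""
    z = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)

def simulate_coverage(rho, n, repetitions, seed=0):
    """Fraction of simulated samples whose interval contains the true rho."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repetitions):
        x, y = [], []
        for _ in range(n):
            u = rng.gauss(0, 1)
            v = rng.gauss(0, 1)
            x.append(u)
            # Mixing u and v this way gives y a population correlation of rho with x.
            y.append(rho * u + math.sqrt(1 - rho ** 2) * v)
        low, high = fisher_interval(pearson(x, y), n)
        if low <= rho <= high:
            hits += 1
    return hits / repetitions

coverage = simulate_coverage(rho=0.5, n=50, repetitions=2000)
print("Proportion of intervals that contain rho:", coverage)
```

With these settings, the printed proportion should be close to 0.95, although it varies slightly with the seed and the number of repetitions.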
An upper bound defines a value that the population correlation coefficient is likely to be less than. A lower bound defines a value that the population correlation coefficient is likely to be greater than.
The confidence intervals for the Pearson correlation are sensitive to the normality of the underlying bivariate distribution. If the data deviate from normality, then the confidence intervals may be inaccurate regardless of the magnitude of the sample size.
The confidence intervals for Spearman correlations are based on ranks and are less sensitive to the underlying bivariate distribution assumption.
The confidence interval helps you assess the practical significance of your results. Use your specialized knowledge to determine whether the confidence interval includes values that have practical significance for your situation. If the interval is too wide to be useful, consider increasing your sample size. For more information, go to Ways to get a more precise confidence interval.
The p-value is a probability that measures the evidence against the null hypothesis. A smaller p-value provides stronger evidence against the null hypothesis.
Use the p-value to determine whether the correlation coefficient is statistically significant.
The p-value procedures for both Pearson correlation and Spearman correlation are robust to departures from normality. The p-values are usually accurate for n ≥ 25, regardless of the parent population of the sample.
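One way to see what the p-value measures is a permutation test, sketched below. This is a substitute technique chosen for illustration, not the procedure that Minitab uses. It estimates how often a Spearman correlation at least as strong as the observed one occurs when the pairing between the two variables is shuffled at random, which mimics the null hypothesis of no association. The function names, data, and number of permutations are hypothetical.

```python
import math
import random

def average_ranks(values):
    """Rank values from 1 to n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def permutation_p_value(x, y, permutations=2000, seed=0):
    """Two-sided permutation p-value for the Spearman correlation."""
    rng = random.Random(seed)
    rank_x = average_ranks(x)
    rank_y = average_ranks(y)
    observed = abs(pearson(rank_x, rank_y))
    shuffled = list(rank_y)
    extreme = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)  # break the pairing between x and y
        if abs(pearson(rank_x, shuffled)) >= observed - 1e-12:
            extreme += 1
    # The add-one correction keeps the estimate from being exactly zero.
    return (extreme + 1) / (permutations + 1)

# Hypothetical data with a strong increasing relationship.
x = list(range(1, 13))
y = [2, 1, 4, 3, 6, 5, 8, 7, 10, 9, 12, 11]

p_strong = permutation_p_value(x, y)
print("Estimated p-value:", p_strong)
```

A small estimated p-value means that shuffled data rarely produce a correlation as strong as the observed one, which is evidence against the null hypothesis.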
The matrix plot is an array of scatterplots. Each scatterplot in the matrix graphs the scores for a pair of items on the x and y axes.
Use the plot to visually assess the relationship between every combination of variables. The relationships can be linear, monotonic, or neither. Also use the matrix plot to look for outliers that can heavily influence the results. For more information on the types of relationships, go to Linear, nonlinear, and monotonic relationships.