Measures of association for Cross Tabulation and Chi-Square

Find definitions and interpretation guidance for every statistic that is provided with the measures of association.

In This Topic

Fisher's exact test, P-value
McNemar's test
Cochran-Mantel-Haenszel's test
Cramer's V-square
Kappa
Goodman-Kruskal Lambda and Tau
Concordance measures for ordinal categories
Pearson's r and Spearman's rho

Fisher's exact test, P-value

Fisher's exact test is a test of independence. It is useful when the expected cell counts are low and the chi-square approximation is unreliable.

The p-value is a probability that measures the evidence against the null hypothesis. Lower probabilities provide stronger evidence against the null hypothesis.

Use the p-value to determine whether to reject or fail to reject the null hypothesis, which states that the variables are independent.

Interpretation

To determine whether variables are independent, compare the p-value to the significance level. Usually, a significance level (denoted as α or alpha) of 0.05 works well. A significance level of 0.05 indicates a 5% risk of concluding that an association between the variables exists when there is no actual association.

P-value ≤ α: The variables have a statistically significant association (Reject H₀): If the p-value is less than or equal to the significance level, you reject the null hypothesis and conclude that there is a statistically significant association between the variables.
P-value > α: Cannot conclude that the variables are associated (Fail to reject H₀): If the p-value is larger than the significance level, you fail to reject the null hypothesis because there is not enough evidence to conclude that the variables are associated.

For more information, go to What is Fisher's exact test?.
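As an illustration of where such a p-value comes from, the following is a minimal sketch for a 2×2 table, assuming the common two-sided rule of summing the probabilities of all tables that are no more likely than the observed one. The function name `fisher_exact_two_sided` is hypothetical, and the sketch is not necessarily the exact computation that Minitab performs.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)   # smallest feasible top-left count
    hi = min(row1, col1)       # largest feasible top-left count
    p_obs = prob(a)
    # Assumed two-sided rule: add up every table that is no more likely
    # than the observed one (small tolerance guards against rounding).
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)

alpha = 0.05
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p_value, 4), "reject H0" if p_value <= alpha else "fail to reject H0")
```

For this small table the p-value is well above 0.05, so you would fail to reject the null hypothesis of independence.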

McNemar's test

Use McNemar's test to determine whether paired proportions are different.

Interpretation

Estimated difference

Minitab calculates the difference between the marginal proportions.

95% CI

Minitab calculates the 95% confidence interval for the difference between the marginal probabilities.

A 95% confidence interval (95% CI) is a range of values that is likely to contain the true value of the difference between the marginal probabilities.

P

Minitab calculates the p-value to test the null hypothesis.

To determine whether the marginal probabilities are significantly different, compare the p-value to your significance level (denoted as α or alpha) to assess the null hypothesis. The null hypothesis states that the marginal probabilities are equal. Usually, a significance level of 0.05 works well. A significance level of 0.05 indicates a 5% risk of concluding that a difference exists when it does not.

P-value ≤ α: The marginal probabilities are statistically different: If the p-value is less than or equal to the significance level, you reject the null hypothesis and conclude that the marginal probabilities are significantly different. For example, the before probability is different from the after probability.
P-value > α: The marginal probabilities are not significantly different: If the p-value is greater than the significance level, you fail to reject the null hypothesis because you do not have enough evidence to conclude that the marginal probabilities are different. For example, you cannot conclude that the before and after probabilities are different.

For more information, go to Why should I use McNemar's test?.
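The estimated difference and p-value can be sketched for a paired 2×2 table as follows. This sketch assumes the exact (binomial) form of McNemar's test, which uses only the discordant pairs, and it takes the difference as the row marginal proportion minus the column marginal proportion. The function name `mcnemar_exact` is hypothetical, and the confidence interval is omitted because its method is not described here.

```python
from math import comb

def mcnemar_exact(n11, n12, n21, n22):
    """Return (estimated difference, two-sided exact p-value) for paired data.

    Table layout (assumed): rows = first measurement, columns = second.
    """
    n = n11 + n12 + n21 + n22
    # (n11 + n12)/n - (n11 + n21)/n simplifies to (n12 - n21)/n.
    diff = (n12 - n21) / n
    m = n12 + n21              # discordant pairs carry all the information
    if m == 0:
        return diff, 1.0
    k = min(n12, n21)
    # Under H0 each discordant pair falls in either cell with probability 0.5.
    tail = sum(comb(m, i) for i in range(k + 1)) / 2 ** m
    return diff, min(1.0, 2 * tail)

diff, p_value = mcnemar_exact(10, 5, 15, 20)
print(diff, round(p_value, 4))
```

In this example the p-value falls just below 0.05, so the before and after probabilities would be judged significantly different at that level.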

Cochran-Mantel-Haenszel's test

Use the CMH test to test the conditional association of two binary variables in the presence of a third categorical variable.

Minitab calculates a common-odds ratio across the tables and a p-value to assess its significance.

Interpretation

Common-odds ratio: Minitab calculates the common-odds ratio, which indicates the strength of association.
CMH statistic: The CMH statistic is used to indicate whether the association is statistically significant.
DF: The CMH statistic is compared with a chi-square percentile with one degree of freedom.
P-value: Minitab calculates the p-value to test the null hypothesis. Use the p-value to determine whether to reject or fail to reject the null hypothesis, which states that the two binary variables are independent, conditional on the third variable.

For more information, go to What is the Cochran-Mantel-Haenszel test?.
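The quantities listed above can be sketched for a set of 2×2 tables, one per level of the third variable. The sketch assumes the Mantel-Haenszel estimator of the common odds ratio and a CMH statistic without continuity correction; whether Minitab applies a correction is not stated here. The function name `cmh_test` is hypothetical.

```python
from math import erfc, sqrt

def cmh_test(tables):
    """tables: iterable of 2x2 tables ((a, b), (c, d)), one per stratum.

    Returns (common odds ratio, CMH statistic, p-value).
    """
    num = den = 0.0    # pieces of the Mantel-Haenszel odds ratio
    dev = var = 0.0    # observed-minus-expected and variance of cell a
    for (a, b), (c, d) in tables:
        n = a + b + c + d
        num += a * d / n
        den += b * c / n
        dev += a - (a + b) * (a + c) / n
        var += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
    odds_ratio = num / den
    stat = dev * dev / var          # assumed: no continuity correction
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = erfc(sqrt(stat / 2))
    return odds_ratio, stat, p_value

strata = [((10, 5), (5, 10)), ((8, 4), (4, 8))]
odds_ratio, stat, p_value = cmh_test(strata)
print(round(odds_ratio, 3), round(stat, 3), round(p_value, 4))
```

A common odds ratio of 1 corresponds to no conditional association; values farther from 1 indicate a stronger association.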

Cramer's V-square

Cramer's V² measures association between two variables (the row variable and the column variable). Cramer's V² values range from 0 to 1. Larger values of Cramer's V² indicate a stronger relationship between the variables, and smaller values of V² indicate a weaker relationship. A value of 0 indicates that there is no association. A value of 1 indicates that there is a very strong association between the variables.
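As a rough illustration, V² can be computed from the Pearson chi-square statistic as χ² / (N × min(rows − 1, columns − 1)). The following minimal sketch assumes that definition and a table with no empty rows or columns; the function name `cramers_v_squared` is hypothetical.

```python
def cramers_v_squared(table):
    """Cramer's V-square for a two-way table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for row, rt in zip(table, row_totals):
        for observed, ct in zip(row, col_totals):
            expected = rt * ct / n      # count expected under independence
            chi2 += (observed - expected) ** 2 / expected
    return chi2 / (n * min(len(table) - 1, len(table[0]) - 1))

print(cramers_v_squared([[10, 0], [0, 10]]))  # 1.0
print(cramers_v_squared([[5, 5], [5, 5]]))    # 0.0
```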

Kappa

Kappa measures the degree of agreement of the nominal or ordinal assessments that are made by multiple appraisers when assessing the same samples. When you have ordinal ratings, such as defect severity ratings on a scale of 1-5, the concordance measures for ordinal categories, which take ordering into consideration, are usually more appropriate statistics to determine association than kappa alone.

Interpretation

Kappa values range from -1 to +1. The higher the value of kappa, the stronger the agreement.

When:

Kappa = 1, perfect agreement exists.
Kappa = 0, agreement is the same as would be expected by chance.
Kappa < 0, agreement is weaker than expected by chance; this rarely occurs.
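The interpretation above follows from how kappa compares observed agreement with the agreement expected by chance. The following is a minimal sketch, assuming the two-appraiser (Cohen's) form of kappa computed from a square table whose rows are one appraiser's ratings and whose columns are the other's. The function name `cohens_kappa` is hypothetical.

```python
def cohens_kappa(table):
    """Cohen's kappa for a square agreement table given as a list of rows."""
    n = sum(sum(row) for row in table)
    # Observed agreement: proportion of samples on the diagonal.
    p_observed = sum(table[i][i] for i in range(len(table))) / n
    row_props = [sum(row) / n for row in table]
    col_props = [sum(col) / n for col in zip(*table)]
    # Chance agreement: both appraisers pick the same category independently.
    p_expected = sum(r * c for r, c in zip(row_props, col_props))
    return (p_observed - p_expected) / (1 - p_expected)

print(round(cohens_kappa([[20, 5], [10, 15]]), 3))  # 0.4
```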

Goodman-Kruskal Lambda and Tau

Goodman-Kruskal lambda (λ) and tau (τ) measure the strength of association based on the ability to correctly guess or predict the value of one variable when you know the value of the other. Lambda is based on modal probabilities, while tau is based on random category assignment.

Interpretation

Lambda (λ): Lambda measures the percentage improvement in predicting the dependent variable (column or row variable) given the value of the other variable (row or column variable). Values of lambda range from 0 to 1. A value of 0 means that the independent variable does not improve the prediction of the categories of the dependent variable. A value of 1 means that the independent variable completely predicts the categories of the dependent variable. A value of 0.5 means that the prediction error is reduced by 50%.
Tau (τ): Tau measures the percentage improvement in predictability of the dependent variable (column or row variable) given the value of the other variable (row or column variable). Goodman-Kruskal tau is the same as Goodman-Kruskal lambda except that the calculations of the tau statistic are based on assignment probabilities specified by marginal or conditional proportions. Values of tau range from 0 to 1. A value of 0 indicates the absence of association, and a value of 1 indicates that the independent variable completely predicts the categories of the dependent variable.

For more information, go to What are the Goodman-Kruskal statistics?.
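Both statistics can be read as a proportional reduction in prediction error. The following minimal sketch assumes the column variable is the dependent variable and the row variable is the predictor; transpose the table for the other case. The function name `goodman_kruskal` is hypothetical.

```python
def goodman_kruskal(table):
    """Return (lambda, tau) with columns dependent and rows as predictor."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Lambda: always guess the modal category.
    errors_without = n - max(col_totals)              # ignoring the row
    errors_with = n - sum(max(row) for row in table)  # knowing the row
    lam = (errors_without - errors_with) / errors_without

    # Tau: guess categories at random in proportion to their frequencies.
    correct_without = sum((ct / n) ** 2 for ct in col_totals)
    correct_with = sum(
        x * x / rt
        for row, rt in zip(table, row_totals) if rt
        for x in row
    ) / n
    tau = (correct_with - correct_without) / (1 - correct_without)
    return lam, tau

lam, tau = goodman_kruskal([[30, 10], [10, 50]])
print(round(lam, 3), round(tau, 3))
```

Here lambda is 0.5: knowing the row category halves the number of prediction errors made by always guessing the most common column category.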

Concordance measures for ordinal categories

Number of concordant and discordant pairs: Use concordant and discordant pairs to describe the relationship between pairs of observations. To calculate the concordant and discordant pairs, the data are treated as ordinal, so ordinal data should be appropriate for your application. The numbers of concordant and discordant pairs are used in calculations for Kendall's tau, which measures the association between two ordinal variables. For more information, go to What are concordant and discordant pairs?.
Goodman-Kruskal Gamma (γ): Goodman-Kruskal gamma (γ) shows how many more concordant than discordant pairs exist divided by the total number of pairs excluding ties. Use the Goodman-Kruskal gamma to measure the association between the ordinal variables. Perfect association exists when |γ| = 1. In ordinal and binary logistic regression, if X and Y are independent, then γ = 0. For more information, go to What are the Goodman-Kruskal statistics?.
Somers' D: Somers' D measures the strength and direction of the relationship between pairs of variables. Somers' D values range from -1 (all pairs disagree) to 1 (all pairs agree). Minitab displays two values for D, one value for when the row variable is the dependent variable, and one value for when the column variable is the dependent variable. You must decide which case is appropriate for your analysis.
Kendall's Tau-b: Kendall's tau-b is used in cross tabulation to measure the association between two ordinal variables. Kendall's tau-b values range from -1.0 to 1.0. A positive value indicates that both variables increase together. A negative value indicates that one variable decreases as the other increases.
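The measures in this list are all built from the counts of concordant and discordant pairs. The following is a minimal sketch, assuming the rows and columns of the table are already listed in their natural order and using the standard textbook definitions of each statistic. The function name `ordinal_measures` and the keys of the returned dictionary are hypothetical.

```python
from math import sqrt

def ordinal_measures(table):
    """Concordance-based measures for an ordered two-way table."""
    rows, cols = len(table), len(table[0])
    n = sum(sum(row) for row in table)
    concordant = discordant = 0
    for i in range(rows):
        for j in range(cols):
            for k in range(i + 1, rows):      # second observation: higher row
                for m in range(cols):
                    pairs = table[i][j] * table[k][m]
                    if m > j:                 # also higher column: same order
                        concordant += pairs
                    elif m < j:               # lower column: opposite order
                        discordant += pairs
    total_pairs = n * (n - 1) // 2
    tied_rows = sum(t * (t - 1) // 2 for t in map(sum, table))
    tied_cols = sum(t * (t - 1) // 2 for t in map(sum, zip(*table)))
    diff = concordant - discordant
    return {
        "concordant": concordant,
        "discordant": discordant,
        "gamma": diff / (concordant + discordant),
        # Somers' D drops pairs tied on the independent variable.
        "somers_d_column_dependent": diff / (total_pairs - tied_rows),
        "somers_d_row_dependent": diff / (total_pairs - tied_cols),
        "tau_b": diff / sqrt((total_pairs - tied_rows) * (total_pairs - tied_cols)),
    }

print(ordinal_measures([[2, 1], [1, 2]]))
```

For this table there are 4 concordant pairs and 1 discordant pair, so gamma is (4 − 1) / (4 + 1) = 0.6, while Somers' D and tau-b are smaller because their denominators include some tied pairs.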

Test of Concordance

The test of concordance is a test of independence. The p-value is a probability that measures the evidence against the null hypothesis. Lower probabilities provide stronger evidence against the null hypothesis. Use the p-value to determine whether to reject or fail to reject the null hypothesis, which states that the categorical variables are independent.

If the p-value is less than or equal to the significance level, you reject the null hypothesis and conclude that there is a statistically significant association between the variables. If the p-value is larger than the significance level, you fail to reject the null hypothesis because there is not enough evidence to conclude that the variables are associated.

Pearson's r and Spearman's rho

Use Pearson's r and Spearman's rho to assess the association between two variables that have ordinal categories. Ordinal categories have a natural order, such as small, medium, and large.

The coefficient can range in value from -1 to +1. The larger the absolute value of the coefficient, the stronger the relationship between the variables. An absolute value of 1 indicates a perfect relationship, and a value of zero indicates the absence of an ordinal relationship. Whether an intermediate value is interpreted as a weak, medium, or strong correlation depends on your goals and requirements.

For more information, go to What are Spearman's rho and Pearson's r for ordinal categories?.
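One way to obtain these coefficients from a two-way table is to assign a numeric score to each ordered category and compute a correlation weighted by the cell counts. The following minimal sketch assumes integer scores 1, 2, 3, ... for Pearson's r and midranks for Spearman's rho; the scoring that Minitab uses is not stated here, so treat both choices as assumptions. All function names are hypothetical.

```python
from math import sqrt

def weighted_corr(table, row_scores, col_scores):
    """Correlation of row and column scores, weighted by cell counts."""
    n = sum(sum(row) for row in table)
    mean_r = sum(s * sum(row) for s, row in zip(row_scores, table)) / n
    mean_c = sum(s * sum(col) for s, col in zip(col_scores, zip(*table))) / n
    cov = var_r = var_c = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            dr = row_scores[i] - mean_r
            dc = col_scores[j] - mean_c
            cov += count * dr * dc
            var_r += count * dr * dr
            var_c += count * dc * dc
    return cov / sqrt(var_r * var_c)

def midranks(totals):
    """Average rank of the observations in each ordered category."""
    ranks, below = [], 0
    for t in totals:
        ranks.append(below + (t + 1) / 2)
        below += t
    return ranks

def pearson_and_spearman(table):
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    r = weighted_corr(table, range(1, len(table) + 1), range(1, len(table[0]) + 1))
    rho = weighted_corr(table, midranks(row_totals), midranks(col_totals))
    return r, rho

print(pearson_and_spearman([[5, 0, 0], [0, 5, 0], [0, 0, 5]]))
```

When every observation falls on the diagonal of the table, as in this example, both coefficients equal 1, which corresponds to the perfect relationship described above.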