Interpret the key results for Cross Tabulation and Chi-Square

Complete the following steps to interpret a cross tabulation analysis. Key output includes counts and expected counts, chi-square statistics, and p-values.

Step 1: Determine whether the association between the variables is statistically significant

Use the p-value to determine whether to reject or fail to reject the null hypothesis, which states that the variables are independent.

To determine whether the association between the variables is statistically significant, compare the p-value to the significance level. Usually, a significance level (denoted as α or alpha) of 0.05 works well. A significance level of 0.05 indicates a 5% risk of concluding that an association between the variables exists when there is no actual association.

P-value ≤ α: The variables have a statistically significant association (Reject H₀): If the p-value is less than or equal to the significance level, you reject the null hypothesis and conclude that there is a statistically significant association between the variables.
P-value > α: Cannot conclude that the variables are associated (Fail to reject H₀): If the p-value is larger than the significance level, you fail to reject the null hypothesis because there is not enough evidence to conclude that the variables are associated.
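
The decision rule above can be sketched in a few lines of Python. This is only an illustration, not part of Minitab; the function name `interpret_p_value` and its default `alpha` of 0.05 are assumptions made for the example.

```python
def interpret_p_value(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 when the p-value is <= alpha."""
    if p_value <= alpha:
        return "Reject H0: the association is statistically significant"
    return "Fail to reject H0: not enough evidence of an association"

# 0.019 is the p-value reported in this topic; 0.20 is a made-up contrast case.
print(interpret_p_value(0.019))
print(interpret_p_value(0.20))
```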

Chi-Square Test

	Chi-Square	DF	P-Value
Pearson	11.788	4	0.019
Likelihood Ratio	11.816	4	0.019

Key Output: P-Value

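In these results, the p-value for both the Pearson and the likelihood ratio chi-square tests is 0.019. Because the p-value is less than the significance level of 0.05, you reject the null hypothesis and conclude that the association between the variables is statistically significant.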
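
To see where a p-value such as 0.019 comes from, the sketch below evaluates the upper tail of the chi-square distribution for the two statistics in the table above. It is plain Python for illustration, not Minitab's implementation. The helper name `chi2_upper_tail_even_df` is hypothetical, and it relies on a closed-form expression that holds only when DF is even (DF = 4 here).

```python
import math

def chi2_upper_tail_even_df(statistic, df):
    """P(X >= statistic) for a chi-square variable with an even number of DF.

    For even df the tail probability has the closed form
    exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles positive, even df")
    half = statistic / 2.0
    terms = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * terms

# Statistics and DF taken from the Chi-Square Test table in this topic.
pearson_p = chi2_upper_tail_even_df(11.788, 4)
likelihood_ratio_p = chi2_upper_tail_even_df(11.816, 4)
print(round(pearson_p, 3), round(likelihood_ratio_p, 3))
```

For an odd number of degrees of freedom, a statistics library would be the more practical choice.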

Step 2: Examine the differences between expected counts and observed counts to determine which variable levels may have the most impact on association

The observed count is the actual number of observations in a sample that belong to a category.

The expected count is the frequency that would be expected in a cell, on average, if the variables are independent. Minitab calculates the expected counts as the product of the row and column totals, divided by the total number of observations.
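
The expected-count formula can be checked by hand. The sketch below, in plain Python, applies it to the observed counts from the Machine ID by shift table in this topic. The variable names are illustrative and are not part of Minitab.

```python
# Observed counts: rows are Machine ID 1-3, columns are 1st, 2nd, 3rd shift.
observed = [
    [48, 47, 48],
    [76, 47, 32],
    [36, 40, 34],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count = (row total * column total) / total number of observations.
expected = [
    [r * c / grand_total for c in col_totals]
    for r in row_totals
]

for row in expected:
    print([round(value, 2) for value in row])
```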

By looking at the differences between the observed cell counts and the expected cell counts, you can see which cells (combinations of variable levels) have the largest differences, which may indicate dependence. You can also compare the standardized residuals, which divide each difference by the square root of the expected count, to see which cells deviate most from the counts expected under independence.
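
A minimal sketch of the standardized residual calculation follows, assuming the definition (observed - expected) / sqrt(expected), which reproduces the values shown in the table in this step. The function name `standardized_residuals` is hypothetical, and the code is plain Python rather than Minitab output.

```python
import math

def standardized_residuals(observed):
    """Return (observed - expected) / sqrt(expected) for every cell."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    residuals = []
    for r_total, row in zip(row_totals, observed):
        out_row = []
        for c_total, count in zip(col_totals, row):
            expected = r_total * c_total / n  # count expected under independence
            out_row.append((count - expected) / math.sqrt(expected))
        residuals.append(out_row)
    return residuals

# Rows are Machine ID 1-3, columns are 1st, 2nd, 3rd shift.
observed = [[48, 47, 48], [76, 47, 32], [36, 40, 34]]
residuals = standardized_residuals(observed)
for row in residuals:
    print([round(value, 4) for value in row])
```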

Rows: Machine ID Columns: Worksheet columns

	1st shift	2nd shift	3rd shift	All

1	48	47	48	143
	56.08	46.97	39.96
	-1.0788	0.0050	1.2726

2	76	47	32	155
	60.78	50.91	43.31
	1.9516	-0.5476	-1.7184

3	36	40	34	110
	43.14	36.13	30.74
	-1.0867	0.6443	0.5889

All	160	134	114	408

Key Results: Count, Expected Count, Standardized Residual

In this cross tabulation table, the first number in each cell is the observed count, the second is the expected count, and the third is the standardized residual. In these results, the difference between the observed count (76) and the expected count (60.78) is largest for Machine 2 on the 1st shift, and this cell also has the largest standardized residual (1.9516). The observed count for this cell is higher than would be expected if the variables were independent.
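
To tie the two steps together, the sketch below recomputes the Pearson and likelihood ratio chi-square statistics from the observed counts and picks out the cell with the largest absolute standardized residual. It assumes the usual textbook formulas, sum((O - E)^2 / E) and 2 * sum(O * ln(O / E)), and is not a description of Minitab's internals. Names such as `largest_cell` are illustrative.

```python
import math

machines = ["1", "2", "3"]
shifts = ["1st shift", "2nd shift", "3rd shift"]
observed = [[48, 47, 48], [76, 47, 32], [36, 40, 34]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

pearson = 0.0
likelihood_ratio = 0.0
largest_cell = None  # (|standardized residual|, machine, shift)
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        residual = (count - expected) / math.sqrt(expected)
        pearson += residual ** 2
        # The log term assumes every observed count is greater than zero.
        likelihood_ratio += 2 * count * math.log(count / expected)
        if largest_cell is None or abs(residual) > largest_cell[0]:
            largest_cell = (abs(residual), machines[i], shifts[j])

# Degrees of freedom = (rows - 1) * (columns - 1).
df = (len(machines) - 1) * (len(shifts) - 1)
print(round(pearson, 3), round(likelihood_ratio, 3), df)
print("Largest |standardized residual|: Machine", largest_cell[1], "on", largest_cell[2])
```

The Pearson statistic is the sum of the squared standardized residuals, so the cells with the largest residuals are the ones that contribute most to the test result.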