Interpret all statistics for Cross Tabulation and Chi-Square

Find definitions and interpretation guidance for every statistic that is provided with the cross tabulation analysis.

Adjusted residuals

The adjusted residuals are the raw residuals (or the difference between the observed counts and expected counts) divided by an estimate of the standard error. Use adjusted residuals to account for the variation due to the sample size.

Minitab estimates the standard deviation of the observed counts using the formula found in Adjusted residuals.
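
The following is a minimal sketch of one common way to compute adjusted residuals, assuming the usual textbook form: the raw residual divided by the square root of expected × (1 − row total/N) × (1 − column total/N). This topic does not reproduce Minitab's formula, so treat that form as an assumption. The counts are also assumed, illustrative values for three machines and three shifts, not values copied from the output table.

```python
import math

# Assumed illustrative counts (rows: Machines 1-3, columns: 1st-3rd shift).
observed = [
    [48, 47, 48],
    [76, 47, 32],
    [36, 40, 34],
]

def adjusted_residuals(table):
    """(O - E) / sqrt(E * (1 - row_total/N) * (1 - col_total/N)) per cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    result = []
    for i, row in enumerate(table):
        out_row = []
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            # Assumed standard-error estimate for the raw residual.
            std_err = math.sqrt(
                expected * (1 - row_totals[i] / n) * (1 - col_totals[j] / n)
            )
            out_row.append((obs - expected) / std_err)
        result.append(out_row)
    return result

adj = adjusted_residuals(observed)
for machine, row in enumerate(adj, start=1):
    print(f"Machine {machine}:", [round(r, 2) for r in row])
```

A positive value marks a cell with more observations than expected, and a negative value marks a cell with fewer.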

Interpretation

You can compare the adjusted residuals in the output table to see which categories have the largest difference between the expected counts and the actual counts relative to sample size. For example, you can see which machine or shift has the largest difference between the expected number of defectives and the actual number of defectives.

In these results, the cell count is the first number in each cell, the expected count is the second number in each cell, and the adjusted residual is the third number in each cell. The positive adjusted residuals indicate that there were more defective handles than expected, adjusted for sample size. The negative adjusted residuals indicate that there were fewer defective handles than expected, adjusted for sample size.
Rows: Machine ID     Columns: Worksheet columns (1st shift, 2nd shift, 3rd shift)
Cell contents: Count, Expected count, Adjusted residual

Chi-square statistic

Minitab performs a Pearson chi-square test and a likelihood-ratio chi-square test. Each chi-square test can be used to determine whether or not the variables are associated (dependent).
Pearson chi-square test

The Pearson chi-square statistic (χ²) involves the squared difference between the observed and the expected frequencies.

Likelihood-ratio chi-square test

The likelihood-ratio chi-square statistic (G²) is based on the ratio of the observed to the expected frequencies.
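
As a rough illustration of the two definitions, the sketch below computes both statistics in plain Python. It assumes the standard forms: the sum of (O − E)²/E for the Pearson statistic, and 2 × the sum of O × ln(O/E) for the likelihood-ratio statistic. The counts are assumed, illustrative values, chosen so that the Pearson statistic matches the 11.788 quoted in this topic; they are not copied from the output table.

```python
import math

# Assumed illustrative counts (rows: Machines 1-3, columns: 1st-3rd shift).
observed = [
    [48, 47, 48],
    [76, 47, 32],
    [36, 40, 34],
]

def expected_counts(table):
    """Row total * column total / overall total, for every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def pearson_chi_square(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

def likelihood_ratio_chi_square(table):
    """2 * sum of observed * ln(observed / expected); empty cells add 0."""
    expected = expected_counts(table)
    return 2 * sum(
        o * math.log(o / e)
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
        if o > 0
    )

chi2 = pearson_chi_square(observed)
g2 = likelihood_ratio_chi_square(observed)
print(f"Pearson chi-square: {chi2:.3f}")
print(f"Likelihood-ratio chi-square: {g2:.3f}")
```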

Interpretation

Use the chi-square statistics to test whether the variables are associated.

In these results, the two chi-square statistics are very similar. Use the p-values to evaluate the significance of the chi-square statistics.
Chi-Square Test     Columns: Chi-Square, DF, P-Value
When the expected counts are small, your results may be misleading. For more information, go to Data considerations for Cross Tabulation and Chi-Square.

Contribution to chi-square

Minitab displays each cell's contribution to the chi-square statistic, which quantifies how much of the total chi-square statistic is attributable to each cell's divergence.

Minitab calculates each cell's contribution to the chi-square statistic as the square of the difference between the observed and expected values for a cell, divided by the expected value for that cell. The chi-square statistic is the sum of these values for all cells.
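
The calculation described above can be sketched in a few lines of Python. The counts are assumed, illustrative values for three machines and three shifts, chosen so that the total matches the Pearson chi-square statistic of 11.788 quoted in this topic; they are not copied from the output table.

```python
# Assumed illustrative counts (rows: Machines 1-3, columns: 1st-3rd shift).
observed = [
    [48, 47, 48],
    [76, 47, 32],
    [36, 40, 34],
]

def chi_square_contributions(table):
    """(observed - expected)^2 / expected for every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    result = []
    for i, row in enumerate(table):
        out_row = []
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            out_row.append((obs - expected) ** 2 / expected)
        result.append(out_row)
    return result

contributions = chi_square_contributions(observed)
# The chi-square statistic is the sum of the contributions from all cells.
total = sum(sum(row) for row in contributions)

for machine, row in enumerate(contributions, start=1):
    print(f"Machine {machine}:", [round(c, 3) for c in row])
print(f"Chi-square: {total:.3f}")
```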

Interpretation

Use the individual cell contributions to identify which cells account for most of the total chi-square statistic.

In these results, the sum of the contributions from all cells is the Pearson chi-square statistic, which is 11.788. The largest contributions are from Machine 2, on the 1st and 3rd shifts. The smallest contributions are from the 2nd shift, on Machines 1 and 2.
Rows: Machine ID     Columns: Worksheet columns (1st shift, 2nd shift, 3rd shift)
Cell contents: Count, Expected count, Contribution to Chi-square

DF

The degrees of freedom (DF) is the number of independent pieces of information that go into a statistic. The degrees of freedom for a cross tabulation is (number of rows − 1) × (number of columns − 1).
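
As a small worked example of this rule, the sketch below assumes a table with 3 machine rows and 3 shift columns. The row count is an assumption, chosen because it is consistent with the DF of 4 reported in this topic.

```python
def degrees_of_freedom(n_rows, n_cols):
    """(rows - 1) * (columns - 1) for a two-way cross tabulation."""
    return (n_rows - 1) * (n_cols - 1)

# Assumed table shape: 3 machines (rows) by 3 shifts (columns).
df = degrees_of_freedom(3, 3)
print(df)
```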

Interpretation

Minitab uses the degrees of freedom to determine the p-value associated with the test statistic.

In these results, the degrees of freedom (DF) is 4.
Chi-Square Test     Columns: Chi-Square, DF, P-Value

Observed and expected counts

The observed counts are the actual number of observations in a sample that belong to a category.

The expected count is the frequency that would be expected in a cell if the variables are independent. Minitab calculates each expected count as the product of the cell's row total and column total, divided by the sample size.
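
A minimal sketch of that calculation follows. The counts are assumed, illustrative values for three machines and three shifts; they are not copied from the output table.

```python
# Assumed illustrative counts (rows: Machines 1-3, columns: 1st-3rd shift).
observed = [
    [48, 47, 48],
    [76, 47, 32],
    [36, 40, 34],
]

def expected_counts(table):
    """Row total * column total / sample size, for every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

expected = expected_counts(observed)
for machine, row in enumerate(expected, start=1):
    print(f"Machine {machine}:", [round(e, 2) for e in row])
```

The expected counts keep the same row and column totals as the observed counts, which is a quick way to check the result.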

Interpretation

You can compare the observed values and the expected values in the output table.

In these results, the cell count is the first number in each cell and the expected count is the second number in each cell. The expected counts seem to be close to the observed counts for all categories.
Rows: Machine ID     Columns: Worksheet columns (1st shift, 2nd shift, 3rd shift)
Cell contents: Count, Expected count
A better way to compare observed counts and expected counts is with the standardized residuals.

P-value

The p-value is a probability that measures the evidence against the null hypothesis. Lower probabilities provide stronger evidence against the null hypothesis.

Use the p-value to determine whether to reject or fail to reject the null hypothesis, which states that the variables are independent.

Minitab uses the chi-square statistic to determine the p-value.
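
One way to carry out this conversion is to take the upper-tail probability of a chi-square distribution with the test's degrees of freedom. The sketch below does this with the Python standard library for whole-number degrees of freedom. It is an illustration only: this topic does not describe Minitab's internal method, and the function name `chi_square_p_value` is hypothetical.

```python
import math

def chi_square_p_value(statistic, df):
    """Upper-tail probability of a chi-square distribution with integer df."""
    half = statistic / 2
    if df % 2 == 0:
        a = 1.0
        tail = math.exp(-half)             # exact tail for df = 2
    else:
        a = 0.5
        tail = math.erfc(math.sqrt(half))  # exact tail for df = 1
    # Each pass adds the term that raises the degrees of freedom by 2.
    while a < df / 2:
        tail += half ** a * math.exp(-half) / math.gamma(a + 1)
        a += 1
    return tail

# Chi-square statistic and DF quoted in this topic.
p_value = chi_square_p_value(11.788, 4)
alpha = 0.05
print(f"p-value: {p_value:.3f}")
print("reject H0" if p_value <= alpha else "fail to reject H0")
```

With a statistic of 11.788 and 4 degrees of freedom, the result rounds to the p-value of 0.019 reported in this topic.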

Note

Minitab does not display the p-value when any expected count is less than 1 because the results can be invalid.

Interpretation

To determine whether variables are independent, compare the p-value to the significance level. Usually, a significance level (denoted as α or alpha) of 0.05 works well. A significance level of 0.05 indicates a 5% risk of concluding that an association between the variables exists when there is no actual association.
P-value ≤ α: The variables have a statistical association (Reject H0)
If the p-value is less than or equal to the significance level, you reject the null hypothesis and conclude that there is a statistically significant association between the variables.
P-value > α: Cannot conclude that the variables are associated (Fail to reject H0)
If the p-value is larger than the significance level, you fail to reject the null hypothesis because there is not enough evidence to conclude that the variables are associated.
In these results, the p-value is 0.019. Because the p-value is less than the significance level of 0.05, you reject the null hypothesis. You can conclude that the variables are associated.
Chi-Square Test     Columns: Chi-Square, DF, P-Value

Raw residuals

The raw residuals are the differences between the observed counts and the expected counts.
Observed counts
The observed counts are the actual number of observations in a sample that belong to a category.
Expected counts
The expected count is the frequency that would be expected in a cell if the variables are independent. Minitab calculates each expected count as the product of the cell's row total and column total, divided by the sample size.

Interpretation

You can compare the observed values and the expected values in the output table.

In these results, the cell count is the first number in each cell, the expected count is the second number in each cell, and the raw residual is the third number in each cell. Machine 2, 2nd shift has the largest raw residual, which means that the greatest difference between expected and actual defects is found on Machine 2 during the 2nd shift.
Rows: Machine ID     Columns: Worksheet columns (1st shift, 2nd shift, 3rd shift)
Cell contents: Count, Expected count, Residual
A better way to compare observed counts and expected counts is with the standardized residuals.

Standardized residuals

The standardized residuals are the raw residuals (or the difference between the observed counts and expected counts), divided by the square root of the expected counts.
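
The definition above can be sketched as follows. The counts are assumed, illustrative values for three machines and three shifts; they are not copied from the output table.

```python
import math

# Assumed illustrative counts (rows: Machines 1-3, columns: 1st-3rd shift).
observed = [
    [48, 47, 48],
    [76, 47, 32],
    [36, 40, 34],
]

def standardized_residuals(table):
    """(observed - expected) / sqrt(expected) for every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    result = []
    for i, row in enumerate(table):
        out_row = []
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            raw_residual = obs - expected
            out_row.append(raw_residual / math.sqrt(expected))
        result.append(out_row)
    return result

std_resid = standardized_residuals(observed)
for machine, row in enumerate(std_resid, start=1):
    print(f"Machine {machine}:", [round(r, 2) for r in row])
```

Squaring a standardized residual gives that cell's contribution to the Pearson chi-square statistic, so the squared values sum to the statistic itself.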

Interpretation

You can compare the standardized residuals in the output table to see which cells have the largest difference between the expected counts and the actual counts relative to the size of the expected count. These cells contribute the most to any apparent dependence between the variables. For example, you can assess the standardized residuals in the output table to see the association between machine and shift for producing defects.

In these results, the cell count is the first number in each cell, the expected count is the second number in each cell, and the standardized residual is the third number in each cell. The positive standardized residuals indicate that there were more defective handles than expected. The negative standardized residuals indicate that there were fewer defective handles than expected.
Rows: Machine ID     Columns: Worksheet columns (1st shift, 2nd shift, 3rd shift, All)
Cell contents: Count, Expected count, Standardized residual

Table percentages (% of Row, % of Column, % of Total)

For each cell, Minitab displays the table percentages that you select.
% of Row
The percentage that each cell represents within a table row. Minitab calculates the row percentage for each cell by dividing the cell count by the row total.
% of Column
The percentage that each cell represents within a table column. Minitab calculates the column percentage for each cell by dividing the cell count by the column total.
% of Total
The percentage that each cell represents of the total observations. Minitab calculates the total percentage for each cell by dividing the cell count by the overall total.

Interpretation

Use the table percentages to understand how the counts are distributed between the categories.

In these results, the cell count is the first number in each cell. The row percentage, column percentage, and total percentage follow, in that order. You can select one or more of these percentages to display.

For example, for the data from Machine 1 and 1st shift:
  • The cell count is 48.
  • The row percentage is 33.57%, which is 48 divided by 143.
  • The column percentage is 30.00%, which is 48 divided by 160.
  • The total percentage is 11.76%, which is 48 divided by 408.
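
The arithmetic in this example can be checked with a short sketch. It uses only the counts and totals quoted above; the helper name `percent` is hypothetical.

```python
cell_count = 48      # Machine 1, 1st shift
row_total = 143      # all shifts for Machine 1
column_total = 160   # all machines on the 1st shift
overall_total = 408  # all observations

def percent(part, whole):
    """Share of `whole` represented by `part`, as a percentage."""
    return 100 * part / whole

pct_of_row = percent(cell_count, row_total)
pct_of_column = percent(cell_count, column_total)
pct_of_total = percent(cell_count, overall_total)

print(f"% of Row:    {pct_of_row:.2f}")
print(f"% of Column: {pct_of_column:.2f}")
print(f"% of Total:  {pct_of_total:.2f}")
```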
Rows: Machine ID     Columns: Worksheet columns (1st shift, 2nd shift, 3rd shift)
Cell contents: Count, % of Row, % of Column, % of Total