Tabulated statistics for Cross Tabulation and Chi-Square

Find definitions and interpretation guidance for every statistic that is provided with the cross tabulation of categorical variables.

Observed and expected counts

The observed count is the actual number of observations in a sample that belong to a category.

The expected count is the frequency that would be expected in a cell, on average, if the variables are independent. Minitab calculates the expected counts as the product of the row and column totals, divided by the total number of observations.
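As a minimal sketch of this calculation, the following Python snippet computes the expected counts for the defect counts by machine and shift that are used as the example throughout this topic. The variable and function names are illustrative and are not part of Minitab.

```python
# Observed defect counts from the example in this topic.
# Rows are Machines 1-3; columns are 1st, 2nd, and 3rd shift.
observed = [
    [48, 47, 48],
    [76, 47, 32],
    [36, 40, 34],
]

def expected_counts(table):
    """Expected count per cell: (row total * column total) / total observations."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

expected = expected_counts(observed)
for row in expected:
    print([round(e, 2) for e in row])
```

The expected counts always sum to the total number of observations (408 in this example), which is a quick check on the arithmetic.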

Interpretation

You can compare the observed values and the expected values for each cell in the output table. In these results, the observed cell count is the first number in each cell and the expected count is the second number in each cell.

If two variables are associated, then the distribution of observations for one variable will differ depending on the category of the second variable. If two variables are independent, then the distribution of observations for one variable will be similar for all categories of the second variable. In this example, from column 1, row 2 of the table, the observed count is 76, and the expected count is 60.78. The observed count seems to be much larger than would be expected if the variables were independent.

Rows: Machine ID   Columns: Worksheet columns

        1st shift   2nd shift   3rd shift       All

1              48          47          48       143
            56.08       46.97       39.96

2              76          47          32       155
            60.78       50.91       43.31

3              36          40          34       110
            43.14       36.13       30.74

All           160         134         114       408

Cell Contents
      Count
      Expected count

A better way to compare observed counts and expected counts is with the standardized residuals.

Table percentages (% of Row, % of Column, % of Total)

For each cell, Minitab displays the table percentages that you select.
% of Row
The percentage that each cell represents within a table row. Minitab calculates the row percentage for each cell by dividing the cell count by the row total.
% of Column
The percentage that each cell represents within a table column. Minitab calculates the column percentage for each cell by dividing the cell count by the column total.
% of Total
The percentage that each cell represents of the total observations. Minitab calculates the total percentage for each cell by dividing the cell count by the overall total.
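The three definitions above can be sketched in a few lines of Python. This is only an illustration using the machine-by-shift defect counts from this topic's example; the helper name `pct` and the variable names are assumptions made for the sketch.

```python
# Observed defect counts: rows are Machines 1-3, columns are shifts 1-3.
observed = [
    [48, 47, 48],
    [76, 47, 32],
    [36, 40, 34],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

def pct(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole

# % of Row: cell count divided by its row total.
row_pct = [[pct(x, row_totals[i]) for x in row] for i, row in enumerate(observed)]
# % of Column: cell count divided by its column total.
col_pct = [[pct(x, col_totals[j]) for j, x in enumerate(row)] for row in observed]
# % of Total: cell count divided by the overall total.
total_pct = [[pct(x, n) for x in row] for row in observed]

# Machine 1, 1st shift
print(round(row_pct[0][0], 2), round(col_pct[0][0], 2), round(total_pct[0][0], 2))
```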

Interpretation

Use the table percentages to understand how the counts are distributed between the categories.

In these results, the cell count is the first number in each cell. The row percentage, column percentage, and total percentage follow, in that order. You can select one or more of these percentages to display.

For example, for the data from Machine 1 and 1st shift:
  • The cell count is 48.
  • The row percentage is 33.57%, which is 48 divided by 143.
  • The column percentage is 30.00%, which is 48 divided by 160.
  • The total percentage is 11.76%, which is 48 divided by 408.

Rows: Machine ID   Columns: Worksheet columns

        1st shift   2nd shift   3rd shift       All

1              48          47          48       143
            33.57       32.87       33.57    100.00
            30.00       35.07       42.11     35.05
            11.76       11.52       11.76     35.05

2              76          47          32       155
            49.03       30.32       20.65    100.00
            47.50       35.07       28.07     37.99
            18.63       11.52        7.84     37.99

3              36          40          34       110
            32.73       36.36       30.91    100.00
            22.50       29.85       29.82     26.96
             8.82        9.80        8.33     26.96

All           160         134         114       408
            39.22       32.84       27.94    100.00
           100.00      100.00      100.00    100.00
            39.22       32.84       27.94    100.00

Cell Contents
      Count
      % of Row
      % of Column
      % of Total

Raw residuals

A raw residual is the difference between the observed count and the expected count for a cell (observed minus expected).
Observed count
The observed count is the actual number of observations in a sample that belong to a category.
Expected count
The expected count is the frequency that would be expected in a cell, on average, if the variables are independent. Minitab calculates the expected counts as the product of the row and column totals, divided by the total number of observations.
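The following Python sketch illustrates the raw residual as observed count minus expected count, using the machine-by-shift defect counts from this topic's example. The names are illustrative only, and the search for the largest residual is an addition for demonstration.

```python
# Observed defect counts: rows are Machines 1-3, columns are shifts 1-3.
observed = [
    [48, 47, 48],
    [76, 47, 32],
    [36, 40, 34],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence: row total * column total / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Raw residual: observed count minus expected count.
residuals = [
    [o - e for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]

# Locate the cell with the largest absolute raw residual (0-based indices).
largest = max(
    ((i, j) for i in range(len(observed)) for j in range(len(observed[0]))),
    key=lambda ij: abs(residuals[ij[0]][ij[1]]),
)
print(largest, round(residuals[largest[0]][largest[1]], 3))
```

Because the expected counts preserve the row and column totals, the raw residuals in every row and every column sum to zero.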

Interpretation

You can compare the observed values and the expected values in the output table.

In these results, the cell count is the first number in each cell, the expected count is the second number, and the raw residual is the third number. Machine 2, 1st shift has the largest raw residual (15.216), which means that the greatest difference between the observed and expected number of defects occurs on Machine 2 during the 1st shift.

Rows: Machine ID   Columns: Worksheet columns

        1st shift   2nd shift   3rd shift       All

1              48          47          48       143
            56.08       46.97       39.96
           -8.078       0.034       8.044

2              76          47          32       155
            60.78       50.91       43.31
           15.216      -3.907     -11.309

3              36          40          34       110
            43.14       36.13       30.74
           -7.137       3.873       3.265

All           160         134         114       408

Cell Contents
      Count
      Expected count
      Residual

A better way to compare observed counts and expected counts is with the standardized residuals.

Standardized residuals

The standardized residuals are the raw residuals (or the difference between the observed counts and expected counts), divided by the square root of the expected counts.
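As a rough illustration of this definition, the Python sketch below divides each raw residual by the square root of its expected count, using the machine-by-shift defect counts from this topic's example. The variable names are assumptions made for the sketch.

```python
import math

# Observed defect counts: rows are Machines 1-3, columns are shifts 1-3.
observed = [
    [48, 47, 48],
    [76, 47, 32],
    [36, 40, 34],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence: row total * column total / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Standardized residual: (observed - expected) / sqrt(expected).
std_residuals = [
    [(o - e) / math.sqrt(e) for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]

for row in std_residuals:
    print([round(z, 4) for z in row])
```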

Interpretation

You can compare the standardized residuals in the output table to see which combinations of categories have the largest differences between the observed counts and the expected counts, relative to the size of the expected counts, and therefore seem to be dependent. For example, you can assess the standardized residuals in the output table to see the association between machine and shift for producing defects.

In these results, the cell count is the first number in each cell, the expected count is the second number, and the standardized residual is the third number. The positive standardized residuals indicate that there were more defective handles than expected. The negative standardized residuals indicate that there were fewer defective handles than expected.

Rows: Machine ID   Columns: Worksheet columns

        1st shift   2nd shift   3rd shift       All

1              48          47          48       143
            56.08       46.97       39.96
          -1.0788      0.0050      1.2726

2              76          47          32       155
            60.78       50.91       43.31
           1.9516     -0.5476     -1.7184

3              36          40          34       110
            43.14       36.13       30.74
          -1.0867      0.6443      0.5889

All           160         134         114       408

Cell Contents
      Count
      Expected count
      Standardized residual

Adjusted residuals

The adjusted residuals are the raw residuals (or the difference between the observed counts and expected counts) divided by an estimate of the standard error. Use adjusted residuals to account for the variation due to the sample size.
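This topic does not spell out the standard error estimate. The Python sketch below assumes the commonly used form, the square root of the expected count times (1 - row total / n) times (1 - column total / n), which appears to agree with the adjusted residuals in this topic's example output to within rounding. Treat the formula and the names as assumptions rather than as a statement of Minitab's implementation.

```python
import math

# Observed defect counts: rows are Machines 1-3, columns are shifts 1-3.
observed = [
    [48, 47, 48],
    [76, 47, 32],
    [36, 40, 34],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

def adjusted_residual(i, j):
    """Raw residual divided by an assumed standard error estimate."""
    e = row_totals[i] * col_totals[j] / n  # expected count
    # Assumed standard error: sqrt(E * (1 - row share) * (1 - column share)).
    se = math.sqrt(e * (1 - row_totals[i] / n) * (1 - col_totals[j] / n))
    return (observed[i][j] - e) / se

adj_residuals = [
    [adjusted_residual(i, j) for j in range(len(col_totals))]
    for i in range(len(row_totals))
]

for row in adj_residuals:
    print([round(a, 4) for a in row])
```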

Interpretation

You can compare the adjusted residuals in the output table to see which categories have the largest difference between the expected counts and the actual counts relative to sample size. For example, you can see which machine or shift has the largest difference between the expected number of defectives and the actual number of defectives.

In these results, the cell count is the first number in each cell, the expected count is the second number, and the adjusted residual is the third number. The positive adjusted residuals indicate that there were more defective handles than expected, adjusted for sample size. The negative adjusted residuals indicate that there were fewer defective handles than expected, adjusted for sample size.

Rows: Machine ID   Columns: Worksheet columns

        1st shift   2nd shift   3rd shift       All

1              48          47          48       143
            56.08       46.97       39.96
          -1.7169      0.0076      1.8602

2              76          47          32       155
            60.78       50.91       43.31
           3.1788     -0.8485     -2.5707

3              36          40          34       110
            43.14       36.13       30.74
          -1.6309      0.9199      0.8117

All           160         134         114       408

Cell Contents
      Count
      Expected count
      Adjusted residual

Contribution to chi-square

Minitab displays each cell's contribution to the chi-square statistic, which quantifies how much of the total chi-square statistic is attributable to the difference between that cell's observed and expected counts.

Minitab calculates each cell's contribution to the chi-square statistic as the square of the difference between the observed and expected values for a cell, divided by the expected value for that cell. The chi-square statistic is the sum of these values for all cells.
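The calculation described above can be sketched in Python as follows, using the machine-by-shift defect counts from this topic's example. The variable names are illustrative and are not part of Minitab.

```python
# Observed defect counts: rows are Machines 1-3, columns are shifts 1-3.
observed = [
    [48, 47, 48],
    [76, 47, 32],
    [36, 40, 34],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence: row total * column total / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Contribution of each cell: (observed - expected)^2 / expected.
contributions = [
    [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]

# The Pearson chi-square statistic is the sum over all cells.
chi_square = sum(sum(row) for row in contributions)

for row in contributions:
    print([round(c, 4) for c in row])
print(round(chi_square, 3))
```

Each contribution is also the square of the corresponding standardized residual, so the cells with the largest standardized residuals in absolute value contribute most to the statistic.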

Interpretation

In these results, the sum of the contributions from all cells is the Pearson chi-square statistic, which is 11.788. The largest contributions are from Machine 2, on the 1st and 3rd shifts. The smallest contributions are from the 2nd shift, on Machines 1 and 2.

Rows: Machine ID   Columns: Worksheet columns

        1st shift   2nd shift   3rd shift       All

1              48          47          48       143
            56.08       46.97       39.96
           1.1637      0.0000      1.6195

2              76          47          32       155
            60.78       50.91       43.31
           3.8088      0.2998      2.9530

3              36          40          34       110
            43.14       36.13       30.74
           1.1809      0.4151      0.3468

All           160         134         114       408

Cell Contents
      Count
      Expected count
      Contribution to Chi-square