Chi-Square Test of Independence

Summary

Assesses whether the rates of occurrence for a categorical output differ across the levels (settings) of an input. To use this test, the data for both variables (input and output) must be discrete or categorical. For example, X could be five different named hospitals and Y could be the likelihood of recovery (high, moderate, low, or unlikely).

Answers the question:
  • If the level of a discrete input changes, do the rates of occurrence of the possible outcomes also change?

When to Use    Purpose
Mid-project    Fixing an input at two or more different settings (levels) helps to determine which inputs have significant influence on the output profile (% by category).
Mid-project    Verify changes to inputs result in significant differences from the pre-project output profile.

Data

A table containing the count for each combination of the levels of the categorical X and Y.

How-To

You can enter data in two ways:

  1. Enter a table in Minitab with the levels of one variable as columns and the levels of the second variable as rows. Note: It does not matter which variable, X or Y, is the column and which is the row. Enter the counts of the XY combinations into the table as shown in this example:
                          Hospital
    Chance of Recovery    A     B     C
    Good                  78    45    98
    Moderate              45    57    55
    Poor                  44    68    25

    To use data entered in this manner, choose Stat > Tables > Chi-Square Test (Table in Worksheet).

  2. Enter raw categorical data in columns, one column for the y-variable, and a second column for the x-variable. In this case, both columns must be the same length. For example, enter the value of the y-variable, Status (on time, late), in one column and the value of the x-variable, Publication Type (fiction, nonfiction, reference), into a second column. To use data entered in this manner, follow these steps:
    1. Choose Stat > Tables > Cross Tabulation and Chi-Square.
    2. In For rows, enter the y-variable and in For columns, enter the x-variable.
    3. Click Chi-Square, and then check the following options:
       • Chi-Square analysis
       • Expected cell counts
       • Each cell's contribution to the Chi-Square statistic
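
The two entry methods lead to the same analysis, which the following Python sketch tries to illustrate outside Minitab. It expands the hospital counts from the example table into two raw columns, cross-tabulates them back into a table, and then computes the expected counts, the chi-square statistic, and the degrees of freedom. All function names (expand_to_raw, cross_tabulate, chi_square, p_value_even_dof) are hypothetical helpers written for this illustration, not Minitab commands, and the p-value shortcut is an assumption-laden simplification that only works when the degrees of freedom are even (they are 4 for a 3 x 3 table).

```python
from collections import Counter
from math import exp

# Counts from the hospital example: rows are Chance of Recovery,
# columns are Hospital.
row_labels = ["Good", "Moderate", "Poor"]
col_labels = ["A", "B", "C"]
observed = [
    [78, 45, 98],
    [45, 57, 55],
    [44, 68, 25],
]


def expand_to_raw(table, rows, cols):
    """Turn a table of counts into two equal-length raw-data columns."""
    y_column, x_column = [], []
    for i, row_label in enumerate(rows):
        for j, col_label in enumerate(cols):
            y_column += [row_label] * table[i][j]
            x_column += [col_label] * table[i][j]
    return y_column, x_column


def cross_tabulate(y_column, x_column, rows, cols):
    """Count each (y, x) combination, as a cross tabulation does."""
    counts = Counter(zip(y_column, x_column))
    return [[counts[(r, c)] for c in cols] for r in rows]


def chi_square(table):
    """Return (statistic, degrees of freedom, expected counts)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    # Expected count under independence: row total * column total / grand total.
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof, expected


def p_value_even_dof(statistic, dof):
    """Upper-tail chi-square probability; closed form valid for even dof only."""
    if dof <= 0 or dof % 2:
        raise ValueError("this shortcut only handles positive, even dof")
    half = statistic / 2
    term, series = 1.0, 1.0
    for i in range(1, dof // 2):
        term *= half / i
        series += term
    return exp(-half) * series


# Method 2 (raw columns) rebuilt into the method 1 layout (table of counts).
status, hospital = expand_to_raw(observed, row_labels, col_labels)
rebuilt = cross_tabulate(status, hospital, row_labels, col_labels)

statistic, dof, expected = chi_square(rebuilt)
p_value = p_value_even_dof(statistic, dof)
print(f"chi-square = {statistic:.2f}, dof = {dof}, p = {p_value:.2g}")
```

For tables with odd degrees of freedom, a statistics library is the safer route; for instance, SciPy's scipy.stats.chi2_contingency accepts a table of counts and returns the statistic, p-value, degrees of freedom, and expected counts.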

Guidelines

  • If an association exists between X and Y (low p-value), look at the chi-square contributions in the output table to locate the differences: the cells with the largest contributions are the ones that depart most from independence. Then compare the observed and expected values in those cells to determine whether each difference is good or bad.
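
As a rough illustration of this guideline, the Python sketch below ranks the cells of the hospital table from the How-To section by their contribution to the chi-square statistic and notes whether each observed count is above or below its expected count. The variable names and the layout of the printed report are assumptions made for this example; Minitab's own output table is arranged differently.

```python
# Counts from the hospital example: rows are Chance of Recovery,
# columns are Hospital.
row_labels = ["Good", "Moderate", "Poor"]
col_labels = ["A", "B", "C"]
observed = [
    [78, 45, 98],
    [45, 57, 55],
    [44, 68, 25],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

cells = []
for i, row_label in enumerate(row_labels):
    for j, col_label in enumerate(col_labels):
        count = observed[i][j]
        # Expected count if recovery were independent of hospital.
        expected = row_totals[i] * col_totals[j] / total
        contribution = (count - expected) ** 2 / expected
        direction = "above" if count > expected else "below"
        cells.append(
            (contribution, row_label, col_label, count, expected, direction)
        )

# Largest contributions first: these cells drive the association.
cells.sort(key=lambda cell: cell[0], reverse=True)

for contribution, row_label, col_label, count, expected, direction in cells:
    print(
        f"{row_label:<9} {col_label}  observed {count:>3}  "
        f"expected {expected:6.1f}  contribution {contribution:6.2f}  "
        f"({direction} expected)"
    )
```

In this table the largest contributions come from hospitals B and C: B has more Poor and fewer Good outcomes than expected, while C has fewer Poor and more Good outcomes than expected. Whether such a difference counts as good or bad depends on what the categories mean for the process being studied.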