Contingency tables

What is a contingency table?

A contingency table is a table that tallies observations by multiple categorical variables. The table's rows and columns correspond to these categorical variables.

For example, after a recent election between two candidates, an exit poll recorded the gender and vote of 100 random voters and tabulated the data as follows:

	Candidate A	Candidate B	All
Male	28	20	48
Female	39	13	52
All	67	33	100

This contingency table tallies responses by gender and vote. The count at the intersection of row i and column j is denoted n_ij, and it represents the number of observations that exhibit that combination of levels. For example, n_1,2 is the number of male respondents who voted for Candidate B, which is 20.

The table also includes marginal totals for each level of the variables. The marginal totals for the rows show that 52 of the respondents were female. Marginal totals for columns show that 67 respondents voted for Candidate A. Also, the grand total shows that the sample size is 100.
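The cell and marginal counts described above can be reproduced with a few lines of Python. This is only an illustrative sketch outside Minitab; the variable names (`counts`, `row_totals`, and so on) are assumptions chosen for the example.

```python
# Observed counts from the exit-poll table
# (rows: gender, columns: candidate).
rows = ["Male", "Female"]
cols = ["Candidate A", "Candidate B"]
counts = [
    [28, 20],  # Male
    [39, 13],  # Female
]

# n_1,2 in the text's 1-based notation is counts[0][1]
# with Python's 0-based indexing.
n_12 = counts[0][1]

# Marginal totals: sum across each row, down each column, and overall.
row_totals = [sum(row) for row in counts]
col_totals = [sum(col) for col in zip(*counts)]
grand_total = sum(row_totals)

print(dict(zip(rows, row_totals)))
print(dict(zip(cols, col_totals)))
print(grand_total)
```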

Contingency tables can also reveal associations between the two variables. Use a chi-square test or Fisher's exact test to determine whether the observed counts differ significantly from the expected counts under the null hypothesis of no association. For example, you could test whether an association exists between gender and vote.
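As a rough illustration of the chi-square test mentioned above, the following Python sketch computes the Pearson chi-square statistic for the gender-by-vote table by hand. It assumes no continuity correction, so software that applies one will report a somewhat different result. The p-value line relies on the fact that, for 1 degree of freedom, the upper tail of the chi-square distribution equals erfc(sqrt(x / 2)); it does not generalize to larger tables.

```python
import math

# Observed counts (rows: Male, Female; columns: Candidate A, Candidate B).
observed = [
    [28, 20],
    [39, 13],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under no association: row total * column total / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson chi-square statistic: sum of (observed - expected)^2 / expected.
chi2 = sum(
    (o - e) ** 2 / e
    for o_row, e_row in zip(observed, expected)
    for o, e in zip(o_row, e_row)
)

# Degrees of freedom for an r x c table: (r - 1) * (c - 1).
df = (len(row_totals) - 1) * (len(col_totals) - 1)

# Upper-tail p-value; this closed form is valid only when df == 1.
p_value = math.erfc(math.sqrt(chi2 / 2))

print(f"chi-square = {chi2:.3f}, df = {df}, p = {p_value:.3f}")
```

For these counts the statistic is a little over 3.1 and the p-value is somewhat above 0.05, so this test alone would not reject the hypothesis of no association at the 0.05 level.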

The simplest contingency tables are two-way tables that tally the responses by two variables. You can categorize observations by three or more variables by "crossing" them. In the previous voting example, you could also classify the responses by employment status as follows:

	Candidate A	Candidate B	Total
Male / employed	18	19	37
Male / unemployed	10	1	11
Female / employed	33	10	43
Female / unemployed	6	3	9
Total	67	33	100
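One way to check a crossed table like this one is to sum over the added variable and confirm that the original two-way counts come back. The Python sketch below does that; the dictionary layout and names (`crossed`, `two_way`) are assumptions made for the example.

```python
# Crossed table: (gender, employment) -> (Candidate A, Candidate B) counts.
crossed = {
    ("Male", "Employed"): (18, 19),
    ("Male", "Unemployed"): (10, 1),
    ("Female", "Employed"): (33, 10),
    ("Female", "Unemployed"): (6, 3),
}

# Collapse over employment status to recover the gender-by-vote table.
two_way = {}
for (gender, _employment), (a, b) in crossed.items():
    prev_a, prev_b = two_way.get(gender, (0, 0))
    two_way[gender] = (prev_a + a, prev_b + b)

# Column totals across all four crossed rows.
total_a = sum(a for a, _ in crossed.values())
total_b = sum(b for _, b in crossed.values())

print(two_way)
print(total_a, total_b, total_a + total_b)
```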

Simple correspondence analysis can detect associations in contingency tables that categorize data by more than two variables. To perform a simple correspondence analysis in Minitab, choose Stat > Multivariate > Simple Correspondence Analysis.

Calculate the odds ratio and confidence interval for a 2 X 2 contingency table

You can use Stat > Regression > Binary Logistic Regression > Fit Binary Logistic Model to calculate the odds ratio and confidence interval.

For example, you are investigating the relationship between aspirin use and heart attacks, and want to calculate the odds ratio and the confidence interval for the odds ratio for the following 2 X 2 contingency table:

	Heart Attack	No Heart Attack
Placebo	189	10845
Aspirin	104	10933

Enter the following data into Minitab:

C1	C2	C3
Group	Heart Attack	Count
Placebo	Yes	189
Placebo	No	10845
Aspirin	Yes	104
Aspirin	No	10933

Choose Stat > Regression > Binary Logistic Regression > Fit Binary Logistic Model.
In Response, enter C2 and in Frequency, enter C3.
In Categorical predictors, enter C1. Click OK.

Binary Logistic Regression: Heart Attack versus Group

Odds Ratios for Categorical Predictors

Level A	Level B	Odds Ratio	95% CI
Group
Placebo	Aspirin	1.8321	(1.4400, 2.3308)

Odds ratio for level A relative to level B

The odds ratio is 1.8321. This means that the odds of a heart attack for a person taking the placebo are 1.8321 times the odds for a person taking aspirin. You can be 95% confident that the true value of the odds ratio is between 1.4400 and 2.3308.
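For a single two-level categorical predictor, the odds ratio from a logistic regression equals the cross-product ratio of the 2 X 2 table, so the result can be checked by hand. The Python sketch below does this and adds a Wald interval on the log odds ratio. That the interval is of the Wald type is an assumption made here, not something stated in the output; with these counts it should come out very close to the reported values.

```python
import math
from statistics import NormalDist

# Counts from the 2 X 2 table.
placebo_yes, placebo_no = 189, 10845
aspirin_yes, aspirin_no = 104, 10933

# Odds of a heart attack in each group, then their ratio
# (placebo relative to aspirin).
odds_placebo = placebo_yes / placebo_no
odds_aspirin = aspirin_yes / aspirin_no
odds_ratio = odds_placebo / odds_aspirin

# Standard error of the log odds ratio: sqrt of summed reciprocal counts.
se_log_or = math.sqrt(
    1 / placebo_yes + 1 / placebo_no + 1 / aspirin_yes + 1 / aspirin_no
)

# Assumed Wald 95% interval: log(OR) +/- z * SE, transformed back.
z = NormalDist().inv_cdf(0.975)
log_or = math.log(odds_ratio)
ci_low = math.exp(log_or - z * se_log_or)
ci_high = math.exp(log_or + z * se_log_or)

print(f"{odds_ratio:.4f} ({ci_low:.4f}, {ci_high:.4f})")
```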

The data used in this example is from page 20 of A. Agresti (1996). An Introduction to Categorical Data Analysis. John Wiley & Sons, Inc.
