# Contingency tables

## What is a contingency table?

A contingency table is a table that tallies observations by multiple categorical variables. The table's rows and columns correspond to these categorical variables.

For example, after a recent election between two candidates, an exit poll recorded the gender and vote of 100 random voters and tabulated the data as follows:

| | Candidate A | Candidate B | All |
|---|---|---|---|
| Male | 28 | 20 | 48 |
| Female | 39 | 13 | 52 |
| All | 67 | 33 | 100 |

This contingency table tallies responses by gender and vote. The count at the intersection of row i and column j is denoted $n_{ij}$, and it represents the number of observations that exhibit that combination of levels. For example, $n_{1,2}$ is the number of male respondents who voted for Candidate B, which is 20.

The table also includes marginal totals for each level of the variables. The marginal totals for the rows show that 52 of the respondents were female. Marginal totals for columns show that 67 respondents voted for Candidate A. Also, the grand total shows that the sample size is 100.
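As a rough illustration of how the marginal totals follow from the cell counts, here is a minimal Python sketch. The dictionary layout and the helper name `marginal_totals` are assumptions made for this example; they are not part of Minitab.

```python
# Cell counts from the exit-poll table, keyed by (gender, vote).
counts = {
    ("Male", "Candidate A"): 28,
    ("Male", "Candidate B"): 20,
    ("Female", "Candidate A"): 39,
    ("Female", "Candidate B"): 13,
}


def marginal_totals(cells):
    """Return (row_totals, column_totals, grand_total) for a two-way table.

    Hypothetical helper for illustration only.
    """
    row_totals, col_totals = {}, {}
    for (row, col), n in cells.items():
        row_totals[row] = row_totals.get(row, 0) + n
        col_totals[col] = col_totals.get(col, 0) + n
    return row_totals, col_totals, sum(cells.values())


row_totals, col_totals, grand_total = marginal_totals(counts)
print(row_totals)   # totals per gender
print(col_totals)   # totals per candidate
print(grand_total)  # sample size
```

Each marginal total is the sum of one row or one column of cell counts, and the grand total is the sum of all cells.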

Contingency tables can also reveal associations between the two variables. Use a chi-square test or Fisher's exact test to determine whether the observed counts differ significantly from the expected counts under the null hypothesis of no association. For example, you could test whether an association exists between gender and vote.
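To make the comparison of observed and expected counts concrete, the following sketch computes a Pearson chi-square statistic for the exit-poll table using only the Python standard library. It assumes a 2 x 2 table (1 degree of freedom) and applies no continuity correction, so its result may differ from software that applies one. The function name `chi_square_2x2` is hypothetical.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2 x 2 table of counts.

    Returns (statistic, p_value). Assumes exactly 1 degree of freedom and
    applies no continuity correction.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under the null hypothesis of no association.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected

    # For 1 degree of freedom, the chi-square upper tail probability
    # equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Rows: Male, Female. Columns: Candidate A, Candidate B.
observed = [[28, 20], [39, 13]]
statistic, p_value = chi_square_2x2(observed)
print(f"chi-square = {statistic:.4f}, p = {p_value:.4f}")
```

A small p-value would suggest that the observed counts are unlikely under the null hypothesis of no association between gender and vote.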

The simplest contingency tables are two-way tables that tally the responses by two variables. You can categorize observations by three or more variables by "crossing" them. In the previous voting example, you could also classify the responses by employment status as follows:

| | Candidate A | Candidate B | Total |
|---|---|---|---|
| Male / employed | 18 | 19 | 37 |
| Male / unemployed | 10 | 1 | 11 |
| Female / employed | 33 | 10 | 43 |
| Female / unemployed | 6 | 3 | 9 |
| Total | 67 | 33 | 100 |
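One way to picture "crossing" is to combine two of the variables into a single row label. The sketch below does this for the counts in the table and then sums over employment status to recover the original gender-by-vote table. The data layout and the helper names `cross_rows` and `collapse_over_status` are assumptions for this example.

```python
# Counts keyed by (gender, employment status, vote), taken from the table.
three_way = {
    ("Male", "employed", "Candidate A"): 18,
    ("Male", "employed", "Candidate B"): 19,
    ("Male", "unemployed", "Candidate A"): 10,
    ("Male", "unemployed", "Candidate B"): 1,
    ("Female", "employed", "Candidate A"): 33,
    ("Female", "employed", "Candidate B"): 10,
    ("Female", "unemployed", "Candidate A"): 6,
    ("Female", "unemployed", "Candidate B"): 3,
}


def cross_rows(cells):
    """Cross gender and employment status into a single row variable."""
    return {
        (f"{gender} / {status}", vote): n
        for (gender, status, vote), n in cells.items()
    }


def collapse_over_status(cells):
    """Sum over employment status to get the two-way gender-by-vote table."""
    two_way = {}
    for (gender, _status, vote), n in cells.items():
        two_way[(gender, vote)] = two_way.get((gender, vote), 0) + n
    return two_way


crossed = cross_rows(three_way)
collapsed = collapse_over_status(three_way)
print(crossed[("Male / unemployed", "Candidate B")])
print(collapsed[("Male", "Candidate A")])
```

Collapsing the crossed table returns the same counts as the two-way table shown earlier, which is a quick consistency check on the data.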

Simple correspondence analysis can detect associations in contingency tables that categorize data by more than two variables. To perform a simple correspondence analysis in Minitab, choose Stat > Multivariate > Simple Correspondence Analysis.

## Calculate the odds ratio and confidence interval for a 2 X 2 contingency table

You can use Stat > Regression > Binary Logistic Regression > Fit Binary Logistic Model to calculate the odds ratio and confidence interval.

For example, you are investigating the relationship between aspirin use and heart attacks, and want to calculate the odds ratio and the confidence interval for the odds ratio for the following 2 X 2 contingency table:

| | Heart Attack | No Heart Attack |
|---|---|---|
| Placebo | 189 | 10845 |
| Aspirin | 104 | 10933 |

1. Enter the following data into Minitab:

   | C1 | C2 | C3 |
   |---|---|---|
   | Group | Heart Attack | Count |
   | Placebo | Yes | 189 |
   | Placebo | No | 10845 |
   | Aspirin | Yes | 104 |
   | Aspirin | No | 10933 |

2. Choose Stat > Regression > Binary Logistic Regression > Fit Binary Logistic Model.
3. In Response, enter C2 and in Frequency, enter C3.
4. In Categorical predictors, enter C1. Click OK.

### Binary Logistic Regression: Heart Attack versus Group

Odds Ratios for Categorical Predictors

| | Level A | Level B | Odds Ratio | 95% CI |
|---|---|---|---|---|
| Group | Placebo | Aspirin | 1.8321 | (1.4400, 2.3308) |

Odds ratio for level A relative to level B

The odds ratio is 1.8321. This means that the odds of having a heart attack for a person taking the placebo are 1.8321 times the odds for a person taking aspirin. You can be 95% confident that the true value of the odds ratio is between 1.4400 and 2.3308.
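For a single 2 x 2 table, the same quantities can be approximated by hand. The sketch below computes the sample odds ratio and a Wald confidence interval on the log-odds scale. This is an illustration under that assumption, not Minitab's implementation, although it appears to agree with the output above to four decimal places. The function name `odds_ratio_wald_ci` is hypothetical.

```python
import math
from statistics import NormalDist


def odds_ratio_wald_ci(a, b, c, d, confidence=0.95):
    """Odds ratio and Wald confidence interval for the table [[a, b], [c, d]].

    a, b: events and non-events in the first group (here, placebo).
    c, d: events and non-events in the second group (here, aspirin).
    Assumes all four counts are positive.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(odds ratio) for a 2 x 2 table.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


odds_ratio, lower, upper = odds_ratio_wald_ci(189, 10845, 104, 10933)
print(f"odds ratio = {odds_ratio:.4f}, 95% CI = ({lower:.4f}, {upper:.4f})")
```

The interval is built on the log scale because the sampling distribution of the log odds ratio is closer to normal than that of the odds ratio itself.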

The data used in this example is from page 20 of A. Agresti (1996). An Introduction to Categorical Data Analysis. John Wiley & Sons, Inc.
