Optional results for Simple Correspondence Analysis

Contingency table

Suppose a contingency table has r rows and c columns. The entry, nij, in row i and column j of the contingency table is the frequency for that cell. The total in row i, ni., is the sum of the frequencies in row i. The total in column j, n.j, is the sum of the frequencies in column j. The total for the table, n.. or just n, is the sum of all the frequencies in the table.
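As an illustration of these totals, the following minimal Python sketch computes ni., n.j, and n for a small table. The counts and the variable names (table, row_totals, col_totals, grand_total) are hypothetical and chosen only for this example.

```python
# Hypothetical 3 x 4 contingency table of counts (illustrative data only).
table = [
    [20, 10, 5, 15],
    [30, 20, 10, 40],
    [10, 30, 25, 35],
]

# ni. : total of the frequencies in each row i
row_totals = [sum(row) for row in table]

# n.j : total of the frequencies in each column j
col_totals = [sum(col) for col in zip(*table)]

# n : total of all the frequencies in the table
grand_total = sum(row_totals)

print("row totals:   ", row_totals)
print("column totals:", col_totals)
print("table total:  ", grand_total)
```

The row totals and the column totals each sum to the same table total, which is a quick consistency check on the input.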

Row and column profiles

Profiles are proportions that are calculated from the counts, nij, in the original contingency table. Specifically, the profile for row i is (ni1 / ni., ..., nic / ni.); the profile for column j is (n1j / n.j, ..., nrj / n.j).

The average row profile is calculated from the column totals. Specifically, the average row profile is (n.1 / n, ..., n.c / n). Similarly, the average column profile is calculated from the row totals. Specifically, the average column profile is (n1. / n, ..., nr. / n).
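The profile definitions above can be sketched in Python as follows. The function names and the example counts are assumptions made for illustration, and the sketch assumes that no row or column total is zero.

```python
def row_profiles(table):
    """Profile of row i: each count in the row divided by the row total ni."""
    return [[count / sum(row) for count in row] for row in table]


def column_profiles(table):
    """Profile of column j: each count in the column divided by n.j.

    Returns one list per column.
    """
    return [[count / sum(col) for count in col] for col in zip(*table)]


def average_row_profile(table):
    """Column totals divided by the table total: (n.1 / n, ..., n.c / n)."""
    n = sum(sum(row) for row in table)
    return [sum(col) / n for col in zip(*table)]


def average_column_profile(table):
    """Row totals divided by the table total: (n1. / n, ..., nr. / n)."""
    n = sum(sum(row) for row in table)
    return [sum(row) / n for row in table]


# Hypothetical counts, used only to exercise the functions.
table = [
    [20, 10, 5, 15],
    [30, 20, 10, 40],
    [10, 30, 25, 35],
]

print(row_profiles(table)[0])
print(column_profiles(table)[0])
print(average_row_profile(table))
print(average_column_profile(table))
```

Each profile sums to 1, so profiles of rows with very different totals can be compared directly.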

Expected frequencies

Expected cell frequencies are calculated under the hypothesis that the row profiles, or equivalently the column profiles, are homogeneous. The expected frequency for the cell in row i and column j is calculated as follows:

eij = (ni. × n.j) / n
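A minimal Python sketch of this calculation follows. It assumes the standard homogeneity formula, in which the expected frequency is the row total times the column total divided by the table total. The function name expected_frequencies and the example counts are hypothetical.

```python
def expected_frequencies(table):
    """Expected count for each cell under homogeneity: (ni. * n.j) / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Hypothetical counts, used only to exercise the function.
table = [
    [20, 10, 5, 15],
    [30, 20, 10, 40],
    [10, 30, 25, 35],
]

for row in expected_frequencies(table):
    print(row)
```

Under this formula every row of expected frequencies has the same profile as the average row profile, which is what homogeneity means here.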

Chi-square values

The χ² value in the cell in row i and column j is calculated as follows:

χ² value for cell (i, j) = (nij − eij)² / eij

If the observed and expected cell frequencies differ greatly, the χ² value for the cell is large.

The χ² statistic is the sum of the χ² values in all the cells of the table. This statistic measures the discrepancy from homogeneity of the row profiles, or equivalently the column profiles. If the row (column) profiles are very different from each other, the χ² statistic is large. The χ² statistic can also be viewed as a measure of how far the row profiles (or equivalently the column profiles) are from the average row (column) profile.
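The cell values and their sum can be sketched in Python as follows. The sketch assumes the usual Pearson form, (observed − expected)² / expected, with expected frequencies computed as row total times column total divided by the table total. The function names and the example counts are hypothetical, and the sketch assumes that every expected frequency is greater than zero.

```python
def chi_square_cells(table):
    """Chi-square value per cell: (nij - eij)**2 / eij."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    cells = []
    for i, row in enumerate(table):
        cell_row = []
        for j, observed in enumerate(row):
            # Expected frequency under homogeneity of the profiles.
            expected = row_totals[i] * col_totals[j] / n
            cell_row.append((observed - expected) ** 2 / expected)
        cells.append(cell_row)
    return cells


def chi_square_statistic(table):
    """Sum of the chi-square values over all cells of the table."""
    return sum(sum(cell_row) for cell_row in chi_square_cells(table))


# Hypothetical counts, used only to exercise the functions.
table = [
    [20, 10, 5, 15],
    [30, 20, 10, 40],
    [10, 30, 25, 35],
]

for cell_row in chi_square_cells(table):
    print([round(value, 4) for value in cell_row])
print("chi-square statistic:", round(chi_square_statistic(table), 4))
```

A table whose rows are exact multiples of one another has identical row profiles, so every cell value and the statistic are zero.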

Notation

Term    Description
nij     observed frequency in the cell
eij     expected frequency in the cell