Example of Simple Correspondence Analysis

A university research manager wants to determine how ten academic disciplines compare to each other in relation to five different funding categories. The manager collects two-way classification data for 796 researchers.

For this two-way classification, the academic disciplines are rows and the funding categories are columns. A is the highest funding category, D is the lowest, and category E is unfunded. The manager performs a simple correspondence analysis to represent the associations between the rows and columns.

The manager also wants to examine supplementary data not included in the main data set. The supplementary data includes an additional row for museum researchers and a row for mathematical sciences, which is the sum of Mathematics and Statistics.

  1. Open the sample data set, ResearchFunding.MTW.
  2. Choose Stat > Multivariate > Simple Correspondence Analysis.
  3. Under Input Data, select Columns of a contingency table and enter CT1-CT5. In Row names, enter RowNames. In Column names, enter ColNames.
  4. Click Results and select Row profiles. Click OK.
  5. Click Supp Data. In Supplementary Rows, enter RowSupp1 RowSupp2. In Row names, enter RSNames. Click OK.
  6. Click Graphs. Select Show supplementary points in all plots. Select Symmetric plot showing rows only and Asymmetric row plot showing rows and columns.
  7. Click OK in each dialog box.

Interpret the results

The Row Profiles table shows, for each row category, the proportion of its researchers that fall in each column category. For example, for Geology, 3.5% of the researchers are in funding category A, 22.4% are in funding category B, and so on. The mass for each row indicates the proportion of all researchers in the data set that belong to that row. For example, the mass for Geology is 0.107, which indicates that 10.7% of the researchers are in the Geology field.
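The relationship between the row profiles, the row masses, and the bottom Mass row can be checked with a short sketch. It uses only the rounded values printed in the Row Profiles table; the variable names are illustrative, and small differences in the third decimal place are expected from rounding.

```python
# Row profiles (funding categories A-E) and row masses, copied from the
# Row Profiles table. Variable names are illustrative only.
profiles = {
    "Geology":      [0.035, 0.224, 0.459, 0.165, 0.118],
    "Biochemistry": [0.034, 0.069, 0.448, 0.034, 0.414],
    "Chemistry":    [0.046, 0.192, 0.377, 0.162, 0.223],
    "Zoology":      [0.025, 0.125, 0.342, 0.292, 0.217],
    "Physics":      [0.088, 0.193, 0.412, 0.079, 0.228],
    "Engineering":  [0.034, 0.125, 0.284, 0.170, 0.386],
    "Microbiology": [0.027, 0.162, 0.378, 0.135, 0.297],
    "Botany":       [0.000, 0.140, 0.395, 0.198, 0.267],
    "Statistics":   [0.069, 0.172, 0.379, 0.138, 0.241],
    "Mathematics":  [0.026, 0.141, 0.474, 0.103, 0.256],
}
row_mass = {
    "Geology": 0.107, "Biochemistry": 0.036, "Chemistry": 0.163,
    "Zoology": 0.151, "Physics": 0.143, "Engineering": 0.111,
    "Microbiology": 0.046, "Botany": 0.108, "Statistics": 0.036,
    "Mathematics": 0.098,
}
reported_column_mass = [0.039, 0.161, 0.389, 0.162, 0.249]

# Each row profile is a set of proportions, so it sums to about 1.
profile_sums = {name: sum(values) for name, values in profiles.items()}

# Weighting each row profile by its row mass and adding the results gives
# the average row profile, which is the bottom "Mass" row of the table.
average_profile = [
    sum(row_mass[name] * profiles[name][j] for name in profiles)
    for j in range(5)
]

for label, computed, reported in zip("ABCDE", average_profile, reported_column_mass):
    print(f"{label}: computed {computed:.3f}, reported {reported:.3f}")
```

The computed averages agree with the reported column masses to within rounding, which is why the Mass row can be read as the funding profile of the whole data set.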

You can use the values in the Row Contributions and Column Contributions tables to interpret the different components. The column labeled Qual, or quality, indicates the proportion of each row's or column's inertia that is represented by the two components.
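As a quick check on how Qual is built up, the sketch below adds the two Corr values for each row and column and compares the sum with the reported Qual. It assumes the usual correspondence analysis convention that quality is the sum of the Corr values over the retained components; the numbers are the rounded values from the contribution tables.

```python
# name: (reported Qual, Corr on component 1, Corr on component 2),
# copied from the Row Contributions and Column Contributions tables.
rows = {
    "Geology":      (0.916, 0.055, 0.861),
    "Biochemistry": (0.881, 0.119, 0.762),
    "Chemistry":    (0.644, 0.134, 0.510),
    "Zoology":      (0.929, 0.846, 0.083),
    "Physics":      (0.886, 0.880, 0.006),
    "Engineering":  (0.870, 0.121, 0.749),
    "Microbiology": (0.680, 0.009, 0.671),
    "Botany":       (0.654, 0.625, 0.029),
    "Statistics":   (0.561, 0.554, 0.007),
    "Mathematics":  (0.319, 0.240, 0.079),
}
columns = {
    "A": (0.587, 0.574, 0.013),
    "B": (0.816, 0.286, 0.531),
    "C": (0.465, 0.341, 0.124),
    "D": (0.968, 0.859, 0.109),
    "E": (0.990, 0.012, 0.978),
}

def quality(entry):
    """Assumed definition: quality is the sum of Corr over both components."""
    _, corr1, corr2 = entry
    return corr1 + corr2

# Largest gap between the recomputed and the reported Qual (rounding only).
largest_gap = max(
    abs(quality(entry) - entry[0])
    for table in (rows, columns)
    for entry in table.values()
)

best_row = max(rows, key=lambda name: quality(rows[name]))
worst_row = min(rows, key=lambda name: quality(rows[name]))
print(best_row, worst_row, round(largest_gap, 3))
```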

For example, for the row contributions, Zoology (0.929) and Geology (0.916) are the fields best represented by the two-component breakdown. Mathematics has the poorest representation, with a quality value of 0.319. For the column contributions, the two components explain most of the variability in funding categories B, D, and E (quality values of 0.816, 0.968, and 0.990). The funded categories A, B, C, and D together account for nearly all of component 1, led by D (0.632) and A (0.228), while the unfunded category, E, contributes most to component 2 (0.699).
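The Contr values for the funding categories can be approximately reproduced from the other columns of the output. The sketch below assumes the standard correspondence analysis formula, contribution = mass × coordinate² / axis inertia, which this example does not state explicitly; the function and variable names are illustrative. Because the inputs are rounded to three or four decimals, agreement is only to about the third decimal place.

```python
# Axis inertias from the Analysis of Contingency Table output.
axis_inertia = {1: 0.0391, 2: 0.0304}

# name: (mass, coord 1, coord 2, reported Contr 1, reported Contr 2),
# copied from the Column Contributions table.
columns = {
    "A": (0.039, -0.478, -0.072, 0.228, 0.007),
    "B": (0.161, -0.127, -0.173, 0.067, 0.159),
    "C": (0.389, -0.083, -0.050, 0.068, 0.032),
    "D": (0.162,  0.390, -0.139, 0.632, 0.103),
    "E": (0.249,  0.032,  0.292, 0.006, 0.699),
}

def contribution(mass, coord, inertia):
    """Assumed formula: share of an axis's inertia due to one point."""
    return mass * coord ** 2 / inertia

computed = {
    name: (
        contribution(mass, c1, axis_inertia[1]),
        contribution(mass, c2, axis_inertia[2]),
    )
    for name, (mass, c1, c2, _, _) in columns.items()
}

# Combined share of component 1 due to the funded categories A-D.
funded_share_axis1 = sum(computed[name][0] for name in "ABCD")

for name, (contr1, contr2) in computed.items():
    print(f"{name}: Contr1 = {contr1:.3f}, Contr2 = {contr2:.3f}")
print(f"Funded categories on component 1: {funded_share_axis1:.3f}")
```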

The row plot shows the row principal coordinates. Component 1, which best explains Zoology and Physics, places these two fields farthest from the origin, with opposite signs. Therefore, component 1 contrasts the biological sciences Zoology and Botany with Physics. Component 2 contrasts Biochemistry and Engineering with Geology.

In the asymmetric row plot, the rows are scaled in principal coordinates and the columns are scaled in standard coordinates. Among funding classes, component 1 contrasts levels of funding, while component 2 contrasts being funded (A to D) with not being funded (E). Among the disciplines, Physics tends to have the highest funding level and Zoology has the lowest. Biochemistry tends to fall in the middle of the funding levels, but has the highest proportion of unfunded researchers. Museums tend to be funded, but at a lower level than academic researchers.
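The two scalings in the asymmetric row plot can be related with a small sketch. It assumes the standard correspondence analysis definitions, which are not spelled out in this example: a standard coordinate is the principal coordinate divided by the square root of the axis inertia, and each row's principal coordinate is the average of the column standard coordinates weighted by the row profile. Inputs are the rounded values from the output tables, and only three of the ten rows are checked.

```python
import math

# Axis inertias for components 1 and 2.
axis_inertia = (0.0391, 0.0304)

# Column principal coordinates from the Column Contributions table.
column_principal = {
    "A": (-0.478, -0.072),
    "B": (-0.127, -0.173),
    "C": (-0.083, -0.050),
    "D": (0.390, -0.139),
    "E": (0.032, 0.292),
}

# Assumed rescaling: standard = principal / sqrt(axis inertia).
column_standard = {
    name: tuple(c / math.sqrt(lam) for c, lam in zip(coords, axis_inertia))
    for name, coords in column_principal.items()
}

# Row profiles for three disciplines, from the Row Profiles table.
row_profiles = {
    "Geology": [0.035, 0.224, 0.459, 0.165, 0.118],
    "Zoology": [0.025, 0.125, 0.342, 0.292, 0.217],
    "Physics": [0.088, 0.193, 0.412, 0.079, 0.228],
}
reported_row_principal = {
    "Geology": (-0.076, -0.303),
    "Zoology": (0.327, -0.102),
    "Physics": (-0.316, -0.027),
}

def row_principal(profile):
    """Profile-weighted average of the column standard coordinates."""
    return tuple(
        sum(p * column_standard[label][k] for p, label in zip(profile, "ABCDE"))
        for k in range(2)
    )

computed_rows = {name: row_principal(p) for name, p in row_profiles.items()}
for name, (c1, c2) in computed_rows.items():
    print(f"{name}: ({c1:.3f}, {c2:.3f}) vs reported {reported_row_principal[name]}")
```

Read this way, a discipline is pulled toward the funding categories that make up a large share of its profile, which is what makes the asymmetric plot interpretable as rows positioned relative to columns.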

Row Profiles

              A      B      C      D      E      Mass
Geology       0.035  0.224  0.459  0.165  0.118  0.107
Biochemistry  0.034  0.069  0.448  0.034  0.414  0.036
Chemistry     0.046  0.192  0.377  0.162  0.223  0.163
Zoology       0.025  0.125  0.342  0.292  0.217  0.151
Physics       0.088  0.193  0.412  0.079  0.228  0.143
Engineering   0.034  0.125  0.284  0.170  0.386  0.111
Microbiology  0.027  0.162  0.378  0.135  0.297  0.046
Botany        0.000  0.140  0.395  0.198  0.267  0.108
Statistics    0.069  0.172  0.379  0.138  0.241  0.036
Mathematics   0.026  0.141  0.474  0.103  0.256  0.098
Mass          0.039  0.161  0.389  0.162  0.249

Analysis of Contingency Table

Axis   Inertia  Proportion  Cumulative
1      0.0391   0.4720      0.4720
2      0.0304   0.3666      0.8385
3      0.0109   0.1311      0.9697
4      0.0025   0.0303      1.0000
Total  0.0829
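The Proportion and Cumulative columns follow directly from the axis inertias, as the sketch below illustrates with the rounded values from the table. The last line relies on the standard identity that total inertia equals the chi-square statistic divided by the number of observations; the chi-square value is therefore a derived approximation, not a figure reported in this example.

```python
# Axis inertias from the Analysis of Contingency Table output.
axis_inertia = [0.0391, 0.0304, 0.0109, 0.0025]
n_researchers = 796  # sample size stated in the example

total_inertia = sum(axis_inertia)

# Share of the total inertia explained by each axis.
proportion = [value / total_inertia for value in axis_inertia]

# Running total of the proportions.
cumulative = []
running = 0.0
for share in proportion:
    running += share
    cumulative.append(running)

# Assumed identity: chi-square = n * total inertia (approximate, from rounded inputs).
chi_square = n_researchers * total_inertia

for axis, (share, cum) in enumerate(zip(proportion, cumulative), start=1):
    print(f"Axis {axis}: proportion {share:.4f}, cumulative {cum:.4f}")
print(f"Total inertia {total_inertia:.4f}, approximate chi-square {chi_square:.2f}")
```

The first two axes together carry about 84% of the total inertia, which is the basis for interpreting only components 1 and 2.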

Row Contributions

                                        Component 1           Component 2
ID  Name          Qual   Mass   Inert   Coord  Corr   Contr   Coord  Corr   Contr
 1  Geology       0.916  0.107  0.137  -0.076  0.055  0.016  -0.303  0.861  0.322
 2  Biochemistry  0.881  0.036  0.119  -0.180  0.119  0.030   0.455  0.762  0.248
 3  Chemistry     0.644  0.163  0.021  -0.038  0.134  0.006  -0.073  0.510  0.029
 4  Zoology       0.929  0.151  0.230   0.327  0.846  0.413  -0.102  0.083  0.052
 5  Physics       0.886  0.143  0.196  -0.316  0.880  0.365  -0.027  0.006  0.003
 6  Engineering   0.870  0.111  0.152   0.117  0.121  0.039   0.292  0.749  0.310
 7  Microbiology  0.680  0.046  0.010  -0.013  0.009  0.000   0.110  0.671  0.018
 8  Botany        0.654  0.108  0.067   0.179  0.625  0.088   0.039  0.029  0.005
 9  Statistics    0.561  0.036  0.012  -0.125  0.554  0.014  -0.014  0.007  0.000
10  Mathematics   0.319  0.098  0.056  -0.107  0.240  0.029   0.061  0.079  0.012

Supplementary Rows

                                        Component 1           Component 2
ID  Name          Qual   Mass   Inert   Coord  Corr   Contr   Coord  Corr   Contr
 1  Museums       0.556  0.067  0.353   0.314  0.225  0.168  -0.381  0.331  0.318
 2  MathSci       0.559  0.134  0.041  -0.112  0.493  0.043   0.041  0.066  0.007
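Because the MathSci supplementary row is the sum of Mathematics and Statistics, its mass and coordinates can be approximately recovered from those two rows. The sketch below assumes that pooling two rows adds their masses and places the pooled point at the mass-weighted average of their principal coordinates; this follows from the coordinates being linear in the row profile, but it is not stated in the example itself. Names such as `pool` are illustrative.

```python
# (mass, coord on component 1, coord on component 2),
# copied from the Row Contributions table.
math_row = (0.098, -0.107, 0.061)
stats_row = (0.036, -0.125, -0.014)

# Reported values for the MathSci supplementary row.
reported_mathsci = (0.134, -0.112, 0.041)

def pool(a, b):
    """Combine two rows: add masses, mass-weight the coordinates."""
    mass = a[0] + b[0]
    coords = tuple((a[0] * a[k] + b[0] * b[k]) / mass for k in (1, 2))
    return (mass,) + coords

pooled = pool(math_row, stats_row)
print("pooled:  ", tuple(round(value, 3) for value in pooled))
print("reported:", reported_mathsci)
```

The pooled point lies between Mathematics and Statistics and closer to Mathematics, which has the larger mass.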

Column Contributions

                                        Component 1           Component 2
ID  Name          Qual   Mass   Inert   Coord  Corr   Contr   Coord  Corr   Contr
 1  A             0.587  0.039  0.187  -0.478  0.574  0.228  -0.072  0.013  0.007
 2  B             0.816  0.161  0.110  -0.127  0.286  0.067  -0.173  0.531  0.159
 3  C             0.465  0.389  0.094  -0.083  0.341  0.068  -0.050  0.124  0.032
 4  D             0.968  0.162  0.347   0.390  0.859  0.632  -0.139  0.109  0.103
 5  E             0.990  0.249  0.262   0.032  0.012  0.006   0.292  0.978  0.699