Enter your data for Principal Components Analysis

Stat > Multivariate > Principal Components

Specify the data for your analysis, enter the number of components to calculate, and specify the type of matrix.

Enter your data

In Variables, enter the columns of data that you want to analyze. You must have at least two columns of numeric data, and each column must represent a different measurement. If a row has a missing value in any of these columns, Minitab omits that entire row from the calculation of the correlation or covariance matrix.
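The row-wise handling of missing values described above can be sketched outside Minitab as follows. This is a minimal illustration in Python with NumPy, not Minitab's own implementation; the array `data` and its values are hypothetical, and `np.nan` stands in for a missing value.

```python
import numpy as np

# Hypothetical worksheet: 5 rows, 3 variables; np.nan marks a missing value.
data = np.array([
    [50000.0, 16.0, 28.0],
    [72000.0, np.nan, 35.0],
    [61000.0, 18.0, 36.0],
    [88000.0, 20.0, np.nan],
    [91100.0, 18.0, 38.0],
])

# A row is kept only if none of its columns is missing.
complete_rows = ~np.isnan(data).any(axis=1)
complete = data[complete_rows]

# Correlation matrix computed from the complete rows only
# (rowvar=False because the variables are in columns).
corr = np.corrcoef(complete, rowvar=False)

print(complete.shape)  # rows 2 and 4 are dropped, leaving 3 rows
```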

In this worksheet, each column contains measurements for a different type of information on a loan application.

C1      C2         C3   C4         C5      C6       C7     C8
Income  Education  Age  Residence  Employ  Savings  Debt   Credit cards
50000   16         28   2          2       5000     1200   2
72000   18         35   10         8       12000    5400   4
61000   18         36   6          5       15000    1000   2
88000   20         35   4          4       980      1100   4
91100   18         38   8          9       20000    0      1
45100   14         41   15         14      3900     22000  4

Number of components to compute

Enter the number of principal components that you want Minitab to calculate. If you have many variables, you can specify a smaller number of components to reduce the amount of output. If you do not know how many components to enter, leave this field blank. Minitab then calculates the maximum number of components, which equals the number of variables. You can then use the output to determine how many components explain most of the variation in the original variables.
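To illustrate how the full set of components can guide this choice, the following sketch computes every component's share of the variation for the six worksheet rows shown earlier. It assumes Python with NumPy and uses the standard eigenvalue decomposition of the correlation matrix; it is not Minitab's own code, and the variable names are illustrative.

```python
import numpy as np

# The six rows of the loan-application worksheet shown earlier
# (Income, Education, Age, Residence, Employ, Savings, Debt, Credit cards).
worksheet = np.array([
    [50000, 16, 28,  2,  2,  5000,  1200, 2],
    [72000, 18, 35, 10,  8, 12000,  5400, 4],
    [61000, 18, 36,  6,  5, 15000,  1000, 2],
    [88000, 20, 35,  4,  4,   980,  1100, 4],
    [91100, 18, 38,  8,  9, 20000,     0, 1],
    [45100, 14, 41, 15, 14,  3900, 22000, 4],
], dtype=float)

# One eigenvalue per variable, so 8 components in total.
corr = np.corrcoef(worksheet, rowvar=False)
eigenvalues = np.linalg.eigvalsh(corr)[::-1]   # sort largest first
eigenvalues = np.clip(eigenvalues, 0.0, None)  # remove tiny negative rounding errors

# Proportion of total variation explained by each component, and the running total.
proportion = eigenvalues / eigenvalues.sum()
cumulative = np.cumsum(proportion)

for k, (p, c) in enumerate(zip(proportion, cumulative), start=1):
    print(f"PC{k}: proportion={p:.3f}  cumulative={c:.3f}")
```

A common way to read this kind of output is to keep the smallest number of components whose cumulative proportion reaches a level you consider adequate. Because this excerpt has only six rows, at most five of the eight components can have nonzero variance.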

Type of matrix

Select the type of matrix to use to calculate the principal components.

  • Correlation: Use when your variables have different scales and you want to weight all the variables equally. For example, if some of the variables use a scale from 1-5 and others use a scale from 1-10, use the correlation matrix to standardize the scales.
  • Covariance: Use when your variables use the same scale, or when your variables have different scales but you want to give more emphasis to variables with higher variances.
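The effect of this choice can be seen on two columns of the worksheet shown earlier, Income (dollars) and Education (years), whose variances differ by many orders of magnitude. The sketch below assumes Python with NumPy; the helper `first_component` is a hypothetical name introduced only for this illustration.

```python
import numpy as np

# Income and Education columns from the worksheet shown earlier.
x = np.array([
    [50000.0, 16.0],
    [72000.0, 18.0],
    [61000.0, 18.0],
    [88000.0, 20.0],
    [91100.0, 18.0],
    [45100.0, 14.0],
])

def first_component(matrix):
    """Return the coefficients of the first principal component."""
    values, vectors = np.linalg.eigh(matrix)  # eigenvalues in ascending order
    return vectors[:, -1]                     # eigenvector of the largest eigenvalue

cov_pc1 = first_component(np.cov(x, rowvar=False))
corr_pc1 = first_component(np.corrcoef(x, rowvar=False))

# Covariance: Income dominates because its variance is far larger.
print("covariance PC1 coefficients: ", np.abs(cov_pc1))
# Correlation: both variables receive coefficients of equal magnitude.
print("correlation PC1 coefficients:", np.abs(corr_pc1))
```

With the covariance matrix, the first component is almost entirely Income. With the correlation matrix, the two variables contribute equally.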

For example, suppose you count different species of organisms at several sample sites. If you select the covariance matrix, the more common species show higher variances and are given more emphasis, so very rare species do not affect the analysis as much. If you select the correlation matrix, all species are weighted equally, so very rare species may contribute significantly to the results. The choice therefore depends on the objective of your study.
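As a rough numerical illustration of the species example, the sketch below compares each species' share of the total variance under the two matrices. It assumes Python with NumPy, and the counts are invented for illustration only.

```python
import numpy as np

# Hypothetical counts at five sites (rows) for three species (columns):
# a common species, a moderately common one, and a very rare one.
counts = np.array([
    [120.0, 30.0, 0.0],
    [ 85.0, 42.0, 1.0],
    [150.0, 25.0, 0.0],
    [ 60.0, 38.0, 2.0],
    [110.0, 33.0, 0.0],
])

cov = np.cov(counts, rowvar=False)
corr = np.corrcoef(counts, rowvar=False)

# Each species' share of the total variance that the components partition.
cov_share = np.diag(cov) / np.trace(cov)
corr_share = np.diag(corr) / np.trace(corr)

print("share under covariance: ", np.round(cov_share, 3))
print("share under correlation:", np.round(corr_share, 3))
```

Under the covariance matrix, the common species accounts for nearly all of the total variance. Under the correlation matrix, each species accounts for exactly one third, so the rare species carries as much weight as the common one.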