Overview for Cluster Variables

Use Cluster Variables to group variables into clusters that share common characteristics. Clustering variables allows you to reduce the number of variables for analysis. This analysis is appropriate when you do not have any initial information about how to form the groups.

For example, a social scientist uses Cluster Variables to study how the number of media outlets, the number of universities, and the literacy rate affect college admissions in a population. The scientist wants to reduce the total number of variables by combining variables with similar characteristics.

Cluster Variables uses a hierarchical procedure to form the clusters. Variables that are similar (correlated) to each other are grouped together. Initially, each variable forms its own cluster. At each step, two clusters are joined, until just one cluster remains at the final step. Minitab calculates similarity and distance values for the clusters at each step to help you select the final grouping of variables. You can also display a dendrogram to visualize the clustering results at each step.
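
The hierarchical procedure described above can be sketched outside of Minitab in a few lines of Python. This is only an illustration under stated assumptions, not Minitab's implementation: it assumes the distance between two variables is 1 minus their Pearson correlation, that clusters are joined by single linkage (the smallest pairwise distance), and that similarity is reported as 100 * (1 - distance / 2). The function names (pearson, cluster_variables) and the toy data are hypothetical and made up for this example.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def cluster_variables(data):
    """Agglomerative clustering of variables.

    data maps a variable name to its list of observations.
    Returns one record per step, from the first join to the final
    single cluster.
    """
    names = list(data)
    # Assumption: distance between two variables is 1 - correlation.
    dist = {
        frozenset((a, b)): 1.0 - pearson(data[a], data[b])
        for a, b in combinations(names, 2)
    }

    def linkage(c1, c2):
        # Assumption: single linkage, the smallest distance between
        # any variable in c1 and any variable in c2.
        return min(dist[frozenset((a, b))] for a in c1 for b in c2)

    clusters = [(name,) for name in names]  # each variable starts alone
    steps = []
    while len(clusters) > 1:
        # Join the two closest clusters at this step.
        c1, c2 = min(combinations(clusters, 2), key=lambda p: linkage(*p))
        d = linkage(c1, c2)
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
        steps.append({
            "clusters": len(clusters),
            "distance": d,
            # Assumption: similarity expressed on a 0 to 100 scale.
            "similarity": 100.0 * (1.0 - d / 2.0),
            "joined": (c1, c2),
        })
    return steps


# Made-up toy data: a and b move together, c moves the opposite way.
toy = {
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 11],
    "c": [5, 3, 4, 1, 2],
}
steps = cluster_variables(toy)
for step in steps:
    print(step)
```

Each printed record corresponds to one step of the procedure: the number of clusters that remain, the distance and similarity at which the join occurred, and the two clusters that were joined. A large drop in similarity from one step to the next suggests stopping at the earlier step.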

Where to find this analysis

To cluster variables, choose Stat > Multivariate > Cluster Variables.

When to use an alternate analysis

  • To calculate pairwise correlations across a group of variables, use Correlation.
  • To create new variables (principal components) that are linear combinations of the observed variables, use Principal Components Analysis.
  • If you want to group observations instead of variables, and you do not have any initial information about how to form the groups, use Cluster Observations.
  • If you want to group observations instead of variables, and you have sufficient information to make good starting cluster designations, use Cluster K-Means.