In Variables or distance matrix, enter either the columns that contain measurement data or a stored distance matrix that contains the distances between all pairs of variables.
If you enter a stored distance matrix, Minitab cannot calculate statistics for the final partition.
For measurement data, you must have two or more numeric columns, and each column must represent a different measurement. Delete rows that have missing data from the worksheet before you perform this analysis. If you have many rows of data, you may want to subset your worksheet to exclude the rows with missing values. For more information, go to Overview for Subset Worksheet.
You cannot enter a categorical variable for this analysis. If you have a categorical variable, you must first convert the text values to a numerical scale, or you must perform a separate analysis for each level of the categorical variable. For more information, go to Data considerations for Cluster Variables.
For the stored distance matrix, the entry in row i and column j of distance matrix D is the distance between variables i and j. For information on creating and using stored matrices in Minitab, go to Overview for Matrices.
C1 | C2 | C3 | C4 | C5 |
---|---|---|---|---|
Newspaper | Radio | TV Sets | Literacy Rate | University |
279 | 267 | 227 | 0.98 | 1 |
143 | 112 | 332 | 0.94 | 1 |
9 | 113 | 7 | 0.25 | 0 |
391 | 314 | 566 | 0.99 | 1 |
112 | 48 | 423 | 0.82 | 1 |
67 | 66 | 134 | 0.45 | 0 |
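To make the idea of a distance matrix between variables concrete, the following Python sketch builds one from the worksheet above. It assumes a correlation-based distance, d_ij = 1 - r_ij, where r_ij is the correlation between columns i and j. The names `names`, `data`, and `D` are illustrative only, and the code is a sketch under these assumptions, not Minitab's implementation.

```python
import numpy as np

# Worksheet columns C1-C5 from the table above (one row per observation).
names = ["Newspaper", "Radio", "TV Sets", "Literacy Rate", "University"]
data = np.array([
    [279, 267, 227, 0.98, 1],
    [143, 112, 332, 0.94, 1],
    [9,   113, 7,   0.25, 0],
    [391, 314, 566, 0.99, 1],
    [112, 48,  423, 0.82, 1],
    [67,  66,  134, 0.45, 0],
])

# Correlation between every pair of columns (variables, not rows).
r = np.corrcoef(data, rowvar=False)

# Assumed distance measure: d_ij = 1 - r_ij.
D = 1.0 - r
np.fill_diagonal(D, 0.0)  # clear floating-point noise on the diagonal

# Entry D[i, j] is the distance between variables i and j.
for name, row in zip(names, D):
    print(f"{name:>13}", np.round(row, 3))
```

The result is a symmetric 5 x 5 matrix with zeros on the diagonal, which is the shape a stored distance matrix for five variables would need to have.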
From Linkage method, select a method to specify how the distance between two clusters is defined. You might want to try several linkage methods to see which method provides the most useful results for your data.
For Cluster Observations, distance refers to the distance between observations, and linkage refers to the distance between the clusters of observations. For Cluster Variables, distance refers to the distance between variables, and linkage refers to the distance between the clusters of variables.
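The effect of the linkage method can be seen outside Minitab with a small sketch. The example below uses SciPy's hierarchical clustering on a hypothetical 4 x 4 distance matrix (the values are made up for illustration) and compares three common linkage methods. It is meant to show the idea of linkage, and is not a reproduction of Minitab's output.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import squareform

# Hypothetical distances between four variables (symmetric, zero diagonal).
D = np.array([
    [0.0, 0.2, 0.7, 0.9],
    [0.2, 0.0, 0.6, 0.8],
    [0.7, 0.6, 0.0, 0.3],
    [0.9, 0.8, 0.3, 0.0],
])

# SciPy expects the condensed (upper-triangle) form of the matrix.
condensed = squareform(D)

results = {}
for method in ("single", "complete", "average"):
    Z = linkage(condensed, method=method)
    results[method] = Z
    # Column 2 of Z holds the distance at which each merge happens.
    print(method, np.round(Z[:, 2], 3))
```

With these values, all three methods join the same pairs first (at distances 0.2 and 0.3) and differ only in the distance assigned to the final merge: 0.6 for single, 0.9 for complete, and 0.75 for average linkage. On real data the methods can also disagree about which clusters are joined.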
For the best results, be flexible with the criteria that define the final partition. For example, if you define the final partition by the number of clusters, also consider the changes in similarity level. A precipitous drop in similarity at a particular step might prompt you to specify the final partition before that grouping occurs. Conversely, if you define the final partition by the similarity level, you might find that similarity levels do not change much over a range of clusters, and for the sake of simplicity you may choose the step with the fewest clusters.
If you do not know what value to enter to specify the final partition, first perform the analysis using the default setting (1 cluster in the final partition). Minitab displays the results for all possible numbers of clusters. Use the results to determine a value to enter for the final partition. Then repeat the analysis and specify the final partition that you determined. For more information, go to Determine the final grouping of clusters.
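The two-pass workflow described above (inspect every step, then fix the final partition) can be sketched with SciPy. The distance matrix is hypothetical, and the similarity value is computed with an assumed rescaling, 100 * (1 - d / d_max), chosen only for illustration; it is not necessarily the formula Minitab uses.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

# Hypothetical distances between four variables.
D = np.array([
    [0.0, 0.2, 0.7, 0.9],
    [0.2, 0.0, 0.6, 0.8],
    [0.7, 0.6, 0.0, 0.3],
    [0.9, 0.8, 0.3, 0.0],
])
Z = linkage(squareform(D), method="average")

# Pass 1: list every amalgamation step. Step k leaves n - k clusters.
n = D.shape[0]
for step, height in enumerate(Z[:, 2], start=1):
    similarity = 100 * (1 - height / D.max())  # assumed rescaling, illustration only
    print(f"step {step}: {n - step} clusters, "
          f"distance {height:.2f}, similarity {similarity:.1f}")

# Pass 2: similarity drops sharply at the last step, so stop at 2 clusters.
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)
```

Here the last merge happens at a much larger distance than the first two, which is the kind of sharp change that suggests cutting the tree just before that step.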
Select this option to display a dendrogram, a tree diagram that shows how clusters were formed at each step in the amalgamation procedure. The dendrogram allows you to view the similarity (or distance) values for the clusters at each step.
To change the default display of the dendrogram, click Customize.
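For readers who want to see what information a dendrogram encodes, the sketch below uses SciPy's `dendrogram` function on a hypothetical distance matrix. The variable names `A` through `D` and the distances are made up for illustration. Passing `no_plot=True` returns the tree layout as data rather than drawing it, so no plotting library is needed.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram
from scipy.spatial.distance import squareform

# Hypothetical variable names and distances.
names = ["A", "B", "C", "D"]
dist = np.array([
    [0.0, 0.2, 0.7, 0.9],
    [0.2, 0.0, 0.6, 0.8],
    [0.7, 0.6, 0.0, 0.3],
    [0.9, 0.8, 0.3, 0.0],
])
Z = linkage(squareform(dist), method="average")

# no_plot=True computes the tree layout without drawing anything.
tree = dendrogram(Z, labels=names, no_plot=True)

# Leaf order along the horizontal axis.
print(tree["ivl"])

# Each merge is drawn as a bracket; its top sits at the merge distance.
for heights in tree["dcoord"]:
    print(max(heights))
```

Each bracket in the tree corresponds to one amalgamation step, and its height is the distance at which the two clusters were joined.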