Amalgamation steps

Find definitions and interpretation for every statistic that is provided in the results for the amalgamation steps.

Step

The number of the step in the amalgamation procedure. At each step, two clusters are joined to form a new cluster, and the similarity level and distance level at which they are joined are calculated.

Number of clusters

The number of clusters that remain after each step of the amalgamation process. Before the first step, the number of clusters equals the total number of observations (for cluster observations) or the total number of variables (for cluster variables). In the first step, two clusters are joined to form a new cluster. At each subsequent step, two more clusters are joined, so the number of clusters decreases by one per step. At the final step, all the observations or variables are combined into a single cluster.
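As a rough illustration of how the cluster count falls from the number of observations to one, the following Python sketch runs a small amalgamation by hand. It assumes single linkage and Euclidean distance, and the four points are made up for the example. The helper `amalgamate` is hypothetical and is not Minitab's implementation.

```python
from itertools import combinations
from math import dist


def amalgamate(points):
    """Single-linkage agglomerative clustering on a list of coordinate tuples.

    Returns one row per step:
    (step, number_of_clusters, distance_level, joined_pair, new_cluster, size).
    """
    # Before the first step, every observation is its own cluster (1-based IDs).
    clusters = {i + 1: [p] for i, p in enumerate(points)}
    rows = []
    step = 0
    while len(clusters) > 1:
        step += 1

        # Single linkage (an assumption): the distance between two clusters
        # is the smallest distance between any pair of their members.
        def linkage(pair):
            a, b = pair
            return min(dist(p, q) for p in clusters[a] for q in clusters[b])

        # Join the closest pair; combinations() yields pairs with a < b.
        a, b = min(combinations(sorted(clusters), 2), key=linkage)
        level = linkage((a, b))
        # The merged cluster keeps the smaller identification number.
        clusters[a] = clusters[a] + clusters.pop(b)
        rows.append((step, len(clusters), level, (a, b), a, len(clusters[a])))
    return rows


# Hypothetical data: four observations in two dimensions.
rows = amalgamate([(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 3.0)])
for row in rows:
    print(row)
```

With four observations the sketch takes three steps, and the number of clusters goes 3, 2, 1. A different linkage method or distance measure would change which clusters are joined and at what distance level.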

You can enter the number of clusters in the main dialog box to specify the final partition of your data. Your choice of linkage method and distance measure greatly influences the clustering outcome.

Similarity level

The similarity between the two clusters that are joined at each amalgamation step, expressed as a percentage. It is based on the distance between the clusters relative to the maximum inter-observation distance in the data. The similarity, s(ij), between two clusters i and j is given by s(ij) = 100 * [1 - d(ij) / d(max)], where d(ij) is the distance between i and j and d(max) is the maximum value in the original distance matrix, D.
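The formula can be checked against the example table in this section with a few lines of Python. This is only a sketch: the function name `similarity_level` is hypothetical, and taking the final-step distance level (4.78739) as d(max) is an assumption inferred from the similarity of 0.0000 reported at that step.

```python
def similarity_level(d_ij, d_max):
    """Similarity as a percentage: 100 * (1 - d_ij / d_max)."""
    if d_max <= 0:
        raise ValueError("d_max must be positive")
    return 100.0 * (1.0 - d_ij / d_max)


# Assumption: d(max) equals the distance level at the final step of the table.
d_max = 4.78739

s_first = similarity_level(0.16275, d_max)  # distance level at step 1
s_last = similarity_level(4.78739, d_max)   # distance level at step 19

print(round(s_first, 2), round(s_last, 2))
```

The first value agrees with the tabulated 96.6005 to within rounding of the printed distance levels, and the last value is 0.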

Interpretation

Use the similarity level for the clusters that are joined at each step to help determine the final groupings for the data. Look for an abrupt change in the similarity level between steps. The step that precedes the abrupt change in similarity may provide a good cut-off point for the final partition. For the final partition, the clusters should have a reasonably high similarity level. You should also use your practical knowledge of the data to determine the final groupings that make the most sense for your application.

For example, the following amalgamation table shows that the similarity level decreases by increments of approximately 3 or less until step 15. Between steps 16 and 17, when the number of clusters changes from 4 to 3, the similarity decreases by more than 20 (from 62.0036 to 41.0474). These results indicate that 4 clusters may be sufficient for the final partition. If this grouping makes intuitive sense, then it is probably a good choice.

Amalgamation Steps

      Number of  Similarity  Distance  Clusters      New  Number of obs.
Step   clusters       level     level    joined  cluster  in new cluster
   1         19     96.6005   0.16275   13   16       13               2
   2         18     95.4642   0.21715   17   20       17               2
   3         17     95.2648   0.22669    6    9        6               2
   4         16     92.9178   0.33905   17   18       17               3
   5         15     90.5296   0.45339   11   15       11               2
   6         14     90.3124   0.46378   12   19       12               2
   7         13     88.2431   0.56285    2   14        2               2
   8         12     88.2431   0.56285    5    8        5               2
   9         11     85.9744   0.67146    6   10        6               3
  10         10     83.0639   0.81080    7   13        7               3
  11          9     83.0639   0.81080    1    3        1               2
  12          8     81.4039   0.89027    2   17        2               5
  13          7     79.8185   0.96617    6   11        6               5
  14          6     78.7534   1.01716    4   12        4               3
  15          5     66.2112   1.61760    2    5        2               7
  16          4     62.0036   1.81904    1    6        1               7
  17          3     41.0474   2.82229    1    4        1              10
  18          2     40.1718   2.86421    2    7        2              10
  19          1      0.0000   4.78739    1    2        1              20

Tip

To visually assess the similarity levels at each step, use the dendrogram.

Distance level

The distance between clusters (using the chosen linkage method) or variables (using the chosen distance measure) that are joined at each step. Minitab calculates the distance level based on the linkage method and the distance measure that you select in the main dialog box.

Interpretation

Use the distance level for the clusters that are joined at each step to help determine the final groupings for the data. Look for an abrupt change in the distance level between steps. The step that precedes the abrupt change in distance may provide a good cut-off point for the final partition. For the final partition, the clusters should have a reasonably small distance level. You should also use your practical knowledge of the data to determine the final groupings that make the most sense for your application.
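One way to make "look for an abrupt change" concrete is to scan the step-to-step increases in distance level and stop at the first one that exceeds a chosen threshold. The Python sketch below does this with the distance levels from the example table in this section. The function `first_abrupt_jump` and the threshold of 1.0 are assumptions made for illustration. What counts as abrupt depends on the data, so treat the result as a starting point.

```python
# (step, number_of_clusters, distance_level) from the example table.
steps = [
    (1, 19, 0.16275), (2, 18, 0.21715), (3, 17, 0.22669), (4, 16, 0.33905),
    (5, 15, 0.45339), (6, 14, 0.46378), (7, 13, 0.56285), (8, 12, 0.56285),
    (9, 11, 0.67146), (10, 10, 0.81080), (11, 9, 0.81080), (12, 8, 0.89027),
    (13, 7, 0.96617), (14, 6, 1.01716), (15, 5, 1.61760), (16, 4, 1.81904),
    (17, 3, 2.82229), (18, 2, 2.86421), (19, 1, 4.78739),
]


def first_abrupt_jump(rows, threshold):
    """Return the row that precedes the first increase larger than threshold.

    Returns None if no step-to-step increase exceeds the threshold.
    """
    for prev, curr in zip(rows, rows[1:]):
        if curr[2] - prev[2] > threshold:
            return prev
    return None


# Hypothetical threshold: an increase of more than 1 counts as abrupt.
cut = first_abrupt_jump(steps, threshold=1.0)
print(cut)
```

With this threshold the sketch returns step 16, where 4 clusters remain. A simple "largest jump" rule would instead pick the final step, because joining the last two clusters often produces the biggest increase. That is one reason to combine any such rule with practical knowledge of the data.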

For example, the following amalgamation table shows that the distance level increases by approximately 0.6 or less at each of the first 15 steps. However, between steps 16 and 17, when the number of clusters changes from 4 to 3, the distance level increases by more than 1 (from 1.81904 to 2.82229). These results indicate that 4 clusters may be sufficient for the final partition. If this grouping makes intuitive sense, then it is probably a good choice.

Amalgamation Steps

      Number of  Similarity  Distance  Clusters      New  Number of obs.
Step   clusters       level     level    joined  cluster  in new cluster
   1         19     96.6005   0.16275   13   16       13               2
   2         18     95.4642   0.21715   17   20       17               2
   3         17     95.2648   0.22669    6    9        6               2
   4         16     92.9178   0.33905   17   18       17               3
   5         15     90.5296   0.45339   11   15       11               2
   6         14     90.3124   0.46378   12   19       12               2
   7         13     88.2431   0.56285    2   14        2               2
   8         12     88.2431   0.56285    5    8        5               2
   9         11     85.9744   0.67146    6   10        6               3
  10         10     83.0639   0.81080    7   13        7               3
  11          9     83.0639   0.81080    1    3        1               2
  12          8     81.4039   0.89027    2   17        2               5
  13          7     79.8185   0.96617    6   11        6               5
  14          6     78.7534   1.01716    4   12        4               3
  15          5     66.2112   1.61760    2    5        2               7
  16          4     62.0036   1.81904    1    6        1               7
  17          3     41.0474   2.82229    1    4        1              10
  18          2     40.1718   2.86421    2    7        2              10
  19          1      0.0000   4.78739    1    2        1              20

Clusters joined

The two clusters that are joined to form a new cluster at each step in the amalgamation process.

New cluster

The identification number of the new cluster that is formed at each step in the amalgamation process. The identification number for the new cluster is always the smaller of the identification numbers of the two clusters that are joined. For example, if cluster 2 and cluster 9 are joined, then the new cluster that is formed is called cluster 2.

Number of observations in new cluster

The number of observations in each new cluster at each step in the amalgamation process. In the final step, all the observations are combined into a single cluster. Therefore, the number of observations in the new cluster for the last step equals the total number of observations in the data.

Note

For Cluster Variables, the number of observations is the number of variables in the new cluster.
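The naming rule for the new cluster and the running observation counts can be reproduced by replaying the "Clusters joined" column of the example table. The Python sketch below does so. The function `replay` is a hypothetical helper, and the starting count of 20 single-observation clusters is inferred from the 19 steps and the final cluster size in the table.

```python
# "Clusters joined" pairs from the example amalgamation table, in step order.
joined = [
    (13, 16), (17, 20), (6, 9), (17, 18), (11, 15), (12, 19), (2, 14),
    (5, 8), (6, 10), (7, 13), (1, 3), (2, 17), (6, 11), (4, 12),
    (2, 5), (1, 6), (1, 4), (2, 7), (1, 2),
]


def replay(joined_pairs, n_observations):
    """Return (new_cluster, number_of_obs) for each amalgamation step."""
    # Every observation starts as its own cluster of size 1.
    sizes = {i: 1 for i in range(1, n_observations + 1)}
    rows = []
    for a, b in joined_pairs:
        # The new cluster takes the smaller of the two identification numbers.
        new_id, absorbed = min(a, b), max(a, b)
        sizes[new_id] = sizes[new_id] + sizes.pop(absorbed)
        rows.append((new_id, sizes[new_id]))
    return rows


result = replay(joined, n_observations=20)
for step, (new_id, n_obs) in enumerate(result, start=1):
    print(step, new_id, n_obs)
```

The replayed values match the "New cluster" and "Number of obs. in new cluster" columns of the table, ending with cluster 1 holding all 20 observations.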