Interpret the key results for Cluster K-Means

Complete the following steps to interpret a cluster k-means analysis. Key output includes the observations and the variability measures for the clusters in the final partition.

Step 1: Examine the final groupings

Examine the final groupings to see whether the clusters in the final partition make intuitive sense, based on the initial partition you specified. Check that the number of observations in each cluster satisfies your grouping objectives. If one cluster contains too few or too many observations, you may want to re-run the analysis using another initial partition.

K-means Cluster Analysis: Clients, Rate of Return, Sales, Years

Method

Number of clusters      3
Standardized variables  Yes

Final Partition

           Number of     Within cluster   Average distance   Maximum distance
           observations  sum of squares   from centroid      from centroid
Cluster1    4             1.593           0.578              0.884
Cluster2    8             8.736           0.964              1.656
Cluster3   10            12.921           1.093              1.463

Cluster Centroids

Variable         Cluster1  Cluster2  Cluster3  Grand centroid
Clients            1.2318    0.5225   -0.9108          0.0000
Rate of Return     1.2942    0.2217   -0.6950          0.0000
Sales              1.1866    0.5157   -0.8872          0.0000
Years              1.2030    0.5479   -0.9195          0.0000

Distances Between Cluster Centroids

          Cluster1  Cluster2  Cluster3
Cluster1    0.0000    1.5915    4.1658
Cluster2    1.5915    0.0000    2.6488
Cluster3    4.1658    2.6488    0.0000

Key Results: Final partition

In these results, Minitab clusters data for 22 companies into 3 clusters based on the initial partition that was specified. Cluster 1 contains 4 observations and represents larger, established companies. Cluster 2 contains 8 observations and represents mid-growth companies. Cluster 3 contains 10 observations and represents young companies. A business analyst believes that these final groupings are adequate for the data.
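
As a cross-check on how well separated the groups are, the Distances Between Cluster Centroids table can be approximately reproduced from the Cluster Centroids table. The following Python sketch assumes the distances are plain Euclidean distances between the standardized centroids; the dictionary and function names are illustrative and are not part of Minitab. Because the printed centroids are rounded to four decimals, the recomputed values agree with the table only to about three decimal places.

```python
import math

# Centroids copied from the Cluster Centroids table
# (variable order: Clients, Rate of Return, Sales, Years).
centroids = {
    "Cluster1": (1.2318, 1.2942, 1.1866, 1.2030),
    "Cluster2": (0.5225, 0.2217, 0.5157, 0.5479),
    "Cluster3": (-0.9108, -0.6950, -0.8872, -0.9195),
}


def centroid_distance(a, b):
    """Euclidean distance between two named centroids (assumed metric)."""
    return math.dist(centroids[a], centroids[b])


pairs = [
    ("Cluster1", "Cluster2"),
    ("Cluster1", "Cluster3"),
    ("Cluster2", "Cluster3"),
]
for a, b in pairs:
    # Compare each printed value with the corresponding table entry.
    print(f"{a} to {b}: {centroid_distance(a, b):.4f}")
```

Clusters 1 and 3 are the farthest apart, which is consistent with their centroids lying on opposite sides of the grand centroid on all four variables.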

Note

To see which cluster each observation belongs to, you must enter a storage column when you perform the analysis. Minitab stores the cluster membership for each observation in a column in the worksheet.
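
Once the membership column is stored, it can be tallied outside Minitab to confirm the cluster sizes. The sketch below is a minimal Python illustration. The membership list is hypothetical: only the counts (4, 8, and 10) come from the Final Partition table, and the order of the observations in the real worksheet is not shown in this topic.

```python
from collections import Counter

# Hypothetical stored membership column: one cluster number per observation.
# Only the counts (4, 8, 10) are taken from the Final Partition table;
# the ordering here is made up for illustration.
membership = [1] * 4 + [2] * 8 + [3] * 10


def cluster_sizes(labels):
    """Return {cluster: (count, share of all observations)}."""
    counts = Counter(labels)
    total = len(labels)
    return {k: (n, n / total) for k, n in sorted(counts.items())}


sizes = cluster_sizes(membership)
for cluster, (n, share) in sizes.items():
    print(f"Cluster {cluster}: {n} observations ({share:.1%})")
```

A tally like this makes it easy to spot a cluster that holds too few or too many observations for your grouping objectives.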

Step 2: Assess the variability within each cluster

Examine the variability of the observations within each cluster, using the distance from centroid measures. Clusters with higher values exhibit greater variability of the observations within the cluster. If the difference in variability between clusters is too high, you may want to re-run the analysis using another initial partition.
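
To make the three variability measures concrete, the following Python sketch computes them for a small data set. It assumes the usual definitions: the within-cluster sum of squares is the sum of squared Euclidean distances from each observation to its cluster centroid, and the average and maximum distances are taken over those same distances. The function name and the toy data are hypothetical and are not Minitab output.

```python
import math


def cluster_variability(points, labels):
    """Per cluster: count, within-cluster sum of squares,
    average distance and maximum distance from the centroid."""
    groups = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)

    summary = {}
    for label, members in groups.items():
        n = len(members)
        # Centroid = mean of each variable over the cluster's members.
        centroid = tuple(sum(col) / n for col in zip(*members))
        dists = [math.dist(p, centroid) for p in members]
        summary[label] = {
            "n": n,
            "within_ss": sum(d * d for d in dists),
            "avg_dist": sum(dists) / n,
            "max_dist": max(dists),
        }
    return summary


# Hypothetical toy data: two variables, two clusters.
points = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0),
          (10.0, 10.0), (10.0, 14.0)]
labels = [1, 1, 1, 1, 2, 2]

summary = cluster_variability(points, labels)
for label, row in summary.items():
    print(label, row)
```

In this toy example both clusters have the same within-cluster sum of squares, but the second cluster has the larger average distance because its sum is spread over fewer observations.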

Key Results: Average distance from centroid

In these results, the average distance from centroid is lowest for Cluster 1 (0.578) and highest for Cluster 3 (1.093). This indicates that Cluster 1 has the least variability and Cluster 3 has the most variability. However, Cluster 1 has the fewest observations (4) and Cluster 3 has the most observations (10), which may partly explain the difference in variability.
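
Because the within-cluster sum of squares grows as observations are added, one way to compare clusters of different sizes is to divide it by the number of observations and take the square root. The Python sketch below does this with the values from the Final Partition table. This root-mean-square distance is an illustrative calculation, not a statistic that Minitab reports, and the names used are assumptions.

```python
import math

# Values copied from the Final Partition table.
final_partition = {
    "Cluster1": {"n": 4, "within_ss": 1.593, "avg_dist": 0.578},
    "Cluster2": {"n": 8, "within_ss": 8.736, "avg_dist": 0.964},
    "Cluster3": {"n": 10, "within_ss": 12.921, "avg_dist": 1.093},
}


def rms_distance(row):
    """Root-mean-square distance from centroid: sqrt(within SS / n)."""
    return math.sqrt(row["within_ss"] / row["n"])


rms = {name: rms_distance(row) for name, row in final_partition.items()}
for name, value in rms.items():
    avg = final_partition[name]["avg_dist"]
    print(f"{name}: RMS distance {value:.3f}, average distance {avg:.3f}")
```

The per-observation figures keep the same ordering as the average distance from centroid: Cluster 1 is the tightest and Cluster 3 is the most spread out.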
