Interpret the key results for Cluster K-Means

Complete the following steps to interpret a cluster k-means analysis. Key output includes the number of observations in each cluster and the variability measures for the clusters in the final partition.

In This Topic

Step 1: Examine the final groupings
Step 2: Assess the variability within each cluster

Step 1: Examine the final groupings

Examine the final groupings to see whether the clusters in the final partition make intuitive sense, based on the initial partition you specified. Check that the number of observations in each cluster satisfies your grouping objectives. If one cluster contains too few or too many observations, you may want to re-run the analysis using another initial partition.

Method

Number of clusters	3
Standardized variables	Yes

Final Partition

	Number of observations	Within cluster sum of squares	Average distance from centroid	Maximum distance from centroid
Cluster1	4	1.593	0.578	0.884
Cluster2	8	8.736	0.964	1.656
Cluster3	10	12.921	1.093	1.463

Cluster Centroids

Variable	Cluster1	Cluster2	Cluster3	Grand centroid
Clients	1.2318	0.5225	-0.9108	0.0000
Rate of Return	1.2942	0.2217	-0.6950	0.0000
Sales	1.1866	0.5157	-0.8872	0.0000
Years	1.2030	0.5479	-0.9195	0.0000

Distances Between Cluster Centroids

	Cluster1	Cluster2	Cluster3
Cluster1	0.0000	1.5915	4.1658
Cluster2	1.5915	0.0000	2.6488
Cluster3	4.1658	2.6488	0.0000

Key Results: Final partition

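In these results, Minitab clusters data for 22 companies into 3 clusters based on the initial partition that was specified. Cluster 1 contains 4 observations and represents larger, established companies: its centroid is more than 1 standard deviation above the grand centroid on all four standardized variables. Cluster 2 contains 8 observations and represents mid-growth companies, with centroid values moderately above the grand centroid. Cluster 3 contains 10 observations and represents young companies, with centroid values below the grand centroid on all four variables. A business analyst believes that these final groupings are adequate for the data.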
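
The Distances Between Cluster Centroids table can be checked by hand. The following Python sketch assumes that Minitab reports the ordinary Euclidean distance between the standardized centroids, and that the grand centroid is the average of the cluster centroids weighted by cluster size. The function names are illustrative and are not part of Minitab. Because the published centroids are rounded to four decimals, the computed values agree with the table only to about three decimal places.

```python
import math

# Standardized centroid coordinates copied from the Cluster Centroids table.
# Variable order: Clients, Rate of Return, Sales, Years.
CENTROIDS = {
    "Cluster1": (1.2318, 1.2942, 1.1866, 1.2030),
    "Cluster2": (0.5225, 0.2217, 0.5157, 0.5479),
    "Cluster3": (-0.9108, -0.6950, -0.8872, -0.9195),
}

# Number of observations in each cluster, from the Final Partition table.
SIZES = {"Cluster1": 4, "Cluster2": 8, "Cluster3": 10}


def centroid_distance(a, b):
    """Euclidean distance between two cluster centroids (assumed metric)."""
    return math.dist(CENTROIDS[a], CENTROIDS[b])


def grand_centroid():
    """Size-weighted average of the cluster centroids, one value per variable."""
    total = sum(SIZES.values())
    n_vars = len(next(iter(CENTROIDS.values())))
    return tuple(
        sum(SIZES[name] * CENTROIDS[name][i] for name in CENTROIDS) / total
        for i in range(n_vars)
    )


if __name__ == "__main__":
    for a, b in [("Cluster1", "Cluster2"), ("Cluster1", "Cluster3"), ("Cluster2", "Cluster3")]:
        print(a, b, round(centroid_distance(a, b), 3))
    print("Grand centroid:", [round(v, 3) for v in grand_centroid()])
```

Clusters 1 and 3 are the farthest apart, which is consistent with their interpretation as the most and least established groups of companies.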

Note

To see which cluster each observation belongs to, you must enter a storage column when you perform the analysis. Minitab stores the cluster membership for each observation in a column in the worksheet.
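
To see what the stored membership column represents, it can help to assign an observation to a cluster by hand. The sketch below assumes that, in the final partition, each observation belongs to the cluster whose centroid is nearest in Euclidean distance, and that the observation has been standardized in the same way as the analysis data. The company values are hypothetical and are used only for illustration; they do not come from the worksheet.

```python
import math

# Standardized centroid coordinates copied from the Cluster Centroids table.
# Variable order: Clients, Rate of Return, Sales, Years.
CENTROIDS = {
    "Cluster1": (1.2318, 1.2942, 1.1866, 1.2030),
    "Cluster2": (0.5225, 0.2217, 0.5157, 0.5479),
    "Cluster3": (-0.9108, -0.6950, -0.8872, -0.9195),
}


def nearest_cluster(observation):
    """Return the name of the cluster whose centroid is closest to the observation."""
    return min(CENTROIDS, key=lambda name: math.dist(observation, CENTROIDS[name]))


if __name__ == "__main__":
    # Hypothetical standardized company, one standard deviation above
    # the grand centroid on every variable (illustration only).
    hypothetical_company = (1.0, 1.0, 1.0, 1.0)
    print(nearest_cluster(hypothetical_company))
```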

Step 2: Assess the variability within each cluster

Examine the variability of the observations within each cluster, using the average and maximum distance from centroid. Clusters with higher values exhibit greater variability among their observations. If the difference in variability between clusters is too large for your grouping objectives, you may want to re-run the analysis using another initial partition.
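
The variability measures in the Final Partition table can be illustrated with a short Python sketch. It assumes that the within cluster sum of squares is the sum of squared Euclidean distances from each observation to its cluster centroid, and that the average and maximum distances are taken over those same distances. The function name and the example observations are hypothetical and are not Minitab output.

```python
import math


def cluster_variability(points):
    """Compute centroid-based variability measures for one cluster.

    points: list of equal-length tuples of standardized values.
    """
    n = len(points)
    dims = len(points[0])
    # The centroid is the mean of the observations on each variable.
    centroid = tuple(sum(p[i] for p in points) / n for i in range(dims))
    distances = [math.dist(p, centroid) for p in points]
    return {
        "n": n,
        "within_ss": sum(d * d for d in distances),
        "avg_distance": sum(distances) / n,
        "max_distance": max(distances),
    }


if __name__ == "__main__":
    # Hypothetical two-variable cluster, for illustration only.
    example = [(0.0, 0.0), (2.0, 0.0), (1.0, 1.0), (1.0, -1.0)]
    print(cluster_variability(example))
```

A tight cluster has small values on all three measures. A large maximum distance combined with a modest average distance suggests that one or two observations lie far from an otherwise compact cluster.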

Key Results: Average distance from centroid

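In these results, the average distance from centroid is lowest for Cluster 1 (0.578) and highest for Cluster 3 (1.093). This indicates that Cluster 1 has the least variability and Cluster 3 has the most variability. The maximum distance from centroid, however, is largest for Cluster 2 (1.656) rather than Cluster 3 (1.463), so the observation farthest from its own centroid belongs to Cluster 2. Cluster 1 also has the fewest observations (4) and Cluster 3 has the most observations (10), which may partly explain the difference in variability. Because the within cluster sum of squares is a total rather than an average, it grows with cluster size, so compare it across clusters with the number of observations in mind.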
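
One way to compare the within cluster sum of squares across clusters of different sizes is to divide it by the number of observations and take the square root. The sketch below does this with the values from the Final Partition table. It assumes that the within cluster sum of squares is the sum of squared Euclidean distances from the centroid. The resulting root mean square distance is not a statistic that Minitab reports; it is shown here only as a size-adjusted check.

```python
import math

# Values copied from the Final Partition table.
FINAL_PARTITION = {
    "Cluster1": {"n": 4, "within_ss": 1.593, "avg_distance": 0.578},
    "Cluster2": {"n": 8, "within_ss": 8.736, "avg_distance": 0.964},
    "Cluster3": {"n": 10, "within_ss": 12.921, "avg_distance": 1.093},
}


def rms_distance(name):
    """Root mean square distance from centroid: sqrt(within_ss / n)."""
    row = FINAL_PARTITION[name]
    return math.sqrt(row["within_ss"] / row["n"])


if __name__ == "__main__":
    for name, row in FINAL_PARTITION.items():
        print(name, "average:", row["avg_distance"], "RMS:", round(rms_distance(name), 3))
```

After adjusting for cluster size, the ordering is unchanged: Cluster 1 remains the tightest cluster and Cluster 3 the most spread out, so the difference in variability is not purely an effect of the number of observations.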