Cluster K-Means

K-means clustering begins with an initial grouping of the observations into a predefined number of clusters. Minitab then uses the following procedure to form the final clusters:

  • Minitab evaluates each observation, moving it into the nearest cluster. The nearest cluster is the one whose centroid has the smallest Euclidean distance to the observation.
  • When a cluster changes, by losing or gaining an observation, Minitab recalculates the cluster centroid.
  • This process is repeated until no more observations can be moved into a different cluster. At this point, all observations are in their nearest cluster according to the criterion listed above.
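The steps above can be sketched in a few lines of Python. This is only an illustration of the described procedure, not Minitab's actual implementation. The function names (`centroid`, `kmeans_from_partition`), the `max_passes` safety limit, and the rule that an observation is not moved if it is the last one in its cluster are assumptions made for this example.

```python
import math

def centroid(points):
    """Mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))

def kmeans_from_partition(data, labels, max_passes=100):
    """Refine an initial partition; returns (labels, centroids).

    Assumes labels are integers 0..k-1 and every cluster starts non-empty.
    """
    labels = list(labels)
    k = max(labels) + 1

    def members(c):
        return [p for p, l in zip(data, labels) if l == c]

    centroids = [centroid(members(c)) for c in range(k)]
    for _ in range(max_passes):          # hypothetical safety limit
        moved = False
        for i, p in enumerate(data):
            old = labels[i]
            dists = [math.dist(p, c) for c in centroids]
            new = dists.index(min(dists))
            # Move only to a strictly nearer centroid, and never empty a
            # cluster (assumption made for this sketch).
            if dists[new] < dists[old] and labels.count(old) > 1:
                labels[i] = new
                # Both affected clusters changed, so recalculate their centroids.
                centroids[old] = centroid(members(old))
                centroids[new] = centroid(members(new))
                moved = True
        if not moved:                    # every observation is in its nearest cluster
            break
    return labels, centroids

data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids = kmeans_from_partition(data, [0, 1, 0, 1, 0, 1])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

In this example the initial grouping mixes the two obvious groups, and the observations at (0, 1) and (10, 11) are moved during the first pass. The second pass moves nothing, so the procedure stops.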

Unlike hierarchical clustering of observations, two observations initially joined together by the cluster k-means procedure can later be split into separate clusters.

The k-means procedure works best when you provide good starting points for the clusters, because the final clusters depend on the initial grouping.
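A small, hedged illustration of why the starting points matter: with the four observations below, two different initial groupings both satisfy the stopping rule immediately, yet one leaves a much larger within-cluster sum of squares. The helper names (`mean`, `refine`, `within_ss`) and the data are hypothetical, and the code is a simplified sketch rather than Minitab's implementation.

```python
import math

def mean(points):
    """Centroid of a non-empty list of equal-length tuples."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def refine(data, labels, max_passes=100):
    """Move each observation to its nearest centroid until none moves."""
    labels = list(labels)
    k = max(labels) + 1
    for _ in range(max_passes):
        moved = False
        for i, p in enumerate(data):
            # Recompute all centroids from the current grouping.
            cents = [mean([q for q, l in zip(data, labels) if l == c])
                     for c in range(k)]
            dists = [math.dist(p, c) for c in cents]
            best = dists.index(min(dists))
            # Assumption: never move the last observation out of a cluster.
            if dists[best] < dists[labels[i]] and labels.count(labels[i]) > 1:
                labels[i] = best
                moved = True
        if not moved:
            break
    return labels

def within_ss(data, labels):
    """Total squared distance from each observation to its cluster centroid."""
    total = 0.0
    for c in set(labels):
        pts = [q for q, l in zip(data, labels) if l == c]
        m = mean(pts)
        total += sum(math.dist(q, m) ** 2 for q in pts)
    return total

# Corners of a wide rectangle: the natural clusters are left and right.
data = [(0, 0), (0, 1), (4, 0), (4, 1)]
poor = refine(data, [0, 1, 0, 1])  # start: bottom row vs. top row
good = refine(data, [0, 0, 1, 1])  # start: left side vs. right side

print(poor, within_ss(data, poor))  # [0, 1, 0, 1] 16.0
print(good, within_ss(data, good))  # [0, 0, 1, 1] 1.0
```

With the poor start, every observation is already closer to its own centroid than to the other one, so nothing moves and the procedure stops at a grouping that is far from the best available.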