A set of techniques for finding components that belong together.

Note: grouping describes how the human visual system perceives things as belonging together, whereas clustering is the algorithmic procedure that produces such groups.

We want to assign examples to “groups”. Common approaches:

  • K-means (most popular)
  • Density-based clustering
  • Ensemble clustering
    • Like a random forest, but the ensemble members vote on a clustering instead of a class
    • Naive voting is problematic because of the label switching problem: each initialisation can return the same clustering with permuted labels
      • Don’t vote on which specific label each cluster gets
      • Instead, vote on whether each pair of points is in the same cluster (label independent)
      • Then, derive the final labels after voting
  • Hierarchical clustering
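
The label-independent voting described under ensemble clustering can be sketched as follows. This is a minimal illustration, not a reference implementation: the helper names (`kmeans_1d`, `co_association`, `consensus_labels`), the toy 1-D data, the 0.5 vote threshold, and the use of connected components to produce the final labels are all assumptions made for the example. Each k-means run may number its clusters differently, so the runs vote only on whether two points share a cluster.

```python
import random


def kmeans_1d(points, k, rng, iters=20):
    """Tiny 1-D k-means (illustrative only); returns one label per point."""
    centers = rng.sample(points, k)  # random initialisation
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point takes the index of its nearest centre.
        labels = [min(range(k), key=lambda c: abs(p - centers[c])) for p in points]
        # Update step: move each centre to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centre if a cluster is empty
                centers[c] = sum(members) / len(members)
    return labels


def co_association(labelings):
    """Fraction of runs in which each pair of points lands in the same cluster."""
    n = len(labelings[0])
    counts = [[0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:  # label independent: only equality matters
                    counts[i][j] += 1
    runs = len(labelings)
    return [[c / runs for c in row] for row in counts]


def consensus_labels(votes, threshold=0.5):
    """Assumed final step: connected components of the 'majority agrees' graph."""
    n = len(votes)
    labels = [-1] * n
    next_label = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and votes[i][j] > threshold:
                    labels[j] = next_label
                    stack.append(j)
        next_label += 1
    return labels


# Toy data: two well-separated groups on a line.
points = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]

# Ten runs with different initialisations; cluster ids may be permuted between runs.
labelings = [kmeans_1d(points, k=2, rng=random.Random(seed)) for seed in range(10)]

votes = co_association(labelings)      # pairwise "same cluster" vote shares
consensus = consensus_labels(votes)    # labels are assigned only after voting

print(labelings)
print(consensus)
```

Because the vote matrix records only whether two points were grouped together, a run that calls the left group `0` and a run that calls it `1` contribute identical votes, which is what sidesteps the label switching problem. Other ways of turning the vote matrix into final labels (for example, running hierarchical clustering on it) would fit the same scheme.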