Hierarchical clustering produces a tree of clusterings
- Each node in the tree splits the data into 2 or more clusters.
- Gives much more information than a single fixed clustering.
- The leaves are often individual data points.
Often applied in phylogenetics
Agglomerative Clustering (Bottom up)
Requires a “distance” measure between two clusters.
Cluster distance measures
- Distance between the closest members of the two clusters C1 and C2: d(C1, C2) = min over x in C1, y in C2 of d(x, y). Also called single-link clustering.
- Distance between the farthest members of C1 and C2: d(C1, C2) = max over x in C1, y in C2 of d(x, y). Also called complete-link clustering.
- Average distance between members of C1 and C2: d(C1, C2) = (1 / (|C1||C2|)) * sum over x in C1, y in C2 of d(x, y). Also called group average clustering.
- Starts with each point in its own cluster.
- Each step merges the two “closest” clusters.
- Stops with one big cluster that has all points.
Naive implementation cost is O(n^3): there are n - 1 merge steps, and each step scans all O(n^2) pairs of clusters for the closest pair (plus O(n^2 d) to compute all pairwise distances up front).
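The steps above can be sketched directly. Below is a minimal naive single-link implementation (assuming NumPy; the function name is illustrative, not from any library):

```python
import numpy as np

def agglomerative_single_link(X):
    """Naive bottom-up clustering sketch: start with each point in its own
    cluster, then repeatedly merge the two clusters whose closest members
    are nearest (single-link). Returns the merges in order."""
    D = np.linalg.norm(X[:, None] - X[None], axis=2)  # all pairwise distances
    clusters = [[i] for i in range(len(X))]
    merges = []
    while len(clusters) > 1:
        best = (np.inf, 0, 1)
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single-link distance; swap in max(...) for complete-link,
                # or the mean for group average.
                d = min(D[a, b] for a in clusters[i] for b in clusters[j])
                if d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merges.append((list(clusters[i]), list(clusters[j])))
        clusters[i].extend(clusters[j])
        del clusters[j]
    return merges
```

On the 1-D points 0, 1, 10 this first merges {0} with {1} (distance 1), then merges {0, 1} with {10} (single-link distance 9), illustrating the O(n^2) pair scan inside each of the n - 1 merge steps.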
Divisive Clustering (Top-down)
Start with all examples in one cluster, then recursively divide (e.g., run k-means on a cluster, then run it again on each resulting cluster).
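A minimal sketch of this recursive splitting, assuming NumPy (the 2-means initialization here uses a farthest-point heuristic for stability; both function names are illustrative):

```python
import numpy as np

def two_means(X, iters=20, seed=0):
    """Split one cluster into two with basic 2-means (Lloyd's algorithm)."""
    rng = np.random.default_rng(seed)
    c0 = X[rng.integers(len(X))]
    c1 = X[np.linalg.norm(X - c0, axis=1).argmax()]  # farthest point from c0
    centers = np.stack([c0, c1]).astype(float)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        # Assign each point to its nearest center.
        d = np.linalg.norm(X[:, None] - centers[None], axis=2)
        labels = d.argmin(axis=1)
        # Recompute centers; keep the old center if a cluster empties.
        for k in range(2):
            if np.any(labels == k):
                centers[k] = X[labels == k].mean(axis=0)
    return labels

def divisive(X, idx=None, depth=2):
    """Recursively split clusters top-down; returns the leaf clusters
    (as arrays of example indices) after `depth` levels of splitting."""
    if idx is None:
        idx = np.arange(len(X))
    if depth == 0 or len(idx) < 2:
        return [idx]
    labels = two_means(X[idx])
    left, right = idx[labels == 0], idx[labels == 1]
    if len(left) == 0 or len(right) == 0:  # split failed; stop here
        return [idx]
    return divisive(X, left, depth - 1) + divisive(X, right, depth - 1)
```

Each level of recursion reruns k-means on the clusters produced by the previous level, so the recursion tree is exactly the hierarchy of clusterings.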
Biclustering
Cluster both the training examples and the features. Helps to figure out why things are clustered together.
- Run the clustering method on X to cluster the examples (rows).
- Run the clustering method on X^T to cluster the features (columns).
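As a sketch (assuming SciPy is available), the two runs can use the same hierarchical clustering on X and on X^T; the toy matrix below is made up for illustration:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Toy matrix: examples 0-2 are high on features 0-1,
# examples 3-5 are high on features 2-3.
X = np.array([[5, 5, 0, 0],
              [5, 5, 0, 0],
              [5, 5, 0, 0],
              [0, 0, 5, 5],
              [0, 0, 5, 5],
              [0, 0, 5, 5]], dtype=float)

# Cluster the examples (rows of X)...
row_labels = fcluster(linkage(X, method="average"), t=2, criterion="maxclust")
# ...and the features (rows of X^T, i.e., columns of X).
col_labels = fcluster(linkage(X.T, method="average"), t=2, criterion="maxclust")
```

Reading the two label vectors together shows which block of features drives each block of examples, which is the "why" behind the example clusters.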
A dendrogram describes the hierarchy of clusters generated by the clustering methods.
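For example (assuming SciPy), the dendrogram can be built from the merge table that agglomerative clustering produces:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram

X = np.array([[0.0], [1.0], [10.0], [11.0]])
Z = linkage(X, method="single")   # (n-1) x 4 table: one row per merge
# no_plot=True returns the tree layout (e.g., leaf order) without drawing it;
# passing Z to dendrogram() with a plotting backend draws the usual tree.
info = dendrogram(Z, no_plot=True)
```

Each row of Z records which two clusters merged and at what distance, so cutting the dendrogram at any height recovers one clustering from the hierarchy.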