jzhao.xyz

Recent Writing

2024: Centering
Dec 23, 2024
Taste is a guide for what is worthwhile
Jan 14, 2024
Agentic Computing
Nov 29, 2022
Building a BFT JSON CRDT
Nov 16, 2022

See 21 more →

Recent Notes

TrueTime
May 26, 2025
Concurrency control
May 26, 2025

See 735 more →

Density-based clustering

Sep 28, 20221 min read

seed
CPSC340

Clusters are defined by “dense” regions.
Examples in non-dense regions don’t get clustered
Clusters can be non-convex

It is non-parametric (there is no fixed number of clusters $k$ )

DBSCAN

Main idea: merge all neighbouring core points to form clusters

Defines

Epsilon ( $ϵ$ ): distance we use to decide if another point is a “neighbour”
MinNeighbours: number of neighbours needed to say a region is “dense”
- If you have at least minNeighbours “neighbours”, you are called a “core” point

Pseudocode;

For each example $x_{i}$ :
- If $x_{i}$ is already assigned to a cluster, do nothing.
- Test whether $x_{i}$ is a ‘core’ point ( $\geq$ minNeighbours examples within $ϵ$ ).
  - If $x_{i}$ is not core point, do nothing (this could be an outlier).
  - If $x_{i}$ is a core point, make a new cluster and call the “expand cluster” function.
    - Assign to this cluster all $x_{j}$ within distance $ϵ$ of core point $x_{i}$ to this cluster.
    - For each new “core” point found, call “expand cluster” (recursively).

Issues

Ambiguity of “non-core” boundary points
Sensitive to the choice of $ϵ$ and minNeighbours

Recent Writing

2024: Centering
Dec 23, 2024
Taste is a guide for what is worthwhile
Jan 14, 2024
Agentic Computing
Nov 29, 2022
Building a BFT JSON CRDT
Nov 16, 2022

See 21 more →

Recent Notes

TrueTime
May 26, 2025
Concurrency control
May 26, 2025

See 735 more →

Graph View

Backlinks

Clustering

Created with Quartz v4.5.1 © 2025

GitHub
Twitter