- Clusters are defined by “dense” regions.
- Examples in non-dense regions don’t get clustered
- Clusters can be non-convex

It is **non-parametric** (there is no fixed number of clusters $k$)

## DBSCAN

Main idea: merge all neighbouring core points to form clusters

Defines

- Epsilon ($ϵ$): distance we use to decide if another point is a “neighbour”
- MinNeighbours: number of neighbours needed to say a region is “dense”
- If you have at least minNeighbours “neighbours”, you are called a “core” point

Pseudocode;

- For each example $x_{i}$ :
- If $x_{i}$ is already assigned to a cluster, do nothing.
- Test whether $x_{i}$ is a ‘core’ point ($≥$ minNeighbours examples within $ϵ$).
- If $x_{i}$ is not core point, do nothing (this could be an outlier).
- If $x_{i}$ is a core point, make a new cluster and call the “expand cluster” function.
- Assign to this cluster all $x_{j}$ within distance $ϵ$ of core point $x_{i}$ to this cluster.
- For each new “core” point found, call “expand cluster” (recursively).

Issues

- Ambiguity of “non-core” boundary points
- Sensitive to the choice of $ϵ$ and minNeighbours