
# k-Nearest Neighbours (KNN)

Last updated Sep 23, 2022

To classify a new example, we find the $k$ training examples closest to it and predict the mode (most common label) of those $k$ examples' labels.

This works on the assumption that examples with similar features are likely to have similar labels.

As $k$ grows, training error increases (with $k = 1$, training error is zero, since each training example is its own nearest neighbour) and approximation error decreases.

We measure distance using a norm of the difference between feature vectors. The most common choice is the L2 norm (Euclidean norm): $\|x_i - \tilde{x}\|_2 = \sqrt{\sum_{j=1}^d (x_{ij} - \tilde{x}_j)^2}$
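The procedure above can be sketched in a few lines of plain Python. This is a minimal illustration, not an efficient implementation; the function name `knn_predict` and the toy data are made up for this example, and ties in the mode are broken arbitrarily.

```python
from collections import Counter
import math

def knn_predict(X_train, y_train, x_new, k=3):
    # L2 (Euclidean) distance from the new example to every training example.
    dists = [math.dist(x, x_new) for x in X_train]
    # Indices of the k nearest training examples.
    nearest = sorted(range(len(dists)), key=lambda i: dists[i])[:k]
    # Predict the mode of the k nearest labels.
    labels = [y_train[i] for i in nearest]
    return Counter(labels).most_common(1)[0][0]

X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
y = ["a", "a", "a", "b", "b", "b"]
print(knn_predict(X, y, (0.5, 0.5), k=3))  # → a
```

Note that all the work happens at prediction time; "training" is just storing `X` and `y`.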

## Performance

- $O(1)$ training (we just store the training data)
- $O(nd)$ per prediction (an $O(d)$ distance calculation for each of the $n$ training examples)
- $O(nd)$ space to store the $n$ training examples ($d$ features each) in memory
- KNN is non-parametric: the model grows with the number of training examples

KNN can perform poorly in high dimensions (see: curse of dimensionality).
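One way to see why: as the dimension grows, distances from a query point to random points concentrate, so the nearest neighbour is barely closer than the farthest one and "nearest" carries little signal. A small sketch of this effect, using random points in the unit hypercube (the function `distance_spread` and its parameters are invented for this illustration):

```python
import math
import random

def distance_spread(d, n=1000, seed=0):
    # Relative gap between the farthest and nearest of n random points
    # to a random query point in the d-dimensional unit hypercube.
    rng = random.Random(seed)
    query = [rng.random() for _ in range(d)]
    dists = [math.dist(query, [rng.random() for _ in range(d)])
             for _ in range(n)]
    return (max(dists) - min(dists)) / min(dists)

# The relative spread shrinks as the dimension grows.
for d in (2, 10, 100, 1000):
    print(d, round(distance_spread(d), 3))
```

In low dimensions the nearest point is much closer than the farthest; in high dimensions the two are nearly the same distance away.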