To classify an example, we find the examples closest to the example and take the mode of the examples.

Works based off of the assumption that similar features are likely to have similar labels

Effects on fundamental tradeoff:

  • As grows, training error increases and approximation error decreases.
  • As grows, model complexity increases

We measure distance using the “norm” between feature vectors. The most common norm is the L2-Norm or Euclidean Norm


  • training (just relies on training data)
  • predictions ( distance calculations for all examples)
  • space to store each training example in memory
    • This is non-parametric

KNN can suck in high dimensions (see: curse of dimensionality)