Binary Classification

Set $y_{i} = + 1$ for one class (“important”)
Set $y_{i} = - 1$ for the other class (“not important”)
To predict, we look at whether $w^{T} x_{i}$ is closer to +1 or -1
- $\hat{y}_{i} = \text{sign}(w^{T} x_{i})$
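A minimal sketch of the prediction rule above, in plain Python. The helper name `predict`, the example weights, and the choice to map a score of exactly 0 to +1 are assumptions for illustration; the notes do not specify a tie rule.

```python
def predict(w, x):
    """Return +1 or -1 according to the sign of the dot product w^T x."""
    score = sum(wj * xj for wj, xj in zip(w, x))
    # Assumption: a score of exactly 0 is assigned to the +1 class.
    return 1 if score >= 0 else -1


w = [2.0, -1.0]                  # hypothetical weight vector
print(predict(w, [1.0, 0.5]))    # score = 2.0 - 0.5 = 1.5
print(predict(w, [0.0, 3.0]))    # score = -3.0
```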

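Least squares error may overpenalize: it charges a large error to predictions that have the correct sign but lie far from the target value. The only thing we care about is the sign of $w^{T} x_{i}$, not how far the prediction is from the decision boundary.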
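To make the overpenalizing concrete, here is a small sketch comparing squared error on three hypothetical scores $w^{T} x_{i}$ for an example with $y_{i} = +1$. The scores are made up for illustration. A confidently correct score of 5 ends up with a larger squared error than a misclassified score of -0.5.

```python
def squared_error(y, score):
    """Squared error between the raw score w^T x and the label y."""
    return (score - y) ** 2


y = 1
for score in (5.0, 0.5, -0.5):   # hypothetical values of w^T x
    correct = (score >= 0) == (y > 0)
    print(f"score={score:+.1f}  squared_error={squared_error(y, score):.2f}  correct_sign={correct}")
```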

Could we instead minimize the number of classification errors? This is called the 0-1 loss function: you either get the classification wrong (1) or right (0).

$L (i, j) = \begin{cases} 0 & i = j \\ 1 & i \neq j \end{cases}$

Illustration above is if $y_{i} = 1$ . Flip for $y_{i} = - 1$
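As a rough sketch, the 0-1 loss summed over a dataset is just a count of misclassified examples. The function name `zero_one_loss` and the sample labels are assumptions for illustration.

```python
def zero_one_loss(y_true, y_pred):
    """Count the examples whose predicted label differs from the true label."""
    return sum(1 for yt, yp in zip(y_true, y_pred) if yt != yp)


y_true = [1, -1, 1, -1]   # hypothetical labels
y_pred = [1, 1, 1, -1]    # hypothetical predictions; only the second one is wrong
print(zero_one_loss(y_true, y_pred))
```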

Unfortunately, 0-1 Loss is non-convex. We can, once again, use a convex approximation which is called the Hinge loss:

$L (i, j) = \max (0, 1 - y_{i} w^{T} x_{i})$
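A short sketch of the hinge loss for a single example, where `score` stands for $w^{T} x_{i}$. The sample scores are hypothetical. The loss is zero once $y_{i} w^{T} x_{i} \geq 1$, and grows linearly as the score moves to the wrong side.

```python
def hinge_loss(y, score):
    """Hinge loss max(0, 1 - y * score) for one example."""
    return max(0.0, 1.0 - y * score)


y = 1
for score in (2.0, 0.5, -1.0):   # hypothetical values of w^T x
    print(score, hinge_loss(y, score))
```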

Perceptron

Only works for linearly-separable data

Searches for a $w$ such that $sign (w^{T} x_{i}) = y_{i}, \forall i$
Intuition is that you search for the ledge
Start with $w^{0} = 0$
Classify each example until we reach a mistake
- Then, update $w$ to $w^{t + 1} = w^{t} + y_{i} x_{i}$
If a perfect classifier exists, this algorithm finds one in a finite number of steps
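The steps above can be sketched as follows. This is a minimal illustration, not a definitive implementation. The toy data, the constant bias feature appended to each $x_{i}$, the `max_epochs` cap, and treating a score of exactly 0 as a mistake (so that the start $w^{0} = 0$ triggers an update) are all assumptions.

```python
def dot(w, x):
    """Dot product w^T x for two equal-length lists."""
    return sum(wj * xj for wj, xj in zip(w, x))


def perceptron(X, y, max_epochs=1000):
    """Sweep over the data, applying w <- w + y_i * x_i on each mistake.

    Returns a weight vector that classifies every example correctly,
    or None if max_epochs (a hypothetical safety cap) is reached first.
    """
    w = [0.0] * len(X[0])                      # start with w^0 = 0
    for _ in range(max_epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            # Assumption: a score of exactly 0 counts as a mistake.
            if yi * dot(w, xi) <= 0:
                w = [wj + yi * xj for wj, xj in zip(w, xi)]
                mistakes += 1
        if mistakes == 0:                      # a full pass with no errors
            return w
    return None


# Hypothetical linearly separable data; the last feature is a constant 1 (bias).
X = [[1.0, 2.0, 1.0], [2.0, 3.0, 1.0], [-1.0, -2.0, 1.0], [-2.0, -1.0, 1.0]]
y = [1, 1, -1, -1]

w = perceptron(X, y)
print(w)
```

The cap on epochs matters because the finite-step guarantee only holds for linearly separable data; on other data the update loop would never stop on its own.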
