Bias

Bias: a slant or preference

“We use the term bias to refer to computer systems that systematically and unfairly discriminate against certain individuals or groups of individuals in favour of others… A system discriminates unfairly if it denies an opportunity or a good or if it assigns an undesirable outcome to an individual or group of individuals on grounds that are unreasonable or inappropriate” (Friedman and Nissenbaum, 1996)

See the more detailed post on Bias in AI. Related readings: Design Justice, To Live in their Utopia, Social Bias in Information Retrieval, Algorithms of Oppression, and data distributions.

CAPTCHAs

How do you distinguish between human and non-human without discriminating against certain groups of people (e.g. by ethnicity or cultural background)? How does one prove their humanity without betraying anything else about themselves?

“What is the universal human quality that can be demonstrated to a machine, but that no machine can mimic? What is it to be human?”

“You need something that’s easy for an average human, it shouldn’t be bound to a specific subgroup of people, and it should be hard for computers at the same time. That’s very limiting in what you can actually do. And it has to be something that a human can do fast, and isn’t too annoying.”

Is there a possibility of reverse CAPTCHAs, where you can only pass if you get it wrong in the ‘right’ way (e.g. optical illusions)?

3 groups of study

From Design Justice and Friedman and Nissenbaum (1996):

Preexisting Bias: bias that exists in broader society, culture, and/or institutions is reproduced in the computer system, either intentionally or unintentionally, by systems developers. (e.g. notions of quality and authority bias embedded in the web content itself)
Technical Bias: some underlying aspect of the technology reproduces bias (e.g. the design of crawling, aggregation, and surfacing algorithms for content, or of ranking features)
Emergent Bias: a system may not have been biased given its original context of use or original user base, but comes to exhibit bias when the context shifts or when new users arrive (e.g. responses to spam, content moderation, search suggestions)

Cathy O’Neil: algorithms are “opinions embedded in code” — artifacts do indeed have politics

Baeza-Yates

Activity Bias: who contributes to the data? who is seen by these algorithms?
Data Bias: is the underlying data biased/non-representative?
Sampling Bias: what data is used by algorithms?
Algorithmic Bias: what gets shown to users?
Interaction Bias: how do people use the algorithms?
Self-selection Bias: who uses these algorithms?
Second-order Bias: digital trace data, how do our data-residues
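
As a rough illustration of activity bias, the sketch below compares what an algorithm "sees" when it averages over content with what it would see if each person counted once. The users, post counts, and opinion scores are entirely hypothetical and chosen only to make the gap visible. The function names are likewise assumptions, not anything from Baeza-Yates.

```python
from statistics import mean

# Hypothetical data: (user_id, number_of_posts, opinion_score on a 1-5 scale).
# One very active user holds a different opinion from nine quiet users.
users = [("power_user", 91, 5)] + [(f"user_{i}", 1, 2) for i in range(9)]


def content_weighted_mean(users):
    """Average opinion as seen by an algorithm that reads every post."""
    total_posts = sum(posts for _, posts, _ in users)
    return sum(posts * score for _, posts, score in users) / total_posts


def user_weighted_mean(users):
    """Average opinion when each person counts exactly once."""
    return mean(score for _, _, score in users)


print(f"content-weighted: {content_weighted_mean(users):.2f}")
print(f"user-weighted:    {user_weighted_mean(users):.2f}")
```

In this made-up population the content-weighted average sits close to the single active user's opinion, while the per-person average sits close to everyone else's. Which of the two is the "right" number depends on the question being asked; the point is only that who contributes shapes what the data says.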

Forbidden Base Rates

Discussed by Tamar Gendler; the term comes from Tetlock et al. (2000)

We do not live in perfectly egalitarian societies, and race, gender, class and other identities can significantly affect how our lives work out.

Now suppose you’re at a reception for engineers and their spouses, and you’re introduced to a male–female couple about whom you know next to nothing. Odds are, he’s the engineer. But if you have anti-sexist instincts, you may feel pulled towards keeping an entirely open mind about which of these two strangers is the engineer, rather than allowing your statistical knowledge to incline you towards the man. If you do ‘slip’ into assuming the man to be the engineer, and this turns out to be a mistake, you’re likely to be more embarrassed than you would be had you wrongly assumed the couple to live in the local area, on the grounds that most guests at the reception live locally.
