The types of AI bias

Historical bias in training data: AI models learn from historical data that reflects historical discrimination. If historical lending data shows lower approval rates for certain demographic groups — not because those groups are less creditworthy but because they faced discriminatory lending practices — an AI trained on that data will learn to reproduce those patterns. The model is not explicitly discriminating; it is faithfully learning the patterns in the data. But those patterns encode historical injustice, and the model perpetuates it.

Representation bias: training datasets that underrepresent certain groups produce models that perform worse for those groups. Facial recognition systems trained primarily on light-skinned male faces perform significantly worse on dark-skinned female faces. Medical AI trained on datasets skewed toward particular demographics produces less accurate diagnoses for underrepresented groups. Representation bias is particularly acute for AI serving populations that have historically been excluded from research and data collection.
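
One way to surface representation bias is to break evaluation accuracy down by group instead of reporting a single overall figure. The following is a minimal sketch; the record layout, the group labels "A" and "B", and the counts are hypothetical and chosen only to illustrate the pattern.

```python
from collections import defaultdict

def accuracy_by_group(records):
    """Return {group: accuracy} for (group, y_true, y_pred) records."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, y_true, y_pred in records:
        total[group] += 1
        correct[group] += int(y_true == y_pred)
    return {g: correct[g] / total[g] for g in total}

# Hypothetical evaluation set: group "A" is well represented, group "B" is not.
records = (
    [("A", 1, 1)] * 45 + [("A", 0, 0)] * 45 + [("A", 1, 0)] * 10
    + [("B", 1, 1)] * 3 + [("B", 0, 0)] * 3 + [("B", 1, 0)] * 4
)

overall = sum(t == p for _, t, p in records) / len(records)
per_group = accuracy_by_group(records)

print(f"overall accuracy: {overall:.2f}")
for group, acc in sorted(per_group.items()):
    print(f"group {group}: accuracy {acc:.2f}")
```

In this toy data the overall accuracy is about 0.87, which hides the fact that accuracy is 0.90 for group A and only 0.60 for group B. The small group barely moves the headline number, which is why the disaggregated view matters.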

Proxy variable bias: AI models that do not explicitly use protected characteristics may use proxy variables that are correlated with those characteristics. Postcode can be a proxy for race in markets with residential segregation. Educational institution can be a proxy for socioeconomic background. Even without any intention to discriminate, a model that uses these variables will reproduce the disparate outcomes they proxy. Proxy variable bias is particularly difficult to detect because the model appears neutral — it is only by testing outcomes across demographic groups that the discrimination becomes visible.
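
The outcome test described above can be sketched in a few lines. This is an illustration under stated assumptions, not a production audit: the postcodes "N1" and "S9", the groups "X" and "Y", the applicant counts, and the `approve` rule are all hypothetical. The point is that the decision rule never sees the group label, yet approval rates still differ sharply by group.

```python
from collections import defaultdict

# Hypothetical applicants: (postcode, demographic_group). The group label is
# held out for auditing only; the decision rule below never sees it.
applicants = (
    [("N1", "X")] * 80 + [("S9", "X")] * 20
    + [("N1", "Y")] * 30 + [("S9", "Y")] * 70
)

def approve(postcode):
    """Toy decision rule that uses postcode alone (an assumption for illustration)."""
    return postcode == "N1"

def approval_rate_by_group(rows):
    """Return {group: share of applicants approved}."""
    approved = defaultdict(int)
    total = defaultdict(int)
    for postcode, group in rows:
        total[group] += 1
        approved[group] += int(approve(postcode))
    return {g: approved[g] / total[g] for g in total}

rates = approval_rate_by_group(applicants)
# Ratio of the lowest to the highest approval rate; 1.0 would mean parity.
ratio = min(rates.values()) / max(rates.values())

for group, rate in sorted(rates.items()):
    print(f"group {group}: approval rate {rate:.2f}")
print(f"lowest/highest approval ratio: {ratio:.3f}")
```

Here group X is approved 80% of the time and group Y 30% of the time, a ratio of 0.375, even though the rule looks neutral on paper. Running this kind of audit requires access to the protected attribute for testing purposes, even when the model itself is barred from using it.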

Feedback loop bias: AI systems that learn from their own outputs create feedback loops that can amplify initial biases. A recidivism prediction model that predicts high risk for certain groups influences parole decisions; the resulting incarceration is recorded as if it confirmed the prediction, and people denied parole never get the chance to show that they would not have reoffended; retrained on that data, the model becomes more confident in its (biased) predictions. Feedback loop bias is particularly concerning for AI systems that continuously update from production data.
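
A deliberately simplified simulation can make the loop concrete. Everything here is an assumption made for illustration: both groups share the same true reoffence rate, the "model" is just the recorded reoffence rate per group, parole is denied whenever that score exceeds a threshold, and the retraining pipeline logs every denial as a confirmed high-risk case. Real systems are far more complex, but the dynamic is the same in kind.

```python
TRUE_RATE = 0.30   # same true reoffence rate in both groups (assumption)
THRESHOLD = 0.40   # parole is denied when the group's score exceeds this
COHORT = 100       # cases per group in each retraining round

def simulate(initial_counts, rounds=5):
    """initial_counts: {group: (recorded_reoffences, total_cases)}.

    Returns a list of {group: score} dicts, one per round, starting with
    the scores implied by the initial data.
    """
    counts = {g: list(c) for g, c in initial_counts.items()}
    history = []
    for _ in range(rounds + 1):
        # "Training": the score is the recorded reoffence rate so far.
        scores = {g: pos / total for g, (pos, total) in counts.items()}
        history.append(scores)
        for g, score in scores.items():
            if score > THRESHOLD:
                # Parole denied. Assumption: every denial is logged as a
                # confirmed high-risk case, since no counterfactual is seen.
                new_positives = COHORT
            else:
                # Parole granted; outcomes are observed at the true rate.
                new_positives = round(TRUE_RATE * COHORT)
            counts[g][0] += new_positives
            counts[g][1] += COHORT
    return history

# Hypothetical starting data: group B's recorded rate is inflated by past
# practice, although its true rate equals group A's.
history = simulate({"A": (30, 100), "B": (45, 100)})

for round_no, scores in enumerate(history):
    print(f"round {round_no}: A={scores['A']:.3f}  B={scores['B']:.3f}")
```

Under these assumptions group A's score stays at 0.30, while group B's rises from 0.45 to roughly 0.91 after five rounds of retraining. The initial gap was a data artefact, and the loop turns it into apparent certainty.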

The fairness metric decision

Choosing a fairness metric is a values decision disguised as a technical one. The most commonly used metrics are: demographic parity (equal positive outcome rates across groups, for example equal loan approval rates regardless of demographic group); equalised odds (equal true positive and false positive rates across groups, so that creditworthy applicants are equally likely to be approved and non-creditworthy applicants are equally likely to be wrongly approved); and calibration (among people who receive a given score, the probability of the outcome is the same regardless of group: if 40% of applicants with a particular risk score default, that should hold in every group). These metrics generally cannot all be satisfied simultaneously. There are mathematical proofs showing that when the underlying outcome rates differ between groups and the model is not a perfect predictor, satisfying one fairness metric makes it impossible to satisfy another. Which metric to optimise for is therefore a decision about which dimension of fairness the organisation prioritises.
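
The tension between the metrics can be seen on a small worked example. The sketch below assumes binary labels (1 means the applicant would repay) and binary decisions (1 means approved); the groups, counts, and the `group_metrics` helper are hypothetical. Because the decision is binary, calibration is approximated here by positive predictive value, the repayment rate among those approved, which is a simplification of calibration over a full score range.

```python
from collections import Counter

def group_metrics(records):
    """records: iterable of (group, y_true, y_pred) with 0/1 values.

    Assumes every group has at least one case in each denominator below.
    """
    cells = Counter(records)
    groups = sorted({g for g, _, _ in cells})
    out = {}
    for g in groups:
        tp, fn = cells[(g, 1, 1)], cells[(g, 1, 0)]
        fp, tn = cells[(g, 0, 1)], cells[(g, 0, 0)]
        out[g] = {
            # Demographic parity compares this across groups.
            "selection_rate": (tp + fp) / (tp + fp + fn + tn),
            # Equalised odds compares both of these across groups.
            "tpr": tp / (tp + fn),
            "fpr": fp / (fp + tn),
            # Binary stand-in for calibration: outcome rate among the approved.
            "ppv": tp / (tp + fp),
        }
    return out

# Hypothetical data: 50% of group A would repay, but only 20% of group B.
records = (
    [("A", 1, 1)] * 40 + [("A", 1, 0)] * 10
    + [("A", 0, 1)] * 10 + [("A", 0, 0)] * 40
    + [("B", 1, 1)] * 16 + [("B", 1, 0)] * 4
    + [("B", 0, 1)] * 16 + [("B", 0, 0)] * 64
)

metrics = group_metrics(records)
for group, m in metrics.items():
    print(group, {name: round(value, 2) for name, value in m.items()})
```

In this data equalised odds holds exactly: both groups have a true positive rate of 0.80 and a false positive rate of 0.20. Yet approval rates are 0.50 against 0.32, so demographic parity fails, and the repayment rate among approved applicants is 0.80 against 0.50, so the calibration stand-in fails as well. Adjusting the decisions to repair either of those would break the equal error rates.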