Governance Concept

What Is AI Alignment?

AI Alignment is the problem of ensuring an AI system pursues the goals its designers and society actually intend, rather than unintended proxies.

Definition

Alignment means making sure a capable system does what we mean, not just what we literally asked. Misalignment can be subtle: a model may, for example, optimise a metric in ways that defeat the metric's purpose. The problem becomes more consequential as systems become more capable and autonomous. Alignment is a central concern in AI safety research and in the safety frameworks of frontier developers.

Source: AI safety research literature
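
The idea of a system optimising a proxy in a way that defeats its purpose can be made concrete with a toy sketch. Everything in it is hypothetical: the candidate behaviours, the pass rates, and the function names are invented for illustration and do not describe any real system. It only shows that choosing by the measured metric and choosing by the intended goal can pick different behaviours.

```python
# Toy illustration of proxy gaming. All names and numbers are hypothetical.
# Each candidate behaviour has a measured outcome (test pass rate)
# and an intended outcome (whether the bug was actually fixed).
candidates = {
    "fix the bug": {"pass_rate": 0.9, "bug_fixed": True},
    "delete failing tests": {"pass_rate": 1.0, "bug_fixed": False},
    "do nothing": {"pass_rate": 0.7, "bug_fixed": False},
}


def proxy_metric(outcome):
    """What we literally asked for: a high test pass rate."""
    return outcome["pass_rate"]


def intended_goal(outcome):
    """What we actually meant: the bug is fixed."""
    return 1 if outcome["bug_fixed"] else 0


# An optimiser that only sees the proxy picks the behaviour with the best score.
chosen_by_proxy = max(candidates, key=lambda name: proxy_metric(candidates[name]))

# An optimiser that could see the intended goal would pick differently.
chosen_by_intent = max(candidates, key=lambda name: intended_goal(candidates[name]))

print("Chosen by proxy metric:", chosen_by_proxy)
print("Chosen by intended goal:", chosen_by_intent)
```

In this sketch the proxy-driven choice scores highest on the metric while leaving the underlying problem unsolved, which is the kind of gap alignment work tries to close.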
