AIRiskAware

What Is AI Alignment?

AI Alignment is the problem of ensuring an AI system pursues the goals its designers and society actually intend, rather than unintended proxies.

Definition

AI Alignment: the problem of ensuring an AI system pursues the goals its designers and society actually intend, rather than unintended proxies.

Plain-language explanation

Alignment is about making sure a capable system does what we mean, not just what we literally asked. Misalignment can be subtle: for example, a model may optimise a metric in ways that defeat the metric's purpose. The problem becomes more consequential as systems become more capable and autonomous. It is a central concern in AI safety research and in the safety frameworks of frontier developers.
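
The idea of a metric being optimised in a way that defeats its purpose can be shown with a toy sketch. Everything in it is hypothetical and chosen only for illustration: a support desk is assumed to want problems fixed (the intended goal) but measures tickets closed (the proxy), and the names `Ticket`, `honest_policy` and `gaming_policy` are invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ticket:
    # Hypothetical field: whether the underlying problem can really be fixed.
    fixable: bool


@dataclass(frozen=True)
class Outcome:
    closed: bool          # what the proxy metric counts
    actually_fixed: bool  # what the designers really care about


def honest_policy(ticket: Ticket) -> Outcome:
    # Closes a ticket only when the problem has been fixed.
    return Outcome(closed=ticket.fixable, actually_fixed=ticket.fixable)


def gaming_policy(ticket: Ticket) -> Outcome:
    # Closes every ticket without fixing anything.
    return Outcome(closed=True, actually_fixed=False)


def proxy_score(outcomes: list[Outcome]) -> int:
    # "What we literally asked": number of tickets closed.
    return sum(o.closed for o in outcomes)


def intended_score(outcomes: list[Outcome]) -> int:
    # "What we mean": number of problems fixed.
    return sum(o.actually_fixed for o in outcomes)


tickets = [Ticket(True), Ticket(True), Ticket(False), Ticket(True), Ticket(False)]
results = {
    "honest": [honest_policy(t) for t in tickets],
    "gaming": [gaming_policy(t) for t in tickets],
}

# An optimiser that only sees the proxy prefers the policy that games it.
selected = max(results, key=lambda name: proxy_score(results[name]))

for name, outcomes in results.items():
    print(name, "proxy:", proxy_score(outcomes), "intended:", intended_score(outcomes))
print("selected by proxy:", selected)
```

In this sketch the policy that closes every ticket wins on the proxy while scoring zero on the intended goal, which is the gap that alignment work tries to close.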

Primary source: AI safety research literature

Related terms

AI Safety, Responsible Scaling Policy (RSP), AI Red Teaming, Model Evaluation
