What Is Reinforcement Learning?

Definition

Reinforcement learning: a machine-learning paradigm in which an agent learns to make decisions by taking actions in an environment and receiving rewards or penalties, with the aim of maximising its cumulative reward over time.

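Unlike supervised learning, which learns from labelled examples, reinforcement learning learns from trial-and-error feedback: the agent is never told the correct action, only how well its chosen action turned out. It underpins reinforcement learning from human feedback (RLHF), a technique used to align large language models with human preferences. From a governance perspective, reward design is a key risk: a poorly specified reward can produce unintended or unsafe behaviour, because the agent optimises the reward as written rather than the outcome its designers intended ("reward hacking").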

Source: ISO/IEC 22989:2022 (AI concepts and terminology)
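
To make the trial-and-error idea concrete, here is a minimal sketch of an agent learning from rewards alone. It is not drawn from the ISO standard. It assumes a toy "two-armed bandit" environment and an epsilon-greedy strategy, and the function name run_bandit, the reward probabilities, and the parameter values are all hypothetical choices made for illustration.

```python
import random


def run_bandit(reward_probs, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a toy Bernoulli bandit (illustrative only).

    reward_probs[i] is the assumed chance that action i pays a reward of 1.
    The agent never sees these numbers; it only sees the rewards it receives.
    """
    rng = random.Random(seed)
    n_actions = len(reward_probs)
    counts = [0] * n_actions      # how often each action has been tried
    values = [0.0] * n_actions    # running estimate of each action's mean reward
    total_reward = 0.0

    for _ in range(steps):
        # Explore a random action with probability epsilon,
        # otherwise exploit the action with the best current estimate.
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)
        else:
            action = max(range(n_actions), key=lambda a: values[a])

        # The environment responds with a reward of 1 or 0.
        # There is no label saying which action was "correct".
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0

        # Incremental update of the mean reward observed for this action.
        counts[action] += 1
        values[action] += (reward - values[action]) / counts[action]
        total_reward += reward

    return values, counts, total_reward


if __name__ == "__main__":
    values, counts, total = run_bandit([0.2, 0.8])
    print("estimated values:", [round(v, 2) for v in values])
    print("times each action was chosen:", counts)
    print("total reward:", total)
```

In this sketch the agent ends up choosing the second action far more often, simply because that action pays off more frequently. The same mechanism explains the governance concern: the agent pursues whatever the reward signal pays for, so if reward_probs encoded the wrong objective, the agent would learn that wrong objective just as reliably.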
