What Are Guardrails?

Guardrails are technical and procedural controls that constrain an AI system's behaviour to keep its outputs and actions within acceptable bounds.

Definition

AI guardrails operate at multiple layers: input filtering (blocking harmful prompts), output filtering (blocking harmful responses), system prompts (instructions constraining behaviour), and runtime monitoring. For agentic AI, guardrails also include action constraints — limiting what real-world actions the system can take without human approval. Guardrails are a defence-in-depth measure, not a guarantee; they can be bypassed through prompt injection and jailbreaking, which is why they form one layer of a broader control framework.
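
The layers described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Guardrails` class, the deny-list terms, and the action names are hypothetical and do not come from NIST AI 600-1 or the OWASP LLM Top 10. Plain keyword matching is used only to keep the example short; it is easy to bypass, which is consistent with the point that guardrails are one layer among several. The system-prompt layer is omitted because it lives in the model configuration rather than in wrapper code.

```python
from dataclasses import dataclass, field

# Hypothetical deny-lists; real systems usually rely on classifiers or policy engines.
BLOCKED_INPUT_TERMS = ("ignore previous instructions",)
BLOCKED_OUTPUT_TERMS = ("internal-api-key",)
# Hypothetical agent actions that must not run without human sign-off.
ACTIONS_REQUIRING_APPROVAL = {"send_payment", "delete_records"}


@dataclass
class GuardrailResult:
    allowed: bool
    layer: str        # which layer made the decision
    reason: str = ""  # empty when the check passed


@dataclass
class Guardrails:
    # Runtime monitoring: every decision is appended here for later review.
    audit_log: list = field(default_factory=list)

    def _record(self, result: GuardrailResult) -> GuardrailResult:
        self.audit_log.append(result)
        return result

    def _scan(self, text: str, terms: tuple, layer: str) -> GuardrailResult:
        lowered = text.lower()
        for term in terms:
            if term in lowered:
                return self._record(GuardrailResult(False, layer, f"matched {term!r}"))
        return self._record(GuardrailResult(True, layer))

    def check_input(self, prompt: str) -> GuardrailResult:
        """Input filtering: block prompts that match a deny-list term."""
        return self._scan(prompt, BLOCKED_INPUT_TERMS, "input")

    def check_output(self, response: str) -> GuardrailResult:
        """Output filtering: block responses that match a deny-list term."""
        return self._scan(response, BLOCKED_OUTPUT_TERMS, "output")

    def check_action(self, action: str, human_approved: bool = False) -> GuardrailResult:
        """Action constraint: sensitive actions need explicit human approval."""
        if action in ACTIONS_REQUIRING_APPROVAL and not human_approved:
            return self._record(GuardrailResult(False, "action", "human approval required"))
        return self._record(GuardrailResult(True, "action"))
```

In use, a caller would run `check_input` before sending a prompt to the model, `check_output` before returning the response, and `check_action` before an agent executes a tool call, then review `audit_log` as part of runtime monitoring.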

Source: NIST AI 600-1; OWASP LLM Top 10
