AI Incident Response: What to Do When Your AI System Fails or Causes Harm

AI systems fail differently from conventional software: through systematic bias, model drift, and hallucination. When they fail, the response has legal, regulatory, and reputational dimensions that standard incident response playbooks do not address.

Key Takeaways

AI incidents can affect thousands of people at once through systematic bias, can be hard to detect when model drift is gradual, and can have retrospective implications for decisions already made with a flawed system.
EU AI Act Article 73 requires providers of high-risk AI to report serious incidents to national market surveillance authorities within 15 days (general), within 10 days (death), or within 2 days (widespread infringement or critical infrastructure disruption). Deployers that identify a serious incident must inform the provider and the relevant authorities under Article 26(5).
APRA CPS 230 requires notification of material operational risk incidents within 72 hours; disruptions to critical operations outside tolerance within 24 hours. GDPR requires DPA notification within 72 hours where AI incidents involve personal data breaches.
The first 24 hours should cover containment (limiting further harm), evidence preservation (logs, model versions), an initial scope assessment (individuals affected, nature of harm), and notification decisions.
Retrospective remediation — addressing harm from decisions made before the failure was identified — requires planning before an incident occurs. Design AI systems to be remediable.
After an incident involving high-risk AI, the EU AI Act requires a documented investigation, root cause analysis, corrective action, and an updated conformity assessment. Regulators expect evidence that incidents produce systemic learning.

"For informational purposes only. This article does not constitute legal, regulatory, financial, or professional advice. For specific advice, please consult a qualified professional."

What makes AI incidents different from conventional IT incidents

AI systems fail in ways that conventional IT systems do not, and incident response frameworks designed for IT outages or cybersecurity breaches are not adequate without modification. Three characteristics of AI failures distinguish them from conventional incidents.

First, AI failures are often statistical and gradual rather than binary and immediate. A conventional system either works or it does not. An AI model can degrade progressively — producing increasingly biased or inaccurate outputs over time due to data drift, distributional shift, or model decay — without any single obvious failure event. By the time the failure is identified, it may have affected thousands or millions of decisions. The identification problem is therefore primary: organisations must detect AI failures before they cause significant harm, not just respond after a clear incident has occurred.

Second, AI failures can be caused by inputs rather than system malfunction. Prompt injection attacks, adversarial inputs designed to manipulate model outputs, and data poisoning in training pipelines are attack vectors that have no close equivalent in conventional IT systems. An AI incident response framework must account for the possibility that an apparent system failure is actually a deliberate attack on the AI layer.

Third, the effects of AI failures can be difficult to reverse. A credit decision made by a biased model, a medical recommendation that caused harm, or a hire that was influenced by a discriminatory algorithm cannot simply be rolled back like a database transaction. Incident response for AI must include remediation planning for the downstream effects of decisions made before the failure was identified — not just restoration of system function.

Regulatory notification obligations triggered by AI incidents

Several regulatory frameworks require notification of AI-related incidents to regulators, affected individuals, or both. The notification obligations that apply depend on the jurisdiction and the nature of the incident.

Australia — APRA CPS 230. Material operational risk incidents at APRA-regulated entities must be notified to APRA within 72 hours of the entity becoming aware. Disruptions to critical operations that fall outside the entity's tolerance levels must be notified within 24 hours. An AI system producing systematically wrong outputs in a critical operation (credit, claims, fraud detection) is an operational risk incident that may trigger these obligations. APRA's April 2026 letter specifically listed AI system failures among the attack pathways and risk categories that an entity's incident management framework must address.

Australia — Notifiable Data Breaches. If an AI incident results in unauthorised access to or disclosure of personal information held by an APP entity, and it is likely to result in serious harm to affected individuals, it triggers the Notifiable Data Breach (NDB) scheme under Part IIIC of the Privacy Act 1988. Notification must be made to the OAIC and to affected individuals as soon as practicable. An AI system breach that exposes training data, inference inputs, or stored personal information is a potential NDB trigger.

European Union — EU AI Act. Under Article 73, providers of high-risk AI systems must report serious incidents to the market surveillance authority of the Member State where the incident occurred. The report is due immediately once a causal link between the system and the incident is established or reasonably likely, and in any event within 15 days of becoming aware for general serious incidents; within 10 days where the incident results in a person's death; and within 2 days for widespread infringement or risks to critical infrastructure. Deployers that identify a serious incident must immediately inform the provider first, and then the importer or distributor and the relevant market surveillance authorities (Article 26(5)). A serious incident is one that results in death, serious health harm, property damage, or significant disruption to essential services.

European Union — GDPR. Where an AI incident constitutes a personal data breach — including unauthorised access to inputs sent to an AI model, or exposure of training data — notification to the supervisory authority is required within 72 hours of becoming aware (Article 33 GDPR). If the breach is likely to result in a high risk to individuals, the individuals themselves must also be notified without undue delay (Article 34).
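
Once the moment of awareness is recorded, the fixed timelines above reduce to date arithmetic. The following is a minimal sketch that assumes the windows quoted in this article. The trigger labels and the `notification_deadlines` helper are hypothetical, and whether a trigger actually applies remains a legal determination the code does not make. The NDB scheme's "as soon as practicable" standard has no fixed window and is deliberately left out.

```python
from datetime import datetime, timedelta, timezone

# Notification windows as quoted in this article. The keys are
# illustrative labels, not official regulatory terms.
WINDOWS = {
    "cps230_material_incident": timedelta(hours=72),
    "cps230_critical_ops_outside_tolerance": timedelta(hours=24),
    "gdpr_personal_data_breach": timedelta(hours=72),
    "eu_ai_act_serious_incident": timedelta(days=15),
    "eu_ai_act_death": timedelta(days=10),
    "eu_ai_act_widespread_or_critical_infra": timedelta(days=2),
}


def notification_deadlines(aware_at, triggers):
    """Return (trigger, deadline) pairs, earliest deadline first."""
    if aware_at.tzinfo is None:
        raise ValueError("aware_at must be timezone-aware")
    unknown = set(triggers) - set(WINDOWS)
    if unknown:
        raise KeyError(f"unknown triggers: {sorted(unknown)}")
    pairs = [(t, aware_at + WINDOWS[t]) for t in triggers]
    return sorted(pairs, key=lambda pair: pair[1])
```

Sorting by deadline puts the tightest obligation first, which is usually the one that drives the response timeline when several regimes apply at once.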

A practical AI incident response framework

An effective AI incident response framework has five phases: detection, classification, containment, investigation, and remediation. Each requires AI-specific adaptations to conventional incident response procedures.

Detection. AI incidents must be detected before they cause significant harm. Detection mechanisms should include: real-time performance monitoring (accuracy, confidence scores, output distributions) with automated alerting on deviation from baseline; user feedback channels specifically designed to surface AI errors and unexpected outputs; regular output sampling and human review on a rolling basis; and adversarial testing including prompt injection probing for LLM-based systems. APRA's April 2026 letter found that many entities relied on point-in-time assurance methods that are inadequate for detecting gradual AI degradation.
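
One way to alert on deviation from baseline is to compare the current distribution of model scores with a reference window. The sketch below uses the Population Stability Index (PSI), a common drift statistic that this article does not itself prescribe. It assumes scores lie in [0, 1]; the 0.2 alert threshold is a widely used rule of thumb, not a regulatory figure, and the function names are hypothetical.

```python
import math


def population_stability_index(baseline, current, bins=10):
    """PSI between two samples of model scores in [0, 1]."""
    if not baseline or not current:
        raise ValueError("both samples must be non-empty")

    def proportions(sample):
        counts = [0] * bins
        for x in sample:
            # Clamp so out-of-range scores fall into the edge bins.
            idx = max(0, min(int(x * bins), bins - 1))
            counts[idx] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(sample), 1e-6) for c in counts]

    p = proportions(baseline)
    q = proportions(current)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


def drift_alert(baseline, current, threshold=0.2):
    """True when PSI exceeds the assumed alerting threshold."""
    return population_stability_index(baseline, current) > threshold
```

Running this on a rolling window (for example, each day's scores against the validation-time baseline) turns point-in-time assurance into continuous monitoring.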

Classification. When a potential AI incident is identified, it must be classified by severity and regulatory notification obligation. Classification criteria should cover: the number of decisions or individuals affected; the nature and reversibility of harm; whether the failure involves personal information; whether it affects a critical operation; and whether it may qualify as a serious incident under applicable regulation. Classification determines the response timeline and escalation path.
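
The classification criteria above can be captured as a small, auditable rule set. This is a sketch only: the severity labels, the 1,000-person threshold, and the follow-up names are assumptions chosen for illustration, and a real scheme would be set by the organisation's risk and legal functions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IncidentFacts:
    individuals_affected: int
    harm_reversible: bool
    involves_personal_info: bool
    affects_critical_operation: bool
    possible_serious_incident: bool  # under applicable regulation


def classify(facts, mass_threshold=1000):
    """Return (severity, follow_ups). Thresholds are illustrative."""
    follow_ups = []
    if facts.involves_personal_info:
        follow_ups.append("privacy/data-breach assessment")
    if facts.affects_critical_operation:
        follow_ups.append("operational-risk notification assessment")
    if facts.possible_serious_incident:
        follow_ups.append("serious-incident reporting assessment")

    if facts.possible_serious_incident or (
        facts.affects_critical_operation and not facts.harm_reversible
    ):
        severity = "SEV1"
    elif (
        facts.individuals_affected >= mass_threshold
        or facts.affects_critical_operation
        or not facts.harm_reversible
    ):
        severity = "SEV2"
    else:
        severity = "SEV3"
    return severity, follow_ups
```

Keeping severity and notification follow-ups as separate outputs reflects that a low-severity incident can still trigger a reporting assessment.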

Containment. Containment for AI incidents differs from IT incident containment. Options include: suspension of the affected AI system and reversion to manual processes; filtering of outputs from the affected system pending investigation; rate-limiting or scoping of the system to lower-risk use cases while investigation continues; and isolation of the system from downstream data flows. Every AI system deployed in a material process should have a documented, tested containment procedure — including a designated authority to execute it.
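
Suspension with reversion to manual processing can be built into the decision path in advance. The sketch below is one hypothetical design: `ContainmentSwitch`, its method names, and the actor identifiers are all illustrative. It shows the two properties the paragraph asks for, namely a fallback route and a designated authority to trigger it.

```python
import threading


class ContainmentSwitch:
    """Routes requests to a manual fallback while the model is suspended."""

    def __init__(self, model_fn, manual_fn, authorised):
        self._model_fn = model_fn
        self._manual_fn = manual_fn
        self._authorised = frozenset(authorised)
        self._suspended = False
        self._lock = threading.Lock()
        self.audit_log = []  # (actor, action, reason) tuples

    def suspend(self, actor, reason):
        self._set(actor, True, reason)

    def resume(self, actor, reason):
        self._set(actor, False, reason)

    def _set(self, actor, suspended, reason):
        # Only designated actors may change containment state.
        if actor not in self._authorised:
            raise PermissionError(f"{actor} may not change containment state")
        with self._lock:
            self._suspended = suspended
            action = "suspend" if suspended else "resume"
            self.audit_log.append((actor, action, reason))

    def decide(self, request):
        with self._lock:
            suspended = self._suspended
        handler = self._manual_fn if suspended else self._model_fn
        return handler(request)
```

Because every state change is appended to `audit_log`, the switch also produces evidence of when containment began and who authorised it.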

Investigation. AI incident investigation must determine root cause, scope, and downstream effect. This includes: identifying the triggering event (model drift, data quality failure, adversarial input, configuration change, or vendor model update); quantifying the affected decision population and time window; assessing whether affected decisions caused harm to individuals or to the organisation; and determining whether regulatory notification obligations have been triggered and at what threshold.
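
Quantifying the affected population is straightforward if each decision was logged with the model version and a timestamp. The sketch below assumes such a log exists; the `Decision` record, its field names, and the use of "declined" as the adverse outcome are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Decision:
    decision_id: str
    subject_id: str
    model_version: str
    decided_at: datetime
    outcome: str


def affected_population(log, bad_versions, start, end):
    """Summarise decisions by a suspect model version in [start, end)."""
    hits = [
        d for d in log
        if d.model_version in bad_versions and start <= d.decided_at < end
    ]
    return {
        "decisions": len(hits),
        "individuals": len({d.subject_id for d in hits}),
        # "declined" stands in for whatever counts as adverse here.
        "adverse": sum(1 for d in hits if d.outcome == "declined"),
        "decision_ids": [d.decision_id for d in hits],
    }
```

Counting individuals separately from decisions matters because notification thresholds are usually framed in terms of people, while remediation work is done decision by decision.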

Remediation. Remediation has two components: technical remediation (retraining, reconfiguration, or replacement of the affected system) and downstream remediation (addressing the effects of decisions made during the failure period). Downstream remediation may include reviewing and overriding automated decisions, contacting and compensating affected individuals, and reporting to regulators. Post-incident, a formal lessons-learned review should update detection mechanisms, containment procedures, and classification criteria.
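
Downstream remediation can start by re-scoring the affected decisions with the corrected system and queuing those whose outcome would change. This is a sketch under a strong assumption: the original inputs were stored at decision time. The record layout and the `remediation_queue` helper are hypothetical.

```python
def remediation_queue(affected, corrected_model):
    """Return review items for decisions whose outcome would change."""
    queue = []
    for record in affected:
        new_outcome = corrected_model(record["inputs"])
        if new_outcome != record["outcome"]:
            queue.append({
                "decision_id": record["decision_id"],
                "original": record["outcome"],
                "rescored": new_outcome,
                # Never auto-apply: a person confirms each override.
                "status": "pending_human_review",
            })
    return queue
```

The queue holds candidates for override rather than applying them, so that a human reviewer decides what to reverse, whom to contact, and what compensation is due.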
