What Is Model Drift? Why It Happens and Why It Matters for AI Governance

Model drift is the degradation of an AI model's performance over time as the world changes. It is one of the most common causes of AI governance failure in production — and most organisations have no monitoring for it.

Key Takeaways

Model drift occurs when an AI model's real-world performance degrades over time because the statistical distribution of real-world data diverges from the training data distribution. This is not a bug — it is an inevitable consequence of deploying AI in a changing world.
There are three types of drift: data drift (the inputs to the model change), concept drift (the relationship between inputs and the correct output changes), and label drift (the distribution of the target variable changes). All three can degrade performance; concept drift is typically the most severe.
Model drift can cause governance failures without triggering obvious alerts — a model may continue to produce outputs within technical tolerance while producing increasingly biased, inaccurate, or unfair results in production.
EU AI Act Article 72 requires providers of high-risk AI systems to establish a post-market monitoring system, and Article 9 requires that the risk management system be informed by post-market data. For high-risk systems, monitoring for model drift is therefore part of compliance, not merely a best practice.
The practical monitoring programme for model drift: define performance metrics at deployment, set thresholds that trigger review, monitor in production continuously, review performance at defined intervals, and establish a retraining or replacement protocol when drift is detected.

"Nur zu Informationszwecken. Dieser Artikel stellt keine rechtliche, regulatorische, finanzielle oder professionelle Beratung dar. Konsultieren Sie einen qualifizierten Spezialisten für spezifische Beratung."

What is model drift and why does it matter for AI governance?

Model drift is the degradation of an AI model's accuracy and reliability over time as real-world conditions diverge from those it was trained on. There are three types: data drift (input distributions shift), concept drift (the relationship between inputs and outputs changes), and label drift (the distribution of the target variable shifts). Model drift is one of the most common causes of AI governance failure in production, and most organisations have no monitoring for it.

Model drift is not a failure of the model — it is an expected, inevitable property of any model deployed in a changing environment. The governance question is not whether drift will occur, but whether you will detect it before it causes harm.

Types of drift

Data drift (covariate shift). The statistical properties of the input data change. Customer demographics shift, product mix changes, market conditions evolve. The model now receives inputs that differ from those it was trained on.

Concept drift. The relationship between inputs and outputs changes. What constituted fraud in 2024 looks different in 2026. Customer satisfaction drivers evolve. The "right answer" the model should produce has changed even though the inputs look similar.

Label drift. The target variable's distribution changes. Default rates shift across economic cycles. Claim frequencies change. The outcome the model predicts becomes more or less common.
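The three types can also be stated in terms of probability distributions. The notation below is an assumption introduced for illustration, not taken from any regulation: X denotes the model inputs, Y the target, P_0 the distribution at training time, and P_t the distribution at a later time t. In practice the types often occur together, so treat this as a simplified sketch.

```latex
\begin{align*}
\text{Data drift (covariate shift):}\quad & P_t(X) \neq P_0(X), \qquad P_t(Y \mid X) = P_0(Y \mid X) \\
\text{Concept drift:}\quad & P_t(Y \mid X) \neq P_0(Y \mid X) \\
\text{Label drift:}\quad & P_t(Y) \neq P_0(Y)
\end{align*}
```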

Why governance must address drift

Regulatory expectations now explicitly include ongoing monitoring. APRA's 30 April 2026 letter specified "continuous validation" rather than point-in-time assurance. The Federal Reserve's SR 26-2 (17 April 2026) replaces annual revalidation with risk-based monitoring. The ECB's July 2025 Supervisory Guide on Internal Models extends model risk management (MRM) expectations, including drift monitoring, to all models, including AI/ML. ISO/IEC 42001 expects ongoing performance management. The EU AI Act requires post-market monitoring for high-risk AI systems.

Undetected drift creates escalating risk: degrading accuracy leads to worse decisions, which leads to customer harm, regulatory exposure, and financial loss — all while the model continues to operate and the organisation believes it's performing as designed.

Detecting drift

Statistical monitoring. Track distributional characteristics of inputs and outputs over time. Population Stability Index (PSI), Kolmogorov-Smirnov tests, and Jensen-Shannon divergence are common statistical measures. Set thresholds that trigger review when exceeded.
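As an illustration of the statistical monitoring described above, here is a minimal sketch of a Population Stability Index calculation in plain Python. The function name `psi`, the choice of ten baseline-quantile bins, and the `eps` floor for empty bins are all assumptions made for this example; production implementations vary in how they bin and smooth.

```python
import math
from bisect import bisect_right


def psi(baseline, current, n_bins=10, eps=1e-6):
    """Population Stability Index of `current` relative to `baseline`.

    Bin edges are baseline quantiles, so each bin holds roughly an equal
    share of the baseline sample.
    """
    ordered = sorted(baseline)
    # Interior cut points at the 1/n_bins, 2/n_bins, ... quantiles of the baseline.
    edges = [ordered[(len(ordered) * k) // n_bins] for k in range(1, n_bins)]

    def shares(values):
        counts = [0] * n_bins
        for v in values:
            counts[bisect_right(edges, v)] += 1
        # eps keeps empty bins from producing log(0) or division by zero.
        return [max(c / len(values), eps) for c in counts]

    expected, actual = shares(baseline), shares(current)
    return sum((a - e) * math.log(a / e) for a, e in zip(actual, expected))


# Synthetic data for illustration only.
baseline = [i / 100 for i in range(1000)]
shifted = [x + 3 for x in baseline]
print(psi(baseline, baseline))  # identical samples
print(psi(baseline, shifted))   # inputs shifted upward
```

A widely quoted rule of thumb treats a PSI below 0.1 as stable, 0.1 to 0.25 as worth a closer look, and above 0.25 as a significant shift. These cut-offs are conventions rather than regulatory requirements, and review thresholds should be set per model.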

Performance monitoring. Track the model's prediction accuracy against actual outcomes. Accuracy, precision, recall, F1 score, AUC-ROC for classification; MAE, RMSE for regression. Compare current performance against the baseline established at validation.
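The comparison against a validation baseline can be sketched as follows. This is a minimal example for a binary classifier; the function names, the baseline figures, and the 0.05 tolerance are hypothetical values chosen for illustration, not recommended settings.

```python
def classification_metrics(y_true, y_pred):
    """Precision, recall and F1 for binary labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def breaches(baseline, current, max_drop=0.05):
    """Names of metrics that fell more than `max_drop` below the baseline."""
    return sorted(m for m in baseline if baseline[m] - current[m] > max_drop)


# Hypothetical baseline recorded at validation time.
baseline_metrics = {"precision": 0.90, "recall": 0.85, "f1": 0.87}

# Hypothetical recent production outcomes and the model's predictions for them.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]

current_metrics = classification_metrics(y_true, y_pred)
print(current_metrics)
print(breaches(baseline_metrics, current_metrics))
```

Note that this kind of check depends on actual outcomes being available, which in some domains arrive weeks or months after the prediction.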

Demographic monitoring. Track performance across demographic groups separately. Drift may affect some groups more than others — a credit model that remains accurate overall but degrades for a specific demographic group creates bias and regulatory exposure.
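To make the point concrete, the sketch below computes accuracy separately for each group and flags any group that lags the overall figure. The group labels, the data, and the 0.10 gap threshold are invented for illustration, and accuracy is only one of several metrics that could be compared this way.

```python
from collections import defaultdict


def accuracy_by_group(groups, y_true, y_pred):
    """Accuracy computed separately for each group label."""
    hits, totals = defaultdict(int), defaultdict(int)
    for g, t, p in zip(groups, y_true, y_pred):
        totals[g] += 1
        hits[g] += int(t == p)
    return {g: hits[g] / totals[g] for g in totals}


def lagging_groups(per_group, overall, max_gap=0.10):
    """Groups whose accuracy is more than `max_gap` below the overall accuracy."""
    return sorted(g for g, acc in per_group.items() if overall - acc > max_gap)


# Hypothetical data: the model is right for every member of group A
# and wrong for every member of the smaller group B.
groups = ["A"] * 8 + ["B"] * 2
y_true = [1] * 10
y_pred = [1] * 8 + [0] * 2

overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
per_group = accuracy_by_group(groups, y_true, y_pred)
print(overall, per_group, lagging_groups(per_group, overall))
```

Here the overall accuracy looks acceptable while one group is served badly, which is the pattern an aggregate-only dashboard would miss.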

Responding to drift

When drift is detected: assess whether it materially affects decisions; retrain the model on current data if appropriate; revalidate against updated benchmarks; document the drift event, assessment, and remediation in the AI risk register; report to the AI governance committee and board if the drift affected a high-risk system or produced adverse outcomes.
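The response steps above could be encoded as a simple checklist generator. This is a sketch under stated assumptions: the `DriftEvent` fields and the `response_steps` function are hypothetical names, and tying retraining and revalidation to the materiality assessment is one possible reading of "if appropriate", not a prescribed rule.

```python
from dataclasses import dataclass


@dataclass
class DriftEvent:
    # All field names are hypothetical; adapt them to your own risk register.
    model_id: str
    materially_affects_decisions: bool
    high_risk_system: bool
    adverse_outcomes: bool


def response_steps(event):
    """Ordered list of follow-up actions for a detected drift event."""
    steps = ["assess whether the drift materially affects decisions"]
    if event.materially_affects_decisions:
        # Assumption: retraining and revalidation follow only from material drift.
        steps.append("retrain the model on current data if appropriate")
        steps.append("revalidate against updated benchmarks")
    steps.append("document the event, assessment and remediation in the AI risk register")
    if event.high_risk_system or event.adverse_outcomes:
        steps.append("report to the AI governance committee and board")
    return steps


# Hypothetical event for illustration.
event = DriftEvent("credit-scoring-v3", True, False, True)
for step in response_steps(event):
    print(step)
```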