Systemic Risk Classification
How GPAI models are classified as presenting systemic risk.
Systemic Risk Classification (Article 51)
Learning Objectives
By the end of this chapter, you will be able to:
- Define "systemic risk" under the AI Act framework
- Apply the computational threshold (10^25 FLOPS) to classify GPAI models
- Identify factors that may trigger Commission designation
- Execute notification obligations to the AI Office
- Monitor models for approaching systemic risk thresholds
- Understand the consequences of systemic risk classification
Article 51 introduces a critical distinction within the GPAI framework: models presenting "systemic risk" face significantly enhanced obligations. This tiered approach reflects the EU's recognition that the most capable AI models may pose risks at a societal scale.
Understanding Systemic Risk
Article 51(1) Classification Criteria
A GPAI model is classified as presenting systemic risk if it meets either of two conditions:
- Article 51(1)(a): It has high-impact capabilities evaluated on the basis of appropriate technical tools and methodologies, including indicators and benchmarks
- Article 51(1)(b): The Commission designates it as such, based on criteria equivalent to those set out in point (a), in accordance with Annex XIII, either ex officio or following a qualified alert from the scientific panel pursuant to Article 90(1)(a)
⚠️ Note: Article 51(1) itself does not define "systemic risk" by listing risk categories. The concept of high-impact capabilities and their potential effects is elaborated in Annex XIII and supporting Recitals, not in Article 51(1) directly.
Annex XIII criteria for assessing high-impact capabilities include:
| Criterion | Assessment Factors |
|---|---|
| Number of parameters | Model size and complexity |
| Quality and size of dataset | Breadth, depth, and curation of training data |
| Input/output modalities | Text, image, audio, video, code capabilities |
| Benchmarks and evaluations | Performance on capability evaluations |
| Reach and scale | Number of users, integrations, deployments |
| High-impact capabilities | Assessed dangerous or transformative capabilities |
"High-Impact Capabilities" Interpretation
| Indicator | Assessment Criteria |
|---|---|
| Scale of deployment | Millions of users, broad integration |
| Capability breadth | Wide range of sophisticated tasks |
| Reasoning ability | Complex multi-step reasoning |
| Autonomous action | Ability to act with minimal human direction |
| Knowledge synthesis | Combining information across domains |
| Code generation | Creating functional software, exploits |
| Persuasion | Sophisticated content generation |
| Multimodality | Integration of text, image, audio, video |
The 10^25 FLOPS Threshold (Article 51(2))
Computational Threshold Rule
Article 51(2) establishes: A GPAI model shall be presumed to have high-impact capabilities when the cumulative amount of computation used for its training measured in floating point operations (FLOPs) is greater than 10^25.
Understanding the Threshold
| Scale Reference | FLOPS | Status |
|---|---|---|
| Small models (7B parameters) | ~10^22 | Well below threshold |
| Medium models (70B parameters) | ~10^23-10^24 | Below threshold |
| Large models (2024 frontier) | ~10^24-10^25 | Near threshold |
| Systemic risk threshold | 10^25 | Presumption applies above this value |
| Future frontier models | >10^25 | Clearly above threshold |
What Counts as "Training Compute"?
| Included | Not Included |
|---|---|
| Initial pre-training | Inference at deployment |
| Fine-tuning (if substantial) | Minor adaptation/prompting |
| RLHF training | Downstream provider fine-tuning |
| Multi-stage training | Evaluation and testing |
| Distillation (from scratch) | User interactions |
Calculating Training Compute
Standard Estimation Formula:
Training FLOPS ≈ 6 × Number of Parameters × Number of Training Tokens
Example Calculation:
- 70B parameter model trained on 2T tokens
- 6 × 70×10^9 × 2×10^12 = 8.4×10^23 FLOPS
- Below 10^25 threshold
💡 Expert Note: The 6× multiplier accounts for forward and backward passes plus optimizer operations. Actual compute may vary based on architecture and training approach.
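The estimation formula and worked example above can be sketched in code. This is a minimal illustration, not an official calculation method; the function names and the 10^24 "approaching" alert level are assumptions for this sketch (the latter mirrors the internal monitoring level suggested later in this chapter).

```python
# Sketch of the standard training-compute estimate: FLOPS ~ 6 * N * D.
# Threshold from Article 51(2); the "approaching" alert level is an assumption.

SYSTEMIC_RISK_THRESHOLD = 1e25  # Article 51(2): presumption above 10^25 FLOPS
APPROACH_ALERT = 1e24           # illustrative internal early-warning level

def estimate_training_flops(parameters: float, training_tokens: float) -> float:
    """Approximate cumulative training compute: ~6 FLOPS per parameter per
    token (roughly 2 for the forward pass, 4 for backward pass and update)."""
    return 6 * parameters * training_tokens

def classify(flops: float) -> str:
    """Map an estimated compute figure to a rough compliance status."""
    if flops > SYSTEMIC_RISK_THRESHOLD:
        return "presumed systemic risk (notify under Article 52(1))"
    if flops >= APPROACH_ALERT:
        return "approaching threshold (begin Article 55 preparation)"
    return "below threshold"

# Worked example from the text: 70B parameters trained on 2T tokens
flops = estimate_training_flops(70e9, 2e12)
print(f"{flops:.2e}")   # 8.40e+23
print(classify(flops))  # below threshold
```

As the Expert Note cautions, the 6× multiplier is only an approximation; a provider's actual accounting should follow its real training logs rather than this parameter-token heuristic.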
Commission Designation (Article 51(1)(b))
Alternative Classification Path
Even if a model does not exceed 10^25 FLOPS, the Commission may designate it as presenting systemic risk based on:
| Criterion | Assessment Factors |
|---|---|
| Number of parameters | Model size and complexity |
| Quality of dataset | Breadth, depth, and curation of training data |
| Size of dataset | Volume of training data |
| Input/output modalities | Text, image, audio, video, code capabilities |
| Benchmarks | Performance on capability evaluations |
| Reach | Number of users, integrations, deployments |
| Number of registered users | Business scale and market penetration |
| High-impact capabilities | Assessed dangerous or transformative capabilities |
Designation Process
| Step | Actor | Action |
|---|---|---|
| 1 | Scientific Panel | Issues qualified alert identifying potential systemic risk (Article 90(1)(a)), OR Commission acts ex officio |
| 2 | Commission | Initiates investigation and evidence gathering based on Annex XIII criteria |
| 3 | Commission | Issues designation decision |
| 4 | Provider | May challenge designation (Article 52(5)) |
Challenging a Designation
Article 52(2) provides that providers may, at the time of notification, present sufficiently substantiated arguments that their model does not present systemic risk and should not be classified as such. (Note: Article 52(5) addresses a separate process for reassessment of Commission designations.) Evidence may include:
- Independent capability evaluations
- Safety testing results
- Limitation demonstrations
- Use case restrictions
- Technical safeguards implemented
Notification Requirements (Article 52)
Mandatory Notification
Article 52(1) requires GPAI providers to notify the Commission:
| Trigger | Timeline |
|---|---|
| Model meets 10^25 FLOPS threshold | Within 2 weeks of meeting threshold |
| Reasonable grounds to believe threshold will be met | Before training completion |
| Commission designation received | Immediate acknowledgment required |
Notification Content
Article 52(1) requires that the notification include the "information necessary to demonstrate that the relevant requirement has been met." The following table distinguishes between the statutory minimum and recommended best-practice content:
| Element | Content Required | Status |
|---|---|---|
| Information demonstrating the requirement is met | Evidence that the 10^25 FLOPS threshold has been reached or will be reached | Statutory minimum (Article 52(1)) |
| Provider identification | Legal entity, contact details, authorised representative | Recommended best practice |
| Model identification | Name, version, release date | Recommended best practice |
| Training compute | Cumulative FLOPS calculation and methodology | Recommended best practice |
| Capability assessment | Known capabilities and limitations | Recommended best practice |
| Intended distribution | Market placement plans | Recommended best practice |
| Risk assessment | Initial systemic risk assessment | Recommended best practice |
Notification Template
```
GPAI SYSTEMIC RISK NOTIFICATION
(Article 52, Regulation (EU) 2024/1689)

1. PROVIDER INFORMATION
   [Legal name, address, contact, authorised representative]

2. MODEL IDENTIFICATION
   [Model name, version, planned release date]

3. TRAINING COMPUTE
   Total FLOPS: [X.XX × 10^YY]
   Calculation methodology: [Description]
   Training completion date: [Date]

4. CAPABILITY ASSESSMENT
   [Summary of model capabilities]
   [Known limitations]

5. RISK ASSESSMENT
   [Identified systemic risks]
   [Planned mitigations]

6. DISTRIBUTION PLANS
   [Intended market placement approach]
   [Timeline]

Submitted by: [Name, title]
Date: [Date]
```
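A provider preparing this notification might keep the statutory minimum and the best-practice extras separate in its internal tooling. The sketch below is purely illustrative: the Act prescribes the content of the notification, not any schema, and every field name here is a hypothetical choice.

```python
# Hypothetical internal record for an Article 52(1) notification.
# Field names are illustrative assumptions, not statutory requirements.
from dataclasses import dataclass, field

@dataclass
class SystemicRiskNotification:
    # Statutory minimum (Article 52(1)): evidence the threshold is met
    total_flops: float
    calculation_methodology: str
    # Recommended best-practice content (see table above)
    provider: str = ""
    model_name: str = ""
    model_version: str = ""
    known_capabilities: list = field(default_factory=list)
    identified_risks: list = field(default_factory=list)

    def meets_threshold(self) -> bool:
        """Article 51(2) presumption: training compute greater than 10^25."""
        return self.total_flops > 1e25

notification = SystemicRiskNotification(
    total_flops=1.2e25,
    calculation_methodology="6 x parameters x training tokens (estimate)",
    provider="Example AI Ltd",
    model_name="example-model",
)
print(notification.meets_threshold())  # True
```

Separating the statutory core from the optional fields makes it easy to confirm the legally required evidence is present before adding supporting material.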
Consequences of Systemic Risk Classification
Enhanced Obligations
| Baseline GPAI | + Systemic Risk Additions |
|---|---|
| Technical documentation | + Model evaluation and adversarial testing |
| Downstream information | + Systemic risk assessment at Union level |
| Copyright policy | + Incident tracking and reporting |
| Training data summary | + Cybersecurity protection |
Regulatory Scrutiny
| Aspect | Standard GPAI | Systemic Risk GPAI |
|---|---|---|
| AI Office oversight | General | Enhanced monitoring |
| Evaluation requests | Ad hoc | Regular requirements possible |
| Incident reporting | Via downstream providers | Direct to AI Office |
| Enforcement focus | Documentation | Active risk management |
Monitoring Approaching Threshold
Internal Monitoring Framework
| Metric | Monitoring Frequency | Threshold Alert |
|---|---|---|
| Training FLOPS accumulated | Daily during training | 10^24 (approaching) |
| Parameter count | Per training iteration | Planning stage |
| Dataset size growth | Weekly | Major expansions |
| Capability evaluations | Per checkpoint | Significant capability gains |
| User/deployment scale | Monthly | Rapid growth |
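The daily FLOPS-accumulation check from the table above can be sketched as a simple alert function. This is an assumed internal-compliance pattern, not anything the Act mandates; the alert strings and the 10^24 level come from the monitoring table, while the planned-run check reflects the Article 52(1) duty to notify before training completes when the threshold will foreseeably be met.

```python
# Minimal sketch of a daily training-compute compliance check.
# Alert levels mirror the monitoring table above; wording is illustrative.

APPROACH_ALERT = 1e24  # "approaching" alert from the monitoring table
THRESHOLD = 1e25       # Article 51(2) presumption threshold

def daily_compute_check(cumulative_flops: float,
                        planned_total_flops: float) -> list:
    """Return the alerts a compliance team might raise for one training day."""
    alerts = []
    if cumulative_flops > THRESHOLD:
        alerts.append("THRESHOLD MET: notify Commission within two weeks "
                      "(Article 52(1))")
    elif planned_total_flops > THRESHOLD:
        alerts.append("PLANNED RUN EXCEEDS THRESHOLD: notify before "
                      "training completion")
    elif cumulative_flops >= APPROACH_ALERT:
        alerts.append("APPROACHING: begin Article 55 compliance preparation")
    return alerts

# A run at 3x10^24 FLOPS so far, planned to finish at 9x10^24
print(daily_compute_check(3e24, 9e24))
```

Note that the planned-total check fires even before the cumulative figure crosses the threshold, matching the "reasonable grounds to believe the threshold will be met" trigger in the notification table earlier in this chapter.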
Pre-Threshold Preparation
For models approaching 10^25 FLOPS:
- Begin Article 55 compliance preparation
- Establish adversarial testing programme
- Develop systemic risk assessment methodology
- Implement enhanced incident tracking
- Review cybersecurity measures
- Prepare AI Office notification
- Engage Scientific Panel proactively
What You Learned
Key concepts from this chapter
- Systemic risk classification applies to GPAI models with "high-impact capabilities" affecting the EU market
- 10^25 FLOPS creates a **presumption** of systemic risk—below-threshold models may still be designated
- Commission can designate models based on multiple factors beyond compute
- Providers must notify the Commission within 2 weeks of meeting the threshold
- Systemic risk classification triggers Article 55 enhanced obligations