Real-World Testing
Article 60 provisions for testing AI in real conditions.
Real-World Testing (Article 60)
Learning Objectives
By the end of this chapter, you will be able to:
- Understand when and how real-world testing is permitted under Article 60
- Design testing plans that meet regulatory requirements
- Implement informed consent procedures compliant with the AI Act
- Establish monitoring and safeguard frameworks for live testing
- Document real-world testing to support conformity assessment
Introduction: Testing in the Real World
Laboratory testing and simulations have limits. At some point, AI systems must be tested in real-world conditions to validate performance. Article 60 provides a framework for such testing—balancing the need for realistic validation against the protection of affected persons.
Expert Insight
Real-world testing is where theory meets practice. The AI Act doesn't prohibit it—it requires that you do it responsibly. A well-designed testing program actually strengthens your conformity assessment by providing real-world performance evidence.
Legal Framework (Article 60)
When Real-World Testing Applies
| Condition | Article 60 Requirement |
|---|---|
| System type | High-risk AI systems under Annex III |
| Lifecycle stage | Before placing on market or putting into service |
| Purpose | Testing performance under real-world conditions |
| Approval pathway | Either within a sandbox OR with approved testing plan |
| Safeguards | Subject to specific protections for affected persons |
Article 60 vs. Sandbox Testing
| Aspect | Regulatory Sandbox (Articles 57-58) | Real-World Testing (Article 60) |
|---|---|---|
| Scope | Broader development and validation | Focused performance testing |
| Duration | Longer (6-24 months typical) | Shorter, specific testing period |
| Regulatory engagement | Continuous supervision | Plan approval + monitoring |
| Best for | Novel systems, compliance uncertainty | Validating known system performance |
| Relationship | Can include real-world testing | Can be standalone or within sandbox |
Prerequisites for Real-World Testing
Mandatory Requirements
| Requirement | Article Reference | What It Means |
|---|---|---|
| Testing plan | Article 60(4)(a)-(b) | Detailed plan submitted to and approved by the market surveillance authority |
| Informed consent | Article 60(4)(i) | Freely given informed consent from test subjects, in accordance with Article 61 |
| Monitoring | Article 60(4)(j) | Effective oversight by suitably qualified persons, with the ability to intervene |
| Reversibility | Article 60(4)(k) | AI predictions, recommendations, and decisions can be reversed and disregarded |
| Vulnerable subjects | Article 60(4)(g) | Subjects belonging to vulnerable groups must be appropriately protected |
| Risk mitigation | Article 60(7) | Serious incidents must be reported and immediately mitigated, or testing suspended or terminated |
| Liability | Article 60(9) | Provider remains liable under applicable Union and national liability law |
Testing Plan Contents
| Plan Element | Description | Authority Review Focus |
|---|---|---|
| System description | Technical details of the AI being tested | Understanding what's being tested |
| Testing objectives | What will be validated, success criteria | Clarity and measurability |
| Testing methodology | How testing will be conducted | Scientific validity |
| Subject selection | Who will participate, how recruited | Representativeness, vulnerability |
| Informed consent process | How consent will be obtained and documented | Adequacy and voluntariness |
| Safeguards | Protections for test subjects | Sufficiency of protections |
| Monitoring procedures | How testing will be supervised | Ability to detect and respond to issues |
| Intervention triggers | When testing will be stopped | Clear thresholds for action |
| Data handling | How data will be collected, used, protected | GDPR compliance |
| Duration and scope | How long, how many subjects, what contexts | Proportionality |
| Incident procedures | How incidents will be handled and reported | Response capability |
Informed Consent Framework
Consent Requirements (Article 60(4)(i), see also Article 61)
| Requirement | Implementation |
|---|---|
| Freely-given | No coercion, pressure, or undue inducement |
| Informed | Subject understands what they're consenting to, with information per Article 61(1) |
| Documented | Written or electronic record of consent |
| Withdrawable | Subject can withdraw at any time without consequence |
Information to Provide
Before consent, test subjects must be informed about:
| Information Element | Example Content |
|---|---|
| Nature of the AI system | "This AI analyses your responses to assess creditworthiness" |
| Purpose of testing | "We are testing the system's accuracy before market launch" |
| How they will be affected | "The AI will evaluate your application alongside our standard process" |
| Risks and safeguards | "There is a risk of incorrect assessment; human review verifies all decisions" |
| Data collection and use | "Your data will be used for testing only and deleted after 12 months" |
| Duration of participation | "Testing lasts 3 months; your involvement is for one application" |
| Right to withdraw | "You can withdraw at any time; your application continues normally" |
| Contact for questions | "Contact our Data Protection Officer at dpo@company.com" |
| Testing identification number | The Union-wide unique single identification number of the testing per Article 61(1)(e) |
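The consent requirements above map naturally onto a simple record structure. The following is a minimal Python sketch, not anything prescribed by the AI Act: every class, field, and identifier name here is an illustrative assumption. The Act prescribes the substance of consent (freely given, informed, documented, withdrawable) but no data model.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

@dataclass
class ConsentRecord:
    """Illustrative record of one test subject's documented consent.

    Field names are assumptions for this sketch; Articles 60(4)(i)
    and 61 define what must be captured, not how to store it.
    """
    subject_id: str                  # pseudonymous identifier, not a name
    testing_id: str                  # Union-wide unique testing ID (Article 61(1)(e))
    information_version: str         # version of the information sheet shown to the subject
    given_at: datetime               # when consent was documented
    withdrawn_at: Optional[datetime] = None

    def is_active(self) -> bool:
        """Consent counts only while it has not been withdrawn."""
        return self.withdrawn_at is None

    def withdraw(self) -> None:
        """Withdrawal is effective immediately and without consequence."""
        if self.withdrawn_at is None:
            self.withdrawn_at = datetime.now(timezone.utc)

record = ConsentRecord(
    subject_id="S-0042",
    testing_id="EU-TEST-EXAMPLE-001",
    information_version="v1.2",
    given_at=datetime.now(timezone.utc),
)
assert record.is_active()
record.withdraw()
assert not record.is_active()
```

Keeping the information-sheet version alongside the consent timestamp lets you later demonstrate exactly what each subject was told before consenting.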
Consent Exceptions
| Situation | Exception Basis | Requirements |
|---|---|---|
| Law enforcement/migration | Article 60(4)(i) exception | Testing must not negatively affect subjects; personal data deleted after test |
| Emergency situations | Not explicitly addressed | Likely requires post-hoc consent or exemption |
Compliance Note
The consent exception for law enforcement and migration is narrow and requires enhanced safeguards. Don't assume it applies—seek legal advice before using this exception.
Monitoring and Safeguards
Monitoring Framework
| Monitoring Element | Implementation | Purpose |
|---|---|---|
| Real-time oversight | Dashboard, alerts, human supervisor | Detect issues immediately |
| Performance tracking | Accuracy, fairness, drift metrics | Validate system performance |
| Incident detection | Automated and manual detection | Identify problems early |
| Subject feedback | Channels for concerns/complaints | Capture subjective impacts |
| Documentation | Comprehensive logging | Evidence for conformity assessment |
Intervention Triggers
Define clear thresholds for action:
| Trigger | Response |
|---|---|
| Performance below threshold | Pause testing, investigate, remediate |
| Bias detected | Suspend testing for affected groups, analyse |
| Harm to subject | Stop testing immediately, support subject, report |
| Subject withdrawal | Remove from testing, honour consent revocation |
| Authority request | Immediate compliance with authority instructions |
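The trigger table above lends itself to a simple rule-evaluation loop that maps observed conditions to the responses defined in the testing plan. A hedged sketch follows; the metric names, threshold values, and action labels are all assumptions for illustration, not figures from the AI Act:

```python
# Illustrative intervention-trigger check; thresholds and metric names
# are assumptions for this sketch, not values from the AI Act.
ACCURACY_FLOOR = 0.90          # pause if validated accuracy drops below this
PARITY_GAP_CEILING = 0.05      # suspend affected groups if the fairness gap exceeds this

def evaluate_triggers(metrics: dict) -> list[str]:
    """Map observed metrics to the responses defined in the testing plan."""
    actions = []
    if metrics.get("harm_reported", False):
        actions.append("STOP_TESTING")             # harm to a subject: stop immediately
    if metrics.get("accuracy", 1.0) < ACCURACY_FLOOR:
        actions.append("PAUSE_AND_INVESTIGATE")    # performance below threshold
    if metrics.get("parity_gap", 0.0) > PARITY_GAP_CEILING:
        actions.append("SUSPEND_AFFECTED_GROUPS")  # bias detected
    if metrics.get("authority_request", False):
        actions.append("COMPLY_WITH_AUTHORITY")    # authority instructions take precedence
    return actions

assert evaluate_triggers({"accuracy": 0.85}) == ["PAUSE_AND_INVESTIGATE"]
assert "STOP_TESTING" in evaluate_triggers({"harm_reported": True})
```

The point of encoding triggers this way is that the thresholds are explicit, reviewable, and identical to what was written into the approved testing plan, rather than left to ad hoc judgment during the test.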
Human Oversight During Testing
| Oversight Type | When Required | Implementation |
|---|---|---|
| Pre-decision review | High-stakes decisions (e.g., credit, employment) | Human reviews before action |
| Concurrent monitoring | All testing | Supervisor monitors in real-time |
| Post-decision review | All AI decisions | Human reviews outcomes |
| Override capability | Always | Ability to disregard AI output |
Special Categories of Testing
Vulnerable Populations
Testing involving vulnerable persons requires enhanced safeguards:
| Vulnerable Group | Additional Requirements |
|---|---|
| Children | Parental/guardian consent, age-appropriate information |
| Elderly | Accessible information, capacity verification |
| Employees | No workplace coercion, union consultation if applicable |
| Patients | Clinical ethics oversight, medical safeguards |
| Economically dependent | Ensure no exploitation of financial vulnerability |
Law Enforcement and Migration
Article 60(4)(i) provides limited exceptions:
| Requirement | Implementation |
|---|---|
| No negative effect | Testing and its outcomes must not have any negative effect on subjects |
| Personal data deletion | Personal data shall be deleted after the test is performed |
| Testing plan approval | Market surveillance authority must still approve |
| Enhanced safeguards | Greater protections than standard testing |
| Documentation | Comprehensive records for accountability |
Duration and Scope Limits
Proportionality Requirements
| Factor | Consideration |
|---|---|
| Maximum duration | Six months, extendable by a further six months subject to prior notification to the authority (Article 60(4)(f)) |
| Duration | No longer than necessary to achieve testing objectives |
| Subject numbers | Minimum needed for statistical validity |
| Scope of decisions | Limited to what's necessary to test |
| Geographic scope | Appropriate to testing objectives |
Typical Testing Parameters
| Testing Type | Typical Duration | Typical Scale |
|---|---|---|
| Pilot testing | 1-3 months | 50-500 subjects |
| Extended validation | 3-6 months | 500-5,000 subjects |
| Pre-launch testing | 1-2 months | 1,000-10,000 subjects |
Documentation Requirements
Real-Time Documentation
| Document | Contents | Update Frequency |
|---|---|---|
| Testing log | All testing activities, decisions, events | Continuous |
| Incident register | Issues, near-misses, complaints | As they occur |
| Consent records | All consent documentation | Per subject |
| Performance data | Accuracy, fairness, other metrics | Daily/weekly |
| Monitoring reports | Supervisor observations | Weekly |
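The continuous testing log above is easiest to keep audit-ready as an append-only, machine-readable record. A minimal sketch, assuming a hypothetical JSON-lines file format and event-type labels of our own choosing:

```python
import json
from datetime import datetime, timezone

def log_event(logfile: str, event_type: str, details: dict) -> dict:
    """Append one timestamped entry to an append-only testing log.

    Structured JSON-lines entries keep the continuous testing log
    machine-readable as evidence for the later conformity assessment.
    """
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "event_type": event_type,   # e.g. "decision", "incident", "consent_withdrawal"
        "details": details,
    }
    with open(logfile, "a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
    return entry

entry = log_event("testing_log.jsonl", "incident",
                  {"severity": "minor", "description": "subject complaint received"})
assert entry["event_type"] == "incident"
```

An append-only format (never edit, only add) matters here: it preserves the chronological record of activities and decisions even when an earlier entry turns out to be wrong, which is corrected by a later entry rather than by rewriting history.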
Post-Testing Documentation
| Document | Purpose | Retention |
|---|---|---|
| Testing summary report | Overall findings, conclusions | 10 years minimum |
| Performance validation | Evidence for conformity assessment | 10 years minimum |
| Incident summary | All issues and resolutions | 10 years minimum |
| Subject outcomes | What happened to test subjects | As required by GDPR |
Authority Notification and Approval
Approval Process
1. Draw up a real-world testing plan (Article 60(4)(a))
2. Submit the plan to the market surveillance authority of the Member State(s) where testing will take place
3. Obtain approval; if the authority does not respond within 30 days, approval may be tacit unless national law provides otherwise (Article 60(4)(b))
4. Register the testing with its Union-wide unique single identification number (Article 60(4)(c))
5. Begin testing in accordance with the approved plan
Ongoing Reporting
| Report Type | Timing | Contents |
|---|---|---|
| Progress reports | Monthly during testing | Activities, metrics, issues |
| Incident reports | Immediately upon occurrence | Details, response, remediation |
| Completion report | End of testing | Summary, findings, recommendations |
Liability and Insurance
Liability Framework (Article 60(9))
| Liability Aspect | Provider Responsibility |
|---|---|
| Harm to test subjects | Provider remains fully liable under applicable Union and national liability law |
| Insurance (recommended) | While not explicitly required by the AI Act, adequate insurance is a prudent practice |
| No liability transfer | Cannot contract away liability to subjects |
Insurance Considerations
| Coverage Type | What It Covers |
|---|---|
| Product liability | Harm caused by the AI system |
| Professional indemnity | Errors in testing design or execution |
| Clinical trials (if applicable) | Medical testing-specific coverage |
| Cyber liability | Data breaches during testing |
Real-World Testing Checklist
Pre-Testing
- Develop comprehensive testing plan
- Identify and assess risks to test subjects
- Design safeguards and monitoring procedures
- Create informed consent materials
- Establish intervention triggers and procedures
- Obtain liability insurance/coverage
- Submit plan to market surveillance authority
- Obtain authority approval
During Testing
- Obtain informed consent from all subjects
- Activate monitoring systems
- Document all activities and decisions
- Submit regular progress reports
- Report incidents immediately
- Maintain human oversight
- Respond to any authority requests
Post-Testing
- Complete testing summary report
- Compile performance validation evidence
- Document all incidents and resolutions
- Notify authority of testing completion
- Retain all documentation (10+ years)
- Use findings in conformity assessment
What You Learned
Key concepts from this chapter
- **Real-world testing is permitted** but subject to specific safeguards under Article 60
- **Informed consent is mandatory** (with limited exceptions for law enforcement/migration)
- **Testing plans must be approved** by market surveillance authorities before testing begins
- **Monitoring must be effective** with clear triggers for intervention
- **AI decisions must be reversible** or disregardable during testing