STD-AI-003

AI Data Governance Standard

Ensures AI training, validation, and testing datasets meet quality, relevance, and representativeness standards.


AI Data Governance Standard

Document Type: Standard
Standard ID: STD-AI-003
Standard Title: AI Data Governance Standard
Version: 1.0
Effective Date: 2025-08-01
Next Review Date: 2026-08-01
Review Frequency: Annually or upon regulatory change
Parent Policy: POL-AI-001 - Artificial Intelligence Policy
Owner: Chief Data Officer
Approved By: AI Governance Committee Chair
Status: Draft
Classification: Internal Use Only


TABLE OF CONTENTS

  1. Document History
  2. Objective
  3. Scope and Applicability
  4. Control Standard
  5. Supporting Procedures
  6. Compliance
  7. Roles and Responsibilities
  8. Exceptions
  9. Enforcement
  10. Key Performance Indicators (KPIs)
  11. Training Requirements
  12. Definitions
  13. Link with AI Act and ISO42001

DOCUMENT HISTORY

Version | Date | Author | Changes | Approval Date | Approved By
0.1 | 2025-06-20 | Emily White, Chief Data Officer | Initial draft | - | -
0.2 | 2025-07-05 | Emily White, Chief Data Officer | Added bias mitigation details | - | -
0.3 | 2025-07-22 | Emily White, Chief Data Officer | Incorporated DPO feedback | - | -
1.0 | 2025-08-01 | Emily White, Chief Data Officer | Final version approved - GRC restructured | 2025-07-25 | Jane Doe, AI Governance Committee Chair

OBJECTIVE

This standard defines requirements for AI training, validation, and testing datasets to ensure they meet quality, relevance, representativeness, and bias mitigation standards in compliance with EU AI Act Article 10.

Primary Goals:

  • Ensure all AI datasets meet quality standards appropriate for intended purpose
  • Ensure datasets are relevant, representative, and appropriate for intended purpose
  • Detect and mitigate biases in datasets to prevent discriminatory outcomes
  • Establish comprehensive data governance with full lineage tracking
  • Ensure AI data governance complies with GDPR and privacy requirements

SCOPE AND APPLICABILITY

2.1 Mandatory Applicability

This standard is mandatory for:

  • All high-risk AI systems (EU AI Act Article 10)
  • All data used for AI system development, training, validation, testing, and operation

2.2 Recommended Applicability

This standard is recommended for:

  • All AI systems using training data
  • Limited-risk and minimal-risk AI systems (voluntary best practices)

2.3 Data Types Covered

  • Training datasets
  • Validation datasets
  • Testing datasets
  • Input data (operational)
  • Synthetic data
  • Third-party datasets
  • Open-source datasets

2.4 Out of Scope

  • General enterprise data governance (covered by enterprise data governance framework)
  • Non-AI system data (covered by other data governance standards)
  • Data outside EU AI Act scope

CONTROL STANDARD

Control DATA-001: Data Quality Requirements Definition

Control ID: DATA-001
Control Name: Data Quality Requirements Definition
Control Type: Preventive
Control Frequency: Per AI system, annually
Risk Level: High

Control Objective

Define specific data quality requirements for each AI system based on intended purpose and risk level to ensure datasets meet appropriate quality standards before use in AI training, validation, and testing, in compliance with EU AI Act Article 10(2).

Control Requirements

CR-001.1: Data Quality Requirements Documentation

Define and document data quality requirements for each AI system with specific thresholds based on risk level.

Data Quality Dimensions:

Dimension | Definition | High-Risk AI Requirement | Limited/Minimal-Risk AI Requirement
Accuracy | Correctness of data values | ≥95% | ≥90%
Completeness | Presence of all required data | ≥98% | ≥95%
Consistency | Uniformity across datasets | 100% for critical fields | ≥95% for critical fields
Timeliness | Data currency and freshness | Defined per use case | Defined per use case
Validity | Conformance to defined formats | 100% | ≥98%
Uniqueness | No unintended duplicates | ≥99% | ≥95%
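The threshold check implied by the table above can be sketched as follows. The threshold values come from the table; the function and variable names are illustrative, not part of the standard.

```python
# Risk-tiered thresholds taken from the Data Quality Dimensions table.
# Timeliness and consistency are omitted here because their targets are
# use-case- or field-specific.
THRESHOLDS = {
    "high": {"accuracy": 0.95, "completeness": 0.98, "validity": 1.00, "uniqueness": 0.99},
    "limited": {"accuracy": 0.90, "completeness": 0.95, "validity": 0.98, "uniqueness": 0.95},
}

def check_quality(metrics: dict, risk_level: str) -> list[str]:
    """Return the quality dimensions that fail their threshold."""
    required = THRESHOLDS[risk_level]
    return [dim for dim, threshold in required.items()
            if metrics.get(dim, 0.0) < threshold]

failures = check_quality(
    {"accuracy": 0.97, "completeness": 0.96, "validity": 1.0, "uniqueness": 0.995},
    risk_level="high",
)
# completeness (0.96) is below the high-risk 0.98 threshold, so it fails
```

A dataset that fails any dimension would be blocked from training until remediated, per CR-002.1 below.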

Mandatory Actions:

  • Define quality requirements per AI system
  • Document quality thresholds in Data Quality Requirements Document
  • Obtain AI System Owner approval
  • Review and update annually
  • Align with intended purpose and risk level

Evidence Required:

  • Data Quality Requirements Document (DOC-AI-DATA-001)
  • Quality thresholds by AI system
  • Approval records
  • Annual review records

Audit Verification:

  • Verify quality requirements defined for all AI systems
  • Confirm thresholds appropriate for risk level
  • Check approval obtained
  • Validate annual review completed

Control DATA-002: Data Quality Assessment

Control ID: DATA-002
Control Name: Data Quality Assessment and Validation
Control Type: Preventive
Control Frequency: Before training, after significant data updates
Risk Level: High

Control Objective

Assess data quality against defined requirements before use in AI training to ensure datasets meet quality thresholds and prevent quality issues from affecting AI system performance and compliance.

Control Requirements

CR-002.1: Pre-Training Quality Assessment

Conduct comprehensive data quality assessment using multiple methods before datasets are used for AI training.

Assessment Methods:

  • Automated data profiling
  • Statistical analysis
  • Manual sampling and review
  • Data quality tools

Mandatory Actions:

  • Profile datasets to assess all quality dimensions
  • Calculate quality metrics for each dimension
  • Compare against defined thresholds
  • Document assessment results
  • Remediate quality issues before use
  • Obtain quality approval before training begins

Quality Assessment Checklist:

  • Accuracy assessed (correctness of values)
  • Completeness assessed (presence of required data)
  • Consistency assessed (uniformity across datasets)
  • Timeliness assessed (data currency)
  • Validity assessed (format conformance)
  • Uniqueness assessed (duplicate detection)
  • All metrics meet thresholds
  • Quality issues identified and remediated
  • Quality approval obtained
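A minimal automated-profiling pass over a list of records, computing three of the checklist dimensions (completeness, uniqueness, validity), might look like this. Field names, the format rule, and the sample data are hypothetical.

```python
import re

def profile(records, required_fields, key_field, format_rules):
    """Compute completeness, uniqueness, and validity for a list of dict records."""
    n = len(records)
    non_null = sum(1 for r in records
                   for f in required_fields if r.get(f) not in (None, ""))
    completeness = non_null / (n * len(required_fields))
    uniqueness = len({r[key_field] for r in records}) / n
    valid = sum(1 for r in records
                if all(re.fullmatch(p, str(r.get(f, ""))) for f, p in format_rules.items()))
    validity = valid / n
    return {"completeness": completeness, "uniqueness": uniqueness, "validity": validity}

records = [
    {"id": "A1", "dob": "1990-01-01"},
    {"id": "A2", "dob": "1985-13-40"},  # shape-valid but semantically wrong; the regex checks format only
    {"id": "A2", "dob": ""},            # duplicate key and missing dob
]
metrics = profile(records, ["id", "dob"], "id", {"dob": r"\d{4}-\d{2}-\d{2}"})
```

Note that format validation alone misses semantic errors (an impossible month, say), which is why the checklist also requires accuracy assessment via manual sampling and review.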

Evidence Required:

  • Data Quality Assessment Report (RPT-AI-DATA-001)
  • Quality metrics dashboard
  • Remediation records
  • Quality approval records

Audit Verification:

  • Verify quality assessment conducted before training
  • Confirm all quality dimensions assessed
  • Check metrics meet defined thresholds
  • Validate quality issues remediated before use
  • Verify quality approval obtained

Control DATA-003: Data Quality Monitoring

Control ID: DATA-003
Control Name: Continuous Data Quality Monitoring
Control Type: Detective
Control Frequency: Continuous
Risk Level: Medium

Control Objective

Continuously monitor data quality during AI system operation to detect quality degradation, identify quality issues early, and enable timely remediation to maintain AI system performance and compliance.

Control Requirements

CR-003.1: Operational Quality Monitoring

Implement automated quality monitoring with alerting and periodic reporting.

Mandatory Actions:

  • Implement automated quality checks
  • Monitor quality metrics in real-time
  • Alert on quality threshold breaches
  • Investigate quality issues promptly
  • Remediate quality issues
  • Report quality trends monthly

Quality Monitoring Metrics:

Metric | Threshold | Alert Level | Frequency
Accuracy | < 95% (high-risk) / < 90% (other) | Warning | Daily
Completeness | < 98% (high-risk) / < 95% (other) | Warning | Daily
Consistency | < 100% (critical fields) | Critical | Real-time
Validity | < 100% | Critical | Real-time
Uniqueness | < 99% (high-risk) / < 95% (other) | Warning | Daily
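Threshold-breach alerting per the table above can be sketched as a rule list evaluated against each monitoring sample. The rules mirror the high-risk column; names are illustrative.

```python
# (metric, minimum acceptable value, alert level) per the monitoring table
RULES = [
    ("accuracy",     0.95, "Warning"),
    ("completeness", 0.98, "Warning"),
    ("consistency",  1.00, "Critical"),
    ("validity",     1.00, "Critical"),
    ("uniqueness",   0.99, "Warning"),
]

def evaluate(sample: dict) -> list[tuple[str, str]]:
    """Return (metric, alert_level) for every breached threshold."""
    return [(name, level) for name, threshold, level in RULES
            if sample.get(name, 1.0) < threshold]

alerts = evaluate({"accuracy": 0.96, "completeness": 0.97, "validity": 0.999})
# completeness breaches its Warning threshold; validity breaches its Critical one
```

In a real deployment the Critical alerts would page on-call staff in real time, while Warnings feed the daily quality report.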

Evidence Required:

  • Quality monitoring dashboard
  • Alert logs
  • Remediation records
  • Monthly quality reports (RPT-AI-DATA-QM-XXX)

Audit Verification:

  • Verify automated quality monitoring implemented
  • Confirm alerting configured for threshold breaches
  • Check quality issues investigated and remediated
  • Validate monthly quality reports generated

Control DATA-004: Data Relevance Assessment

Control ID: DATA-004
Control Name: Data Relevance and Purpose Alignment Assessment
Control Type: Preventive
Control Frequency: Per AI system, annually
Risk Level: High

Control Objective

Assess and document that datasets are relevant to the intended purpose and to the geographical, behavioral, and functional setting in which the AI system will be deployed, ensuring AI systems are trained on appropriate data that reflects their actual deployment context, in compliance with EU AI Act Article 10(3) and 10(4).

Control Requirements

CR-004.1: Comprehensive Relevance Assessment

Conduct relevance assessment across all relevant dimensions.

Relevance Criteria:

Criterion | Assessment Questions | Evidence Required
Purpose Alignment | Does data reflect intended use case? | Use case to data mapping
Geographical Relevance | Is data from relevant locations? | Geographic distribution analysis
Temporal Relevance | Is data from relevant time period? | Temporal distribution analysis
Behavioral Relevance | Does data reflect actual user behaviors? | Behavioral pattern analysis
Functional Relevance | Does data cover all system functions? | Function coverage analysis

Mandatory Actions:

  • Define intended purpose clearly
  • Identify target population/scenarios
  • Assess dataset coverage of target
  • Document relevance justification
  • Obtain stakeholder approval
  • Review relevance annually

Evidence Required:

  • Data Relevance Assessment (ASSESS-AI-DATA-001)
  • Purpose-to-data mapping
  • Stakeholder approval records
  • Annual review records

Audit Verification:

  • Verify relevance assessment conducted for all AI systems
  • Confirm all relevance criteria assessed
  • Check relevance justification documented
  • Validate stakeholder approval obtained

Control DATA-005: Representativeness Analysis

Control ID: DATA-005
Control Name: Dataset Representativeness Assessment
Control Type: Preventive
Control Frequency: Before training, annually
Risk Level: High

Control Objective

Ensure datasets are sufficiently representative of all persons and situations the AI system will encounter, to prevent bias and ensure fair treatment across all user groups and scenarios, in compliance with EU AI Act Article 10(3).

Control Requirements

CR-005.1: Comprehensive Representativeness Analysis

Analyze dataset representativeness across all relevant subpopulations and protected characteristics.

Representativeness Metrics:

Metric | Definition | Target
Demographic Parity | Equal representation across protected characteristics | Ratio 0.8-1.2
Coverage | % of target population represented | ≥90%
Balance | Distribution similarity to real-world | Within 10%
Diversity | Variety of scenarios/cases included | Comprehensive
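A distribution comparison against the target population, flagging groups outside the 0.8-1.2 band from the table, can be sketched as follows. The group labels and target shares are made up for illustration.

```python
from collections import Counter

def representation_ratios(samples, target_shares):
    """Ratio of each group's observed share to its target-population share."""
    counts = Counter(samples)
    total = len(samples)
    return {g: (counts[g] / total) / share for g, share in target_shares.items()}

ratios = representation_ratios(
    ["f"] * 450 + ["m"] * 500 + ["x"] * 50,
    {"f": 0.50, "m": 0.48, "x": 0.02},
)
# groups outside the 0.8-1.2 representativeness band need a mitigation plan
out_of_band = {g for g, r in ratios.items() if not 0.8 <= r <= 1.2}
```

Here group "x" is overrepresented (ratio 2.5) relative to its target share, which would trigger the gap analysis and mitigation plan listed under Evidence Required.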

Protected Characteristics to Assess:

  • Gender
  • Age groups
  • Ethnicity/race (where legally permissible)
  • Disability status
  • Geographic location
  • Socioeconomic status

Mandatory Actions:

  • Identify all relevant subpopulations
  • Analyze dataset distribution across subpopulations
  • Calculate representativeness metrics
  • Compare to target population distribution
  • Address underrepresentation
  • Document representativeness analysis

Evidence Required:

  • Representativeness Analysis Report (RPT-AI-DATA-002)
  • Distribution comparisons
  • Gap analysis
  • Mitigation plans (if underrepresentation identified)

Audit Verification:

  • Verify representativeness analysis conducted before training
  • Confirm all protected characteristics assessed
  • Check representativeness metrics calculated
  • Validate underrepresentation addressed
  • Verify mitigation plans created if needed

Control DATA-006: Dataset Appropriateness Evaluation

Control ID: DATA-006
Control Name: Dataset Selection and Appropriateness Assessment
Control Type: Preventive
Control Frequency: Per AI system, when changing datasets
Risk Level: Medium

Control Objective

Ensure datasets are appropriate considering the state of the art and available alternatives, so that dataset selection is optimal and dataset choices for AI system development are justified.

Control Requirements

CR-006.1: Dataset Appropriateness Justification

Evaluate and justify dataset selection against available alternatives and state of the art.

Appropriateness Factors:

Factor | Evaluation Criteria | Documentation Required
Quality vs. Alternatives | How does quality compare to other datasets? | Quality comparison analysis
Representativeness vs. Alternatives | How does representativeness compare? | Representativeness comparison
Cost vs. Value | Is cost justified by value? | Cost-benefit analysis
Privacy Implications | What are privacy risks? | Privacy impact assessment
Bias Risks | What are bias risks compared to alternatives? | Bias risk comparison
State of the Art | Does dataset align with industry best practices? | State of the art review

Mandatory Actions:

  • Research available datasets
  • Compare dataset options
  • Justify dataset selection
  • Document why chosen dataset is appropriate
  • Consider synthetic data alternatives
  • Obtain approval for dataset selection

Evidence Required:

  • Dataset Selection Justification (DOC-AI-DATA-002)
  • Alternative analysis
  • State of the art review
  • Approval records

Audit Verification:

  • Verify dataset appropriateness evaluated
  • Confirm alternatives considered
  • Check justification documented
  • Validate approval obtained

Control DATA-007: Bias Examination

Control ID: DATA-007
Control Name: Bias Detection and Assessment
Control Type: Preventive
Control Frequency: Before training, after dataset updates
Risk Level: High

Control Objective

Examine training, validation, and testing datasets for possible biases to identify discriminatory patterns before they are learned by AI models, preventing bias propagation to AI system outputs, in compliance with EU AI Act Article 10(2)(f) and (g).

Control Requirements

CR-007.1: Comprehensive Bias Assessment

Conduct bias assessment across all bias types and protected characteristics.

Bias Types to Assess:

Bias Type | Description | Detection Method
Historical Bias | Data reflects past discrimination | Compare to equitable baseline
Representation Bias | Underrepresentation of groups | Distribution analysis
Measurement Bias | Biased labels or features | Label quality review
Aggregation Bias | Inappropriate grouping | Subgroup analysis
Evaluation Bias | Biased test data | Test set representativeness
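A simple historical-bias probe compares the positive-label rate per protected group; a large disparity suggests the labels encode past discrimination. The data and group names below are hypothetical, and real assessments would use dedicated tooling such as the Aequitas or AI Fairness 360 toolkits named under Mandatory Actions.

```python
def label_rate_by_group(rows):
    """rows: (group, label) pairs with label in {0, 1}; returns positive rate per group."""
    totals, positives = {}, {}
    for group, label in rows:
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + label
    return {g: positives[g] / totals[g] for g in totals}

rows = [("a", 1)] * 70 + [("a", 0)] * 30 + [("b", 1)] * 40 + [("b", 0)] * 60
rates = label_rate_by_group(rows)
disparity = max(rates.values()) - min(rates.values())  # 0.70 vs 0.40 -> 0.30
```

A 30-point gap in favorable-label rate between groups would be documented as a finding in the Bias Assessment Report and fed into DATA-008.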

Mandatory Actions:

  • Conduct bias assessment before training
  • Use statistical methods to detect bias
  • Use bias detection tools (e.g., IBM AI Fairness 360, Aequitas)
  • Document bias findings
  • Assess bias impact on fairness
  • Assess all protected characteristics

Evidence Required:

  • Bias Assessment Report (RPT-AI-DATA-003)
  • Bias metrics by protected characteristic
  • Bias detection tool outputs
  • Impact assessment

Audit Verification:

  • Verify bias assessment conducted before training
  • Confirm all bias types assessed
  • Check bias detection tools used
  • Validate bias findings documented
  • Verify protected characteristics assessed

Control DATA-008: Bias Mitigation

Control ID: DATA-008
Control Name: Bias Mitigation Implementation
Control Type: Corrective
Control Frequency: After bias detection
Risk Level: High

Control Objective

Implement appropriate measures to mitigate detected biases in datasets to reduce discriminatory outcomes and improve fairness across all protected characteristics.

Control Requirements

CR-008.1: Bias Mitigation Strategy Selection and Implementation

Select and implement appropriate bias mitigation strategies based on bias type and context.

Mitigation Strategies:

Strategy | When to Use | Implementation | Effectiveness Validation
Data Rebalancing | Representation bias | Oversample underrepresented groups | Post-mitigation distribution analysis
Data Augmentation | Insufficient diversity | Generate synthetic samples | Diversity metrics
Feature Engineering | Biased features | Remove or transform biased features | Feature importance analysis
Reweighting | Imbalanced classes | Assign weights to balance influence | Weighted performance metrics
Relabeling | Label bias | Review and correct biased labels | Label quality review
Fairness Constraints | Model-level bias | Apply fairness constraints during training | Fairness metrics validation
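The reweighting strategy from the table can be sketched as inverse-frequency class weights, so that minority classes carry the same total influence during training as majority classes. The labels are illustrative.

```python
from collections import Counter

def class_weights(labels):
    """Weight each class by n / (k * count): rarer classes get larger weights."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}

weights = class_weights(["approve"] * 80 + ["deny"] * 20)
# each class's (weight x count) now sums to the same total: 0.625*80 == 2.5*20
```

Effectiveness would then be validated with weighted performance metrics per the table, not assumed from the weighting alone.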

Mandatory Actions:

  • Select appropriate mitigation strategies
  • Implement mitigation measures
  • Validate mitigation effectiveness
  • Document mitigation actions
  • Monitor for residual bias
  • Obtain approval for mitigation approach

Evidence Required:

  • Bias Mitigation Plan (PLAN-AI-DATA-001)
  • Mitigation implementation records
  • Post-mitigation bias assessment
  • Effectiveness validation records
  • Approval records

Audit Verification:

  • Verify mitigation strategies selected for all identified biases
  • Confirm mitigation measures implemented
  • Check mitigation effectiveness validated
  • Validate mitigation actions documented
  • Verify residual bias monitored

Control DATA-009: Fairness Validation

Control ID: DATA-009
Control Name: Fairness Metrics Validation
Control Type: Detective
Control Frequency: After mitigation, before deployment
Risk Level: High

Control Objective

Validate that bias mitigation achieves fairness objectives by calculating and verifying fairness metrics meet defined thresholds, ensuring AI systems treat all groups fairly across protected characteristics.

Control Requirements

CR-009.1: Fairness Metrics Calculation and Validation

Calculate fairness metrics and validate they meet defined thresholds.

Fairness Metrics:

Metric | Definition | Target | Measurement
Demographic Parity | Equal positive prediction rate across groups | Ratio 0.8-1.2 | (Positive rate group A) / (Positive rate group B)
Equal Opportunity | Equal true positive rate across groups | Ratio 0.8-1.2 | (TPR group A) / (TPR group B)
Equalized Odds | Equal TPR and FPR across groups | Ratio 0.8-1.2 | (TPR/FPR group A) / (TPR/FPR group B)
Predictive Parity | Equal precision across groups | Ratio 0.8-1.2 | (Precision group A) / (Precision group B)
Calibration | Equal calibration across groups | Within 5% | Calibration difference between groups
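Two of the tabled metrics can be computed directly from per-group confusion counts (tp, fp, tn, fn), as sketched below. The groups and counts are hypothetical.

```python
def rates(c):
    """Positive prediction rate and true positive rate from confusion counts."""
    positive_rate = (c["tp"] + c["fp"]) / sum(c.values())
    tpr = c["tp"] / (c["tp"] + c["fn"])
    return positive_rate, tpr

def fairness_ratios(group_a, group_b):
    pr_a, tpr_a = rates(group_a)
    pr_b, tpr_b = rates(group_b)
    return {"demographic_parity": pr_a / pr_b, "equal_opportunity": tpr_a / tpr_b}

ratios = fairness_ratios(
    {"tp": 40, "fp": 10, "tn": 40, "fn": 10},  # group A
    {"tp": 30, "fp": 10, "tn": 50, "fn": 10},  # group B
)
passes = all(0.8 <= r <= 1.2 for r in ratios.values())
```

In this example demographic parity comes out at 1.25, just outside the 0.8-1.2 band, so the deployment-blocking rule under Mandatory Actions would apply even though equal opportunity (about 1.07) passes.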

Mandatory Actions:

  • Calculate fairness metrics for all protected characteristics
  • Compare to fairness thresholds
  • Assess trade-offs (accuracy vs. fairness)
  • Document fairness validation
  • Obtain approval for fairness-accuracy balance
  • Block deployment if fairness thresholds not met

Evidence Required:

  • Fairness Validation Report (RPT-AI-DATA-004)
  • Fairness metrics dashboard
  • Trade-off analysis
  • Approval records

Audit Verification:

  • Verify fairness metrics calculated for all protected characteristics
  • Confirm metrics meet defined thresholds
  • Check trade-off analysis documented
  • Validate approval obtained
  • Verify deployment blocked if thresholds not met

Control DATA-010: Data Lineage Documentation

Control ID: DATA-010
Control Name: Data Lineage Tracking and Documentation
Control Type: Detective
Control Frequency: Continuous updates
Risk Level: Medium

Control Objective

Document complete data lineage from source to AI model to enable traceability, support audits, facilitate troubleshooting, and demonstrate data governance compliance.

Control Requirements

CR-010.1: Complete Lineage Documentation

Document all lineage elements for every dataset used in AI systems.

Lineage Elements:

Element | Description | Required Info
Data Source | Original data origin | Source system, provider, date, contact
Collection Method | How data was collected | Method, tools, responsible party, date
Transformations | All data processing steps | Transformation type, date, person, tools
Quality Checks | Quality assessments performed | Check type, results, date, person
Versions | Dataset versions | Version number, changes, date, changelog
Usage | Which AI systems use data | AI system ID, purpose, date, status
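A minimal lineage record mirroring the table's elements might look like the sketch below; in practice this lives in the data catalog, not in ad-hoc code, and all names here are illustrative.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class LineageRecord:
    dataset_id: str
    source: str                  # source system / provider
    collection_method: str
    version: str = "1.0"
    transformations: list = field(default_factory=list)
    used_by: list = field(default_factory=list)  # AI system IDs

    def add_transformation(self, kind: str, performed_by: str, when: date):
        """Append one processing step so lineage stays current with each change."""
        self.transformations.append(
            {"type": kind, "by": performed_by, "date": when.isoformat()}
        )

rec = LineageRecord("DS-042", "CRM export (vendor X)", "batch extract")
rec.add_transformation("deduplication", "data.engineer", date(2025, 8, 1))
rec.used_by.append("AI-SYS-007")
```

Appending rather than overwriting transformation entries is what makes source-to-model traceability and the 10-year retention requirement auditable.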

Mandatory Actions:

  • Document data lineage for all datasets
  • Maintain lineage in data catalog
  • Update lineage with each transformation
  • Enable lineage traceability (source to model)
  • Provide lineage reports on demand
  • Maintain lineage for 10 years

Evidence Required:

  • Data Lineage Documentation (DOC-AI-DATA-003)
  • Data catalog with lineage
  • Lineage diagrams
  • Transformation logs
  • Lineage reports

Audit Verification:

  • Verify lineage documented for all datasets
  • Confirm all lineage elements captured
  • Check lineage updated with transformations
  • Validate traceability enabled
  • Verify 10-year retention

Control DATA-011: Data Provenance Verification

Control ID: DATA-011
Control Name: Data Provenance and Legal Compliance
Control Type: Preventive
Control Frequency: Per dataset acquisition
Risk Level: High

Control Objective

Establish and verify data provenance for all AI datasets to ensure lawful data use, protect intellectual property, and support regulatory audits, in compliance with GDPR and data protection requirements.

Control Requirements

CR-011.1: Comprehensive Provenance Verification

Verify and document all provenance requirements for each dataset.

Provenance Requirements:

Requirement | Verification Method | Documentation Required
Source Verification | Confirm data source authenticity | Source verification records
Licensing | Document data usage rights | License agreements, terms
Consent | Verify consent for data use (if personal data) | Consent records, consent management logs
Legal Basis | Document legal basis for processing (GDPR) | Legal basis documentation
Third-Party Agreements | Maintain data sharing agreements | Data sharing agreements, contracts

Mandatory Actions:

  • Verify data source for all datasets
  • Document licensing terms
  • Obtain and document consent (where required)
  • Establish legal basis for processing
  • Maintain third-party agreements
  • Conduct provenance audits annually

Evidence Required:

  • Data Provenance Records (REC-AI-DATA-001)
  • Licenses and agreements
  • Consent records
  • Legal basis documentation
  • Third-party agreement records
  • Annual provenance audit reports

Audit Verification:

  • Verify provenance verified for all datasets
  • Confirm licensing terms documented
  • Check consent obtained where required
  • Validate legal basis established
  • Verify third-party agreements maintained
  • Check annual provenance audits completed

Control DATA-012: Data Catalog Management

Control ID: DATA-012
Control Name: Data Catalog Maintenance and Discovery
Control Type: Detective
Control Frequency: Continuous
Risk Level: Medium

Control Objective

Maintain comprehensive data catalog for all AI datasets to enable data discovery, support data governance, facilitate compliance, and enable efficient data management.

Control Requirements

CR-012.1: Data Catalog Completeness and Maintenance

Maintain complete, current, and searchable data catalog for all AI datasets.

Catalog Contents:

Field | Description | Mandatory
Dataset name and ID | Unique identifier and name | YES
Description and purpose | What data is and why it exists | YES
Data schema | Structure and format | YES
Data quality metrics | Current quality scores | YES
Lineage information | Complete lineage | YES
Provenance records | Source, licensing, consent | YES
Access controls | Who can access and how | YES
Usage tracking | Which AI systems use it | YES
Retention period | How long data is kept | YES
Related AI systems | Links to AI systems | YES
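Catalog completeness (every mandatory field populated) can be verified mechanically, as in the sketch below. The field keys and the example entry are hypothetical; a real catalog tool would define its own schema.

```python
# Mandatory catalog fields, one key per row of the Catalog Contents table
MANDATORY = [
    "dataset_id", "description", "schema", "quality_metrics", "lineage",
    "provenance", "access_controls", "usage", "retention_period", "related_ai_systems",
]

def missing_fields(entry: dict) -> list[str]:
    """Return mandatory fields that are absent or empty in a catalog entry."""
    return [f for f in MANDATORY if not entry.get(f)]

entry = {
    "dataset_id": "DS-042",
    "description": "Loan application records for credit-scoring model",
    "schema": {"applicant_id": "string", "income": "decimal"},
    "quality_metrics": {"completeness": 0.99},
    "lineage": "DOC-AI-DATA-003#DS-042",
    "provenance": "REC-AI-DATA-001#DS-042",
    "access_controls": ["role:data-science-ro"],
    "usage": ["AI-SYS-007"],
    "retention_period": "10 years",
}
gaps = missing_fields(entry)  # "related_ai_systems" is absent
```

Running such a check continuously is one way to produce the catalog completeness metrics listed under Evidence Required.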

Mandatory Actions:

  • Create catalog entries for all datasets
  • Update catalog continuously
  • Enable search and discovery
  • Provide metadata management
  • Integrate with data governance tools
  • Maintain catalog accuracy

Evidence Required:

  • Data Catalog (CATALOG-AI-DATA-001)
  • Catalog completeness metrics
  • Usage reports
  • Catalog update logs

Audit Verification:

  • Verify catalog entries for all datasets
  • Confirm all mandatory fields populated
  • Check catalog updated continuously
  • Validate search and discovery enabled
  • Verify catalog accuracy

Control DATA-013: Privacy Impact Assessment

Control ID: DATA-013
Control Name: Data Protection Impact Assessment (DPIA)
Control Type: Preventive
Control Frequency: Per high-risk AI system, after substantial modifications
Risk Level: High

Control Objective

Conduct Data Protection Impact Assessment (DPIA) for high-risk AI systems processing personal data to identify and mitigate privacy risks, ensure GDPR compliance, and protect data subject rights.

Control Requirements

CR-013.1: DPIA Execution and Documentation

Conduct comprehensive DPIA per GDPR Article 35 for all high-risk AI systems processing personal data.

DPIA Triggers:

  • Systematic and extensive profiling
  • Large-scale processing of special category data
  • Systematic monitoring of publicly accessible areas
  • High-risk AI system processing personal data

DPIA Contents (per GDPR Article 35):

  • Systematic description of processing operations
  • Assessment of necessity and proportionality
  • Assessment of risks to rights and freedoms
  • Measures to address risks
  • Safeguards, security measures, and mechanisms

Mandatory Actions:

  • Determine if DPIA required
  • Conduct DPIA per GDPR Article 35
  • Identify privacy risks
  • Implement privacy controls
  • Obtain DPO approval
  • Consult supervisory authority if high risk
  • Review DPIA after substantial modifications

Evidence Required:

  • Data Protection Impact Assessment (DPIA-AI-XXX)
  • Privacy risk assessment
  • DPO approval records
  • Supervisory authority consultation records (if applicable)
  • Privacy control implementation records

Audit Verification:

  • Verify DPIA conducted for all high-risk AI processing personal data
  • Confirm DPIA contents complete per GDPR Article 35
  • Check DPO approval obtained
  • Validate privacy controls implemented
  • Verify supervisory authority consulted if high risk

Control DATA-014: Data Minimization

Control ID: DATA-014
Control Name: Data Minimization and Purpose Limitation
Control Type: Preventive
Control Frequency: Per AI system, annually
Risk Level: Medium

Control Objective

Collect and process only data necessary for AI system purpose to comply with GDPR Article 5(1)(c) data minimization principle and reduce privacy risks.

Control Requirements

CR-014.1: Data Minimization Assessment and Implementation

Assess and implement data minimization for all AI systems processing personal data.

Mandatory Actions:

  • Define minimum data requirements
  • Justify each data element
  • Remove unnecessary data
  • Implement data minimization in pipelines
  • Review data scope regularly (annually)
  • Document minimization decisions

Data Minimization Assessment:

Data Element | Purpose | Necessity Justification | Can Be Removed? | Action
[Example] | [Purpose] | [Justification] | Yes/No | Remove/Keep
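Enforcing a minimization decision in a pipeline can be as simple as an allow-list of justified elements, as sketched below. The field names are illustrative; the allow-list would come from the completed assessment table above.

```python
# Only elements with a documented necessity justification survive the pipeline
APPROVED_FIELDS = {"income", "employment_years", "postal_region"}

def minimize(record: dict) -> dict:
    """Drop every field not justified in the Data Minimization Assessment."""
    return {k: v for k, v in record.items() if k in APPROVED_FIELDS}

minimize({"income": 52000, "religion": "n/a", "employment_years": 7})
# drops "religion", which has no necessity justification
```

An allow-list is preferable to a deny-list here: new, unreviewed fields are excluded by default, which matches GDPR Article 5(1)(c).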

Evidence Required:

  • Data Minimization Assessment (ASSESS-AI-DATA-002)
  • Justification for each data element
  • Data reduction records
  • Annual review records

Audit Verification:

  • Verify data minimization assessed for all AI systems
  • Confirm each data element justified
  • Check unnecessary data removed
  • Validate data minimization implemented in pipelines
  • Verify annual review completed

Control DATA-015: Data Anonymization and Pseudonymization

Control ID: DATA-015
Control Name: Privacy Protection Techniques
Control Type: Preventive
Control Frequency: Before training, for personal data
Risk Level: High

Control Objective

Apply appropriate anonymization or pseudonymization techniques to protect privacy while enabling AI system development, balancing privacy protection with data utility.

Control Requirements

CR-015.1: Privacy Protection Technique Selection and Implementation

Select and implement appropriate privacy protection techniques based on use case and privacy requirements.

Techniques:

Technique | Use Case | Reversibility | Privacy Level | Data Utility
Anonymization | Remove all identifiers permanently | Irreversible | Highest | Medium
Pseudonymization | Replace identifiers with pseudonyms | Reversible with key | High | High
Aggregation | Group data to prevent identification | Irreversible | High | Low-Medium
Masking | Hide sensitive data elements | Varies | Medium-High | High
Synthetic Data | Generate artificial data | N/A | Highest | Medium-High
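Two of the tabled techniques can be sketched with the standard library: keyed pseudonymization (the original identifier is recoverable only by whoever holds the key and the set of known identifiers, since a keyed hash is not directly invertible) and masking. The key and sample values are illustrative; key management is out of scope here.

```python
import hashlib
import hmac

SECRET_KEY = b"replace-with-managed-key"  # illustrative; store in a key vault

def pseudonymize(identifier: str) -> str:
    """Deterministic keyed pseudonym: stable per identifier, opaque without the key."""
    return hmac.new(SECRET_KEY, identifier.encode(), hashlib.sha256).hexdigest()[:16]

def mask_email(email: str) -> str:
    """Mask the local part of an email, keeping the first character and domain."""
    local, _, domain = email.partition("@")
    return local[0] + "***@" + domain

assert pseudonymize("user-123") == pseudonymize("user-123")  # stable mapping
```

For example, mask_email("alice@example.com") yields "a***@example.com". The re-identification risk assessment below still applies: deterministic pseudonyms can be linked across datasets, which may need additional protections.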

Mandatory Actions:

  • Assess need for anonymization/pseudonymization
  • Select appropriate technique
  • Implement technique
  • Validate effectiveness
  • Document approach
  • Assess re-identification risk

Re-identification Risk Assessment:

  • Assess risk of re-identifying individuals
  • Consider available auxiliary information
  • Evaluate attack scenarios
  • Document risk level
  • Implement additional protections if needed

Evidence Required:

  • Anonymization/Pseudonymization Plan (PLAN-AI-DATA-002)
  • Implementation records
  • Effectiveness validation
  • Re-identification risk assessment
  • Privacy protection verification

Audit Verification:

  • Verify privacy protection technique selected
  • Confirm technique implemented
  • Check effectiveness validated
  • Validate re-identification risk assessed
  • Verify approach documented

SUPPORTING PROCEDURES

This standard is implemented through the following detailed procedures:

Procedure PROC-AI-DATA-001: Data Quality Assessment Procedure

Purpose: Define step-by-step process for assessing and ensuring data quality
Owner: Chief Data Officer
Implements: Controls DATA-001, DATA-002, DATA-003

Procedure Steps:

  1. Define data quality requirements - Control DATA-001
  2. Profile datasets
  3. Calculate quality metrics - Control DATA-002
  4. Compare to thresholds
  5. Document assessment results
  6. Remediate quality issues
  7. Obtain quality approval
  8. Set up continuous monitoring - Control DATA-003

Outputs:

  • Data Quality Requirements Document
  • Data Quality Assessment Report
  • Quality approval records
  • Monitoring configuration

Procedure PROC-AI-DATA-002: Bias Detection and Mitigation Procedure

Purpose: Define process for detecting and mitigating data bias
Owner: Chief Data Officer
Implements: Controls DATA-007, DATA-008, DATA-009

Procedure Steps:

  1. Conduct bias examination - Control DATA-007
  2. Document bias findings
  3. Select mitigation strategies - Control DATA-008
  4. Implement mitigation measures
  5. Validate mitigation effectiveness
  6. Calculate fairness metrics - Control DATA-009
  7. Verify fairness thresholds met
  8. Obtain fairness approval

Outputs:

  • Bias Assessment Report
  • Bias Mitigation Plan
  • Fairness Validation Report
  • Approval records

Procedure PROC-AI-DATA-003: Data Lineage Documentation Procedure

Purpose: Define process for documenting data lineage
Owner: Chief Data Officer
Implements: Controls DATA-010, DATA-011, DATA-012

Procedure Steps:

  1. Document data source - Control DATA-011
  2. Document collection method
  3. Track all transformations - Control DATA-010
  4. Document quality checks
  5. Maintain version history
  6. Update data catalog - Control DATA-012
  7. Enable lineage traceability
  8. Generate lineage reports

Outputs:

  • Data Lineage Documentation
  • Data Catalog entries
  • Lineage reports

Procedure PROC-AI-DATA-004: Training Data Preparation Procedure

Purpose: Define process for preparing datasets for AI training
Owner: Chief Data Officer
Implements: Controls DATA-004, DATA-005, DATA-006, DATA-013, DATA-014, DATA-015

Procedure Steps:

  1. Assess data relevance - Control DATA-004
  2. Analyze representativeness - Control DATA-005
  3. Evaluate dataset appropriateness - Control DATA-006
  4. Conduct privacy impact assessment - Control DATA-013
  5. Implement data minimization - Control DATA-014
  6. Apply privacy protection techniques - Control DATA-015
  7. Obtain all required approvals
  8. Prepare dataset for training

Outputs:

  • Data Relevance Assessment
  • Representativeness Analysis
  • Dataset Appropriateness Justification
  • DPIA
  • Privacy protection verification
  • Training dataset ready

COMPLIANCE

5.1 Compliance Monitoring

Monitoring Approach: Continuous automated monitoring supplemented by monthly manual reviews and quarterly comprehensive audits.

Compliance Metrics:

Metric | Target | Measurement Method | Frequency | Owner
Data Quality Requirements Coverage | 100% | % of AI systems with documented quality requirements | Monthly | Chief Data Officer
Data Quality Threshold Compliance | ≥95% | % of datasets meeting quality thresholds | Monthly | Chief Data Officer
Bias Assessment Coverage | 100% | % of high-risk AI with bias assessment | Quarterly | Chief Data Officer
Fairness Metric Compliance | 100% | % of AI systems meeting fairness thresholds | Quarterly | Chief Data Officer
Lineage Completeness | 100% | % of datasets with complete lineage | Monthly | Chief Data Officer
DPIA Completion | 100% | % of required DPIAs completed | Quarterly | Data Protection Officer
Data Minimization Compliance | 100% | % of AI systems with minimization assessment | Quarterly | Data Protection Officer
Privacy Protection Implementation | 100% | % of personal data with privacy protection | Monthly | Data Protection Officer

Monitoring Tools:

  • Data Quality Dashboard
  • Bias Detection Dashboard
  • Data Catalog
  • Privacy Compliance Dashboard
  • Monthly compliance reports
  • Quarterly AI Governance Committee reviews

5.2 Internal Audit Requirements

Audit Frequency: Annually (minimum)

Audit Scope:

  • Data quality requirements completeness
  • Rigor of data quality assessments
  • Bias assessment completeness and quality
  • Fairness validation accuracy
  • Data lineage completeness
  • Data provenance verification
  • Data catalog completeness
  • DPIA completeness and quality
  • Data minimization implementation
  • Privacy protection effectiveness
  • Controls effectiveness (DATA-001 through DATA-015)

Audit Activities:

  • Review 100% of data quality requirements
  • Sample 20% of datasets for detailed quality review
  • Review 100% of bias assessments
  • Test fairness metric calculations
  • Verify lineage completeness
  • Check provenance records
  • Review data catalog completeness
  • Test privacy protection effectiveness
  • Interview key personnel

Audit Outputs:

  • Annual Data Governance Audit Report
  • Findings and recommendations
  • Corrective action plans for deficiencies

5.3 External Audit / Regulatory Inspection

Preparation:

  • Maintain audit-ready data governance documentation at all times
  • Designate Chief Data Officer and Data Protection Officer as regulatory liaisons
  • Prepare standard response procedures for authority requests

Provide to Auditors/Regulators:

  • Data Quality Requirements Documents
  • Data Quality Assessment Reports
  • Bias Assessment Reports
  • Fairness Validation Reports
  • Data Lineage Documentation
  • Data Provenance Records
  • Data Catalog
  • DPIAs
  • Data minimization assessments
  • Privacy protection documentation
  • Data governance procedures
  • Internal audit reports
  • Evidence of controls execution

Authority Request Response:

  • Acknowledge request within 1 business day
  • Provide requested documentation within 5 business days
  • Coordinate through Legal, Chief Data Officer, and Data Protection Officer
  • Document all interactions with authorities

ROLES AND RESPONSIBILITIES

6.1 RACI Matrix

| Activity | Chief Data Officer | Data Scientist | Data Engineer | Data Protection Officer | AI System Owner | AI Risk Manager | Legal |
|---|---|---|---|---|---|---|---|
| Data Quality Requirements | R/A | C | C | I | A | I | I |
| Data Quality Assessment | R | R | C | I | A | I | I |
| Data Quality Monitoring | R | C | R | I | A | I | I |
| Data Relevance Assessment | R | R | I | I | A | C | I |
| Representativeness Analysis | R | R | I | I | A | R | I |
| Dataset Appropriateness | R | R | I | C | A | I | C |
| Bias Examination | R | R | I | I | A | R | I |
| Bias Mitigation | R | R | C | I | A | C | I |
| Fairness Validation | R | R | I | I | A | R/A | I |
| Data Lineage Documentation | R | C | R | I | A | I | I |
| Data Provenance | R | I | I | R | A | I | R |
| Data Catalog Management | R/A | C | C | I | I | I | I |
| DPIA | C | I | I | R/A | A | C | C |
| Data Minimization | R | C | C | R | A | I | C |
| Privacy Protection | R | C | C | R/A | A | I | C |

RACI Legend:

  • R = Responsible (does the work)
  • A = Accountable (ultimately answerable)
  • C = Consulted (provides input)
  • I = Informed (kept up-to-date)

6.2 Role Descriptions

Chief Data Officer

  • Primary Responsibility: Oversees data governance framework, ensures compliance
  • Key Activities:
    • Establishes data governance framework
    • Maintains data catalog
    • Coordinates data quality management
    • Oversees bias detection and mitigation
    • Reports data governance metrics
  • Required Competencies: Data governance expertise, EU AI Act Article 10, GDPR, data quality management

Data Scientist

  • Primary Responsibility: Assesses data quality, bias, representativeness; prepares datasets
  • Key Activities:
    • Conducts data quality assessments
    • Performs bias examinations
    • Analyzes representativeness
    • Implements bias mitigation
    • Calculates fairness metrics
  • Required Competencies: Data science, statistics, bias detection, fairness metrics

Data Engineer

  • Primary Responsibility: Implements data pipelines; maintains lineage
  • Key Activities:
    • Implements data quality checks
    • Maintains data lineage
    • Updates data catalog
    • Implements privacy protection techniques
  • Required Competencies: Data engineering, ETL, data pipeline management

Data Protection Officer (DPO)

  • Primary Responsibility: Ensures GDPR compliance; conducts DPIAs
  • Key Activities:
    • Conducts DPIAs
    • Verifies data provenance
    • Ensures data minimization
    • Validates privacy protection
    • Manages consent
  • Required Competencies: GDPR expertise, privacy impact assessment, data protection

AI System Owner

  • Primary Responsibility: Accountable for data governance of their AI system
  • Key Activities:
    • Defines data requirements
    • Approves datasets
    • Approves data quality requirements
    • Participates in data assessments
  • Required Competencies: AI system knowledge, data requirements definition

AI Risk Manager

  • Primary Responsibility: Assesses bias risks; validates fairness
  • Key Activities:
    • Assesses bias risks
    • Validates fairness metrics
    • Reviews representativeness
    • Integrates with risk management
  • Required Competencies: Risk management, bias assessment, fairness validation

Legal

  • Primary Responsibility: Advises on legal compliance; reviews data agreements
  • Key Activities:
    • Reviews data licensing
    • Verifies legal basis
    • Reviews data sharing agreements
    • Advises on GDPR compliance
  • Required Competencies: GDPR legal expertise, data licensing, contract law

EXCEPTIONS

7.1 Exception Philosophy

Data governance is a critical regulatory compliance activity for high-risk AI systems. Exceptions are granted restrictively and only where compensating controls adequately mitigate risks.


7.2 Allowed Exceptions

The following exceptions may be granted with proper justification and approval:

| Exception Type | Justification Required | Maximum Duration | Approval Authority | Compensating Controls |
|---|---|---|---|---|
| Simplified Quality Assessment (Minimal-Risk AI) | AI system clearly minimal-risk; no significant quality concerns | Permanent | Chief Data Officer | Document simplified rationale; Annual re-confirmation |
| Extended Lineage Documentation Timeline | Technical constraints prevent immediate lineage documentation | 30 days | Chief Data Officer | Interim documentation; Accelerated implementation plan |
| Deferred Bias Assessment (Non-Personal Data) | Dataset contains no personal data; no protected characteristics | Until personal data added | Chief Data Officer + AI Risk Manager | Document rationale; Re-assess if personal data added |

7.3 Prohibited Exceptions

The following exceptions cannot be granted under any circumstances:

  • Skipping data quality assessment for high-risk AI - Mandatory per Article 10(2); no exceptions
  • Skipping bias assessment for high-risk AI - Mandatory per Article 10(2)(f) and (g); no exceptions
  • Using datasets below quality thresholds - Creates compliance gaps and performance risk
  • Skipping DPIA for high-risk AI processing personal data - Mandatory per GDPR Article 35
  • Operating without data lineage - Required for traceability and compliance


7.4 Exception Request Process

Step 1: Submit Exception Request

  • Complete Exception Request Form (FORM-AI-EXCEPTION-001)
  • Include business justification
  • Propose compensating controls
  • Specify duration requested
  • Attach risk assessment

Step 2: Risk Assessment

  • Chief Data Officer assesses risk of granting exception
  • Evaluates adequacy of compensating controls
  • Documents residual risk

Step 3: Approval

  • Route to appropriate approval authority based on exception type
  • Chief Data Officer approval: Minor exceptions
  • Chief Data Officer + AI Risk Manager: Significant exceptions
  • AI Governance Committee: Critical exceptions

Step 4: Documentation and Monitoring

  • Document exception in Exception Register
  • Assign exception owner
  • Set review date
  • Monitor compensating controls
  • Report exceptions quarterly to AI Governance Committee

Step 5: Exception Review and Closure

  • Review exception at specified review date
  • Assess if exception still needed
  • Close exception when normal data governance completed
  • Document lessons learned

ENFORCEMENT

8.1 Non-Compliance Consequences

| Violation | Severity | Consequence | Remediation Required |
|---|---|---|---|
| High-risk AI without data quality assessment | Critical | Immediate suspension until assessment completed | Complete assessment within 10 business days; Root cause analysis |
| High-risk AI without bias assessment | Critical | Immediate suspension until assessment completed | Complete bias assessment within 10 business days |
| Using datasets below quality thresholds | High | Halt training until quality issues resolved | Remediate quality issues; Re-assess; Re-approve |
| Missing DPIA for high-risk AI | Critical | Immediate suspension; GDPR violation | Complete DPIA within 15 business days; Implement privacy controls |
| Missing data lineage | Medium | Written warning; Escalation to management | Complete lineage documentation within 10 business days |
| Fairness thresholds not met | High | Block deployment until fairness achieved | Implement additional mitigation; Re-validate fairness |
| Missing data minimization assessment | Medium | Written warning | Complete assessment within 10 business days |

8.2 Escalation Procedures

Level 1: Chief Data Officer

  • Minor procedural violations
  • Documentation deficiencies
  • Timeline delays < 5 days
  • Action: Written warning, corrective action required

Level 2: Chief Data Officer + AI Risk Manager

  • Repeated violations
  • Missing bias assessments
  • Quality threshold breaches
  • Action: Formal review, corrective action plan, management notification

Level 3: AI Governance Committee

  • High-risk AI without data quality/bias assessment
  • Missing DPIA
  • GDPR violations
  • Action: Immediate AI system suspension, investigation, disciplinary action

Level 4: Executive Management + Legal + DPO

  • Potential regulatory enforcement action
  • Significant GDPR violations
  • Reputational risk
  • Action: Executive crisis management, legal strategy, regulatory engagement

8.3 Immediate Escalation Triggers

Escalate immediately to AI Governance Committee + Legal + DPO if:

  • ⚠️ High-risk AI system operating without data quality/bias assessment
  • ⚠️ GDPR violation identified (missing DPIA, unauthorized processing)
  • ⚠️ Data breach or privacy incident
  • ⚠️ Regulatory inquiry or inspection related to data governance
  • ⚠️ Significant fairness violations identified in production

8.4 Disciplinary Actions

Individuals responsible for data governance violations may be subject to:

  • Verbal or written warning
  • Mandatory retraining
  • Performance improvement plan
  • Reassignment of responsibilities
  • Suspension (with pay during investigation)
  • Termination (for egregious violations, e.g., knowingly using biased data)

Factors Considered:

  • Intent (knowing violation vs. honest mistake)
  • Severity of violation
  • Impact (actual or potential)
  • Cooperation with remediation
  • Prior violation history

KEY PERFORMANCE INDICATORS (KPIs)

9.1 Data Governance KPIs

| KPI ID | KPI Name | Definition | Target | Measurement Method | Frequency | Owner | Reporting To |
|---|---|---|---|---|---|---|---|
| KPI-DATA-001 | Data Quality Requirements Coverage | % of AI systems with documented quality requirements | 100% | (# AI systems with requirements / # total AI systems) × 100 | Monthly | Chief Data Officer | AI Governance Committee |
| KPI-DATA-002 | Data Quality Threshold Compliance | % of datasets meeting quality thresholds | ≥95% | (# datasets meeting thresholds / # total datasets) × 100 | Monthly | Chief Data Officer | Management |
| KPI-DATA-003 | Bias Assessment Coverage | % of high-risk AI with bias assessment | 100% | (# high-risk AI with bias assessment / # high-risk AI) × 100 | Quarterly | Chief Data Officer | AI Governance Committee |
| KPI-DATA-004 | Fairness Metric Compliance | % of AI systems meeting fairness thresholds | 100% | (# AI systems meeting thresholds / # total AI systems) × 100 | Quarterly | Chief Data Officer | AI Governance Committee |
| KPI-DATA-005 | Lineage Completeness | % of datasets with complete lineage | 100% | (# datasets with complete lineage / # total datasets) × 100 | Monthly | Chief Data Officer | Management |
| KPI-DATA-006 | DPIA Completion | % of required DPIAs completed | 100% | (# DPIAs completed / # required DPIAs) × 100 | Quarterly | Data Protection Officer | AI Governance Committee |
| KPI-DATA-007 | Data Minimization Compliance | % of AI systems with minimization assessment | 100% | (# AI systems with assessment / # total AI systems) × 100 | Quarterly | Data Protection Officer | AI Governance Committee |
| KPI-DATA-008 | Privacy Protection Coverage | % of personal data with privacy protection | 100% | (# datasets with protection / # datasets with personal data) × 100 | Monthly | Data Protection Officer | Management |
| KPI-DATA-009 | Data Catalog Completeness | % of datasets in catalog with all mandatory fields | 100% | (# complete catalog entries / # total datasets) × 100 | Monthly | Chief Data Officer | Management |
| KPI-DATA-010 | Quality Issue Resolution Time | Average days to resolve quality issues | < 5 days | Σ (resolution date − issue date) / # issues | Monthly | Chief Data Officer | Management |
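Each coverage-style KPI in the table reduces to the same simple ratio. A sketch for KPI-DATA-002, where the dataset records and field name are illustrative assumptions:

```python
# Sketch of KPI-DATA-002 (Data Quality Threshold Compliance):
# (# datasets meeting thresholds / # total datasets) × 100.
# The records and the "meets_quality_thresholds" field are illustrative.

def kpi_data_002(datasets):
    if not datasets:
        return 0.0  # no datasets yet: report 0 rather than divide by zero
    meeting = sum(1 for d in datasets if d["meets_quality_thresholds"])
    return meeting / len(datasets) * 100

datasets = [
    {"id": "DS-001", "meets_quality_thresholds": True},
    {"id": "DS-002", "meets_quality_thresholds": True},
    {"id": "DS-003", "meets_quality_thresholds": False},
]
value = kpi_data_002(datasets)
print(f"KPI-DATA-002 = {value:.1f}% (target: >= 95%)")
```

The remaining coverage KPIs (001, 003-009) follow the same pattern with a different numerator predicate and denominator population; KPI-DATA-010 is instead an average of resolution durations.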

9.2 KPI Dashboards and Reporting

Real-Time Dashboard (Chief Data Officer access)

  • Current data quality scores
  • Bias assessment status
  • Fairness metrics
  • Lineage completeness
  • DPIA status
  • Privacy protection coverage

Monthly Management Report

  • KPI-DATA-001, 002, 005, 008, 009, 010
  • Trend analysis (vs. previous month)
  • Issues and risks
  • Planned actions

Quarterly AI Governance Committee Report

  • All KPIs
  • Bias assessment completion status
  • Fairness compliance status
  • DPIA completion status
  • Internal audit findings (if conducted)
  • Exception register review

Annual Executive Report

  • Full-year KPI performance
  • Data governance maturity assessment
  • Strategic recommendations
  • Regulatory outlook

9.3 KPI Thresholds and Alerts

| KPI | Green (Good) | Yellow (Warning) | Red (Critical) | Alert Action |
|---|---|---|---|---|
| Data Quality Requirements Coverage | 100% | 95-99% | < 95% | Red: Immediate escalation to AI Governance Committee Chair |
| Data Quality Threshold Compliance | ≥95% | 90-94% | < 90% | Red: Escalate to AI Governance Committee |
| Bias Assessment Coverage | 100% | 90-99% | < 90% | Red: Halt high-risk AI deployments until assessed |
| Fairness Metric Compliance | 100% | 90-99% | < 90% | Red: Block deployments until fairness achieved |
| DPIA Completion | 100% | 90-99% | < 90% | Red: Immediate escalation to DPO + Legal |
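The banding above can be encoded as a small classifier for automated alerting. This sketch uses the Bias Assessment Coverage bands (green at 100%, yellow 90-99%, red below 90%); the function name and defaults are illustrative assumptions.

```python
# Sketch: classify a coverage-style KPI value into the Green/Yellow/Red
# bands from the table above. Defaults match Bias Assessment Coverage
# (100% green, 90-99% yellow, < 90% red); other KPIs pass different bounds.

def rag_status(value_pct, green_at=100.0, yellow_floor=90.0):
    if value_pct >= green_at:
        return "green"
    if value_pct >= yellow_floor:
        return "yellow"
    return "red"

# Data Quality Threshold Compliance uses different bounds (>=95 green, 90-94 yellow)
assert rag_status(96.0, green_at=95.0) == "green"
print(rag_status(85.0))  # red band: triggers the alert action in the table
```

A dashboard would evaluate `rag_status` for each KPI on its reporting cadence and raise the corresponding alert action whenever a value enters the red band.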

TRAINING REQUIREMENTS

10.1 Training Program Overview

All personnel involved in AI data governance must complete role-specific training to ensure competency in data quality management, bias detection, GDPR compliance, and data governance procedures.


10.2 Role-Based Training Requirements

| Role | Training Course | Duration | Content | Frequency | Assessment Required |
|---|---|---|---|---|---|
| Chief Data Officer | Data Governance Expert Training | 20 hours | EU AI Act Article 10; Data quality management; Bias detection; GDPR; Data lineage; Data catalog management | Initial + annually | Yes - Written exam (≥90%) + Practical data governance exercise |
| Data Scientists | Data Quality and Bias Assessment | 16 hours | Data quality dimensions; Bias types; Fairness metrics; Bias mitigation; Representativeness analysis | Initial + annually | Yes - Practical bias assessment exercise |
| Data Engineers | Data Lineage and Pipeline Management | 12 hours | Data lineage documentation; Data catalog; Privacy protection techniques; Pipeline implementation | Initial + annually | Yes - Practical lineage documentation exercise |
| Data Protection Officer | GDPR and Privacy for AI Data | 16 hours | GDPR Article 35 (DPIA); Data minimization; Privacy protection techniques; Consent management; Legal basis | Initial + annually | Yes - Written exam (≥90%) |
| AI System Owners | Data Governance Overview | 6 hours | Data quality requirements; Bias awareness; Data relevance; Responsibilities | At onboarding + annually | Yes - Knowledge check (≥80%) |
| All AI Development Staff | Data Quality Awareness | 2 hours | Data quality basics; Bias awareness; Data minimization | At onboarding + annually | Yes - Knowledge check (≥80%) |

10.3 Training Content by Topic

Data Quality Management

  • Data quality dimensions (accuracy, completeness, consistency, etc.)
  • Quality requirements definition
  • Quality assessment methods
  • Quality monitoring
  • Quality remediation

Bias Detection and Mitigation

  • Bias types (historical, representation, measurement, etc.)
  • Protected characteristics
  • Bias detection methods
  • Bias mitigation strategies
  • Fairness metrics

Data Relevance and Representativeness

  • Relevance criteria
  • Representativeness metrics
  • Protected characteristics
  • Underrepresentation identification
  • Dataset appropriateness evaluation

Data Lineage and Provenance

  • Lineage documentation
  • Provenance verification
  • Data catalog management
  • Traceability

GDPR and Privacy

  • DPIA requirements
  • Data minimization
  • Privacy protection techniques
  • Consent management
  • Legal basis

10.4 Training Delivery Methods

Initial Training:

  • Instructor-led classroom or virtual training
  • Includes interactive exercises and case studies
  • Hands-on practice with data governance tools
  • Group discussions of complex scenarios

Annual Refresher:

  • E-learning modules for core content review
  • Live update sessions for regulatory changes
  • Case study reviews of recent data governance activities
  • Knowledge assessment

On-the-Job Training:

  • Mentoring for new data governance staff
  • Job shadowing during data assessments
  • Supervised data governance activities for first 5 AI systems

Just-in-Time Training:

  • Quick reference guides and job aids
  • Video tutorials on specific topics
  • Help desk support from experienced data governance staff

10.5 Training Effectiveness Measurement

Assessment Methods:

  • Written exams for knowledge retention
  • Practical exercises for skill application
  • On-the-job observations for competency validation
  • Feedback surveys for training quality

Competency Validation:

  • Chief Data Officers: Must correctly assess 5 sample datasets with 100% accuracy before independent assessment
  • Data Scientists: Must demonstrate understanding of bias detection and fairness metrics
  • All staff: Must pass knowledge assessments with minimum required scores

Training Metrics:

| Metric | Target | Frequency |
|---|---|---|
| Training completion rate | 100% | Quarterly |
| Assessment pass rate (first attempt) | ≥ 90% | Per training |
| Training effectiveness score (survey) | ≥ 4.0/5.0 | Per training |
| Time to competency (Chief Data Officers) | < 45 days | Per person |

10.6 Training Records

Records Maintained:

  • Training attendance records
  • Assessment scores
  • Competency validations
  • Refresher training completion
  • Individual training transcripts

Retention: 10 years (to align with EU AI Act documentation retention)

Access: HR, Chief Data Officer, Internal Audit, Competent Authorities (upon request)


DEFINITIONS

| Term | Definition | Source |
|---|---|---|
| Training Data | Data used to train an AI model to learn patterns | This Standard |
| Validation Data | Data used to tune model hyperparameters during training | This Standard |
| Testing Data | Data used to evaluate final model performance | This Standard |
| Data Quality | Degree to which data meets requirements for intended use | ISO/IEC 5259 series |
| Representativeness | Extent to which data reflects real-world population/scenarios | EU AI Act Article 10(3) |
| Bias | Systematic error or unfairness in data or model outputs | EU AI Act Article 10(2)(f) and (g) |
| Data Lineage | Complete history of data from origin to current state | This Standard |
| Special Category Data | Personal data revealing racial/ethnic origin, political opinions, religious beliefs, health, sex life, etc. | GDPR Article 9 |
| Data Provenance | Information about the origin, ownership, and history of data | This Standard |
| Data Minimization | Principle of collecting and processing only data necessary for purpose | GDPR Article 5(1)(c) |
| Anonymization | Process of removing all identifiers from data permanently | GDPR |
| Pseudonymization | Process of replacing identifiers with pseudonyms (reversible with key) | GDPR Article 4(5) |
| Fairness Metric | Quantitative measure of fairness across protected characteristics | This Standard |
| Demographic Parity | Equal positive prediction rate across groups | This Standard |
| Equal Opportunity | Equal true positive rate across groups | This Standard |
| DPIA | Data Protection Impact Assessment per GDPR Article 35 | GDPR Article 35 |
| Data Catalog | Centralized repository of metadata about all datasets | This Standard |

LINK WITH AI ACT AND ISO42001

12.1 EU AI Act Regulatory Mapping

This standard implements the following EU AI Act requirements:

| EU AI Act Provision | Article | Requirement Summary | Implemented By (Controls) |
|---|---|---|---|
| Data and Data Governance | Article 10 | Requirements for training, validation, testing data | All controls (DATA-001 through DATA-015) |
| Training, Validation, Testing Data Quality | Article 10(2) | Quality requirements for datasets | DATA-001, DATA-002, DATA-003 |
| Data Relevance | Article 10(3) | Datasets relevant and representative | DATA-004, DATA-005, DATA-006 |
| Data Examination for Bias | Article 10(2)(f) and (g) | Examine datasets for bias and apply measures to detect, prevent and mitigate biases | DATA-007, DATA-008, DATA-009 |
| Contextual Relevance | Article 10(4) | Datasets must account for characteristics particular to the specific geographical, contextual, behavioural or functional setting | DATA-004 |
| Special Category Data Processing | Article 10(5) | Requirements for processing special category data | DATA-013, DATA-014, DATA-015 |

12.2 GDPR Alignment

This standard aligns with GDPR requirements:

| GDPR Provision | Requirement | Implementation in This Standard |
|---|---|---|
| Article 5(1)(c): Data Minimization | Collect only necessary data | DATA-014 |
| Article 9: Special Category Data | Requirements for processing special category data | DATA-013, DATA-015 |
| Article 25: Data Protection by Design | Privacy protection techniques | DATA-015 |
| Article 35: DPIA | Data Protection Impact Assessment | DATA-013 |
| Article 30: Records of Processing | Documentation of data processing | DATA-010, DATA-012 |

12.3 ISO/IEC 42001:2023 Alignment

This standard aligns with ISO/IEC 42001:2023 as follows:

| ISO 42001 Clause | Requirement | Implementation in This Standard |
|---|---|---|
| Clause 6.1.2: AI system impact assessment | Assess data-related impacts | DATA-004, DATA-005, DATA-007 |
| Clause 7.5: Documented information | Maintain data documentation | DATA-010, DATA-012 |
| Clause 8.2: AI system risk assessment | Data-related risk assessment | DATA-007, DATA-008, DATA-009 |
| Clause 9.1: Monitoring and measurement | Monitor data quality | DATA-003 |

12.4 ISO/IEC 5259 Series Alignment

This standard aligns with ISO/IEC 5259 series (Data Quality for AI) as follows:

| ISO 5259 Standard | Requirement | Implementation in This Standard |
|---|---|---|
| ISO/IEC 5259-1: Overview | Data quality framework | DATA-001, DATA-002 |
| ISO/IEC 5259-2: Data quality measures | Quality metrics and measurement | DATA-002, DATA-003 |
| ISO/IEC 5259-3: Data quality management | Quality management process | DATA-001, DATA-002, DATA-003 |

12.5 Relationship to Other Standards

This data governance standard integrates with other AI Act standards:

| Related Standard | Integration Point | Rationale |
|---|---|---|
| STD-AI-002: Risk Management | Bias risk assessment (RM-006) uses data governance outputs | Data bias analysis feeds into risk assessment |
| STD-AI-004: Technical Documentation | Data requirements documented in technical documentation | Data governance outputs feed into Annex IV documentation |
| STD-AI-006: Transparency | Data limitations communicated to users | Data quality and bias limitations in transparency notices |

12.6 References and Related Documents

EU AI Act (Regulation (EU) 2024/1689):

  • Article 10: Data and Data Governance
  • Article 10(2): Training, Validation, Testing Data
  • Article 10(3): Data Relevance
  • Article 10(2)(f) and (g): Data Examination for Bias
  • Article 10(4): Contextual Relevance of Datasets
  • Article 10(5): Exceptional Processing of Special Category Data for Bias Correction

GDPR (Regulation (EU) 2016/679):

  • Article 5(1)(c): Data Minimization
  • Article 9: Special Category Data
  • Article 25: Data Protection by Design
  • Article 35: Data Protection Impact Assessment

ISO/IEC Standards:

  • ISO/IEC 42001:2023: Information technology — Artificial intelligence — Management system
  • ISO/IEC 5259 series: Information technology — Data quality for AI and machine learning
  • ISO/IEC 23894:2023: Information technology — Artificial intelligence — Guidance on risk management

Internal Documents:

  • POL-AI-001: Artificial Intelligence Policy (parent policy)
  • STD-AI-002: AI Risk Management Standard
  • PROC-AI-DATA-001 through PROC-AI-DATA-004: Data governance procedures

APPROVAL AND AUTHORIZATION

| Role | Name | Title | Signature | Date |
|---|---|---|---|---|
| Prepared By | Emily White | Chief Data Officer | _________________________ | |
| Reviewed By | Michael Brown | Chief Legal Officer | _________________________ | |
| Reviewed By | Lisa Anderson | Data Protection Officer | _________________________ | |
| Reviewed By | Jane Doe | Chief Strategy & Risk Officer | _________________________ | |
| Approved By | Jane Doe | AI Governance Committee Chair | _________________________ | |

Effective Date: 2025-08-01
Next Review Date: 2026-08-01
Review Frequency: Annually or upon regulatory change


END OF STANDARD STD-AI-003


This standard is a living document. Feedback and improvement suggestions should be directed to the Chief Data Officer.

Standard Details

  • Standard ID: STD-AI-003
  • Version: 1.0
  • Status: Draft
  • Owner: Chief Data Officer
  • Effective Date: 2025-08-01
  • Applicability: All AI systems; mandatory for high-risk
  • EU AI Act References: Article 10
  • ISO 42001 Mapping: Clause 7.5