AI Data Governance Standard
Ensures AI training, validation, and testing datasets meet quality, relevance, and representativeness standards.
3
Controls
0
Compliant
3
In Progress
0
Not Started
AI Data Governance Standard
Document Type: Standard
Standard ID: STD-AI-003
Standard Title: AI Data Governance Standard
Version: 1.0
Effective Date: 2025-08-01
Next Review Date: 2026-08-01
Review Frequency: Annually or upon regulatory change
Parent Policy: POL-AI-001 - Artificial Intelligence Policy
Owner: Chief Data Officer
Approved By: AI Governance Committee Chair
Status: Draft
Classification: Internal Use Only
TABLE OF CONTENTS
- Document History
- Objective
- Scope and Applicability
- Control Standard
- Supporting Procedures
- Compliance
- Roles and Responsibilities
- Exceptions
- Enforcement
- Key Performance Indicators (KPIs)
- Training Requirements
- Definitions
- Link with AI Act and ISO42001
DOCUMENT HISTORY
| Version | Date | Author | Changes | Approval Date | Approved By |
|---|---|---|---|---|---|
| 0.1 | 2025-06-20 | Emily White, Chief Data Officer | Initial draft | - | - |
| 0.2 | 2025-07-05 | Emily White, Chief Data Officer | Added bias mitigation details | - | - |
| 0.3 | 2025-07-22 | Emily White, Chief Data Officer | Incorporated DPO feedback | - | - |
| 1.0 | 2025-08-01 | Emily White, Chief Data Officer | Final version approved - GRC restructured | 2025-07-25 | Jane Doe, AI Governance Committee Chair |
OBJECTIVE
This standard defines requirements for AI training, validation, and testing datasets to ensure they meet quality, relevance, representativeness, and bias mitigation standards in compliance with EU AI Act Article 10.
Primary Goals:
- Ensure all AI datasets meet quality standards appropriate for intended purpose
- Ensure datasets are relevant, representative, and appropriate for intended purpose
- Detect and mitigate biases in datasets to prevent discriminatory outcomes
- Establish comprehensive data governance with full lineage tracking
- Ensure AI data governance complies with GDPR and privacy requirements
SCOPE AND APPLICABILITY
2.1 Mandatory Applicability
This standard is mandatory for:
- All high-risk AI systems (EU AI Act Article 10)
- All data used for AI system development, training, validation, testing, and operation
2.2 Recommended Applicability
This standard is recommended for:
- All AI systems using training data
- Limited-risk and minimal-risk AI systems (voluntary best practices)
2.3 Data Types Covered
- Training datasets
- Validation datasets
- Testing datasets
- Input data (operational)
- Synthetic data
- Third-party datasets
- Open-source datasets
2.4 Out of Scope
- General enterprise data governance (covered by enterprise data governance framework)
- Non-AI system data (covered by other data governance standards)
- Data outside EU AI Act scope
CONTROL STANDARD
Control DATA-001: Data Quality Requirements Definition
Control ID: DATA-001
Control Name: Data Quality Requirements Definition
Control Type: Preventive
Control Frequency: Per AI system, annually
Risk Level: High
Control Objective
Define specific data quality requirements for each AI system based on intended purpose and risk level to ensure datasets meet appropriate quality standards before use in AI training, validation, and testing, in compliance with EU AI Act Article 10(2).
Control Requirements
CR-001.1: Data Quality Requirements Documentation
Define and document data quality requirements for each AI system with specific thresholds based on risk level.
Data Quality Dimensions:
| Dimension | Definition | High-Risk AI Requirement | Limited/Minimal-Risk AI Requirement |
|---|---|---|---|
| Accuracy | Correctness of data values | ≥95% | ≥90% |
| Completeness | Presence of all required data | ≥98% | ≥95% |
| Consistency | Uniformity across datasets | 100% for critical fields | ≥95% for critical fields |
| Timeliness | Data currency and freshness | Defined per use case | Defined per use case |
| Validity | Conformance to defined formats | 100% | ≥98% |
| Uniqueness | No unintended duplicates | ≥99% | ≥95% |
Mandatory Actions:
- Define quality requirements per AI system
- Document quality thresholds in Data Quality Requirements Document
- Obtain AI System Owner approval
- Review and update annually
- Align with intended purpose and risk level
Evidence Required:
- Data Quality Requirements Document (DOC-AI-DATA-001)
- Quality thresholds by AI system
- Approval records
- Annual review records
Audit Verification:
- Verify quality requirements defined for all AI systems
- Confirm thresholds appropriate for risk level
- Check approval obtained
- Validate annual review completed
Control DATA-002: Data Quality Assessment
Control ID: DATA-002
Control Name: Data Quality Assessment and Validation
Control Type: Preventive
Control Frequency: Before training, after significant data updates
Risk Level: High
Control Objective
Assess data quality against defined requirements before use in AI training to ensure datasets meet quality thresholds and prevent quality issues from affecting AI system performance and compliance.
Control Requirements
CR-002.1: Pre-Training Quality Assessment
Conduct comprehensive data quality assessment using multiple methods before datasets are used for AI training.
Assessment Methods:
- Automated data profiling
- Statistical analysis
- Manual sampling and review
- Data quality tools
Mandatory Actions:
- Profile datasets to assess all quality dimensions
- Calculate quality metrics for each dimension
- Compare against defined thresholds
- Document assessment results
- Remediate quality issues before use
- Obtain quality approval before training begins
Quality Assessment Checklist:
- Accuracy assessed (correctness of values)
- Completeness assessed (presence of required data)
- Consistency assessed (uniformity across datasets)
- Timeliness assessed (data currency)
- Validity assessed (format conformance)
- Uniqueness assessed (duplicate detection)
- All metrics meet thresholds
- Quality issues identified and remediated
- Quality approval obtained
Evidence Required:
- Data Quality Assessment Report (RPT-AI-DATA-001)
- Quality metrics dashboard
- Remediation records
- Quality approval records
Audit Verification:
- Verify quality assessment conducted before training
- Confirm all quality dimensions assessed
- Check metrics meet defined thresholds
- Validate quality issues remediated before use
- Verify quality approval obtained
Control DATA-003: Data Quality Monitoring
Control ID: DATA-003
Control Name: Continuous Data Quality Monitoring
Control Type: Detective
Control Frequency: Continuous
Risk Level: Medium
Control Objective
Continuously monitor data quality during AI system operation to detect quality degradation, identify quality issues early, and enable timely remediation to maintain AI system performance and compliance.
Control Requirements
CR-003.1: Operational Quality Monitoring
Implement automated quality monitoring with alerting and periodic reporting.
Mandatory Actions:
- Implement automated quality checks
- Monitor quality metrics in real-time
- Alert on quality threshold breaches
- Investigate quality issues promptly
- Remediate quality issues
- Report quality trends monthly
Quality Monitoring Metrics:
| Metric | Threshold | Alert Level | Frequency |
|---|---|---|---|
| Accuracy | < 95% (high-risk) / < 90% (other) | Warning | Daily |
| Completeness | < 98% (high-risk) / < 95% (other) | Warning | Daily |
| Consistency | < 100% (critical fields) | Critical | Real-time |
| Validity | < 100% | Critical | Real-time |
| Uniqueness | < 99% (high-risk) / < 95% (other) | Warning | Daily |
Evidence Required:
- Quality monitoring dashboard
- Alert logs
- Remediation records
- Monthly quality reports (RPT-AI-DATA-QM-XXX)
Audit Verification:
- Verify automated quality monitoring implemented
- Confirm alerting configured for threshold breaches
- Check quality issues investigated and remediated
- Validate monthly quality reports generated
Control DATA-004: Data Relevance Assessment
Control ID: DATA-004
Control Name: Data Relevance and Purpose Alignment Assessment
Control Type: Preventive
Control Frequency: Per AI system, annually
Risk Level: High
Control Objective
Assess and document that datasets are relevant to intended purpose and geographical/behavioral/functional setting to ensure AI systems are trained on appropriate data that reflects their actual deployment context, in compliance with EU AI Act Article 10(3) and 10(4).
Control Requirements
CR-004.1: Comprehensive Relevance Assessment
Conduct relevance assessment across all relevant dimensions.
Relevance Criteria:
| Criterion | Assessment Questions | Evidence Required |
|---|---|---|
| Purpose Alignment | Does data reflect intended use case? | Use case to data mapping |
| Geographical Relevance | Is data from relevant locations? | Geographic distribution analysis |
| Temporal Relevance | Is data from relevant time period? | Temporal distribution analysis |
| Behavioral Relevance | Does data reflect actual user behaviors? | Behavioral pattern analysis |
| Functional Relevance | Does data cover all system functions? | Function coverage analysis |
Mandatory Actions:
- Define intended purpose clearly
- Identify target population/scenarios
- Assess dataset coverage of target
- Document relevance justification
- Obtain stakeholder approval
- Review relevance annually
Evidence Required:
- Data Relevance Assessment (ASSESS-AI-DATA-001)
- Purpose-to-data mapping
- Stakeholder approval records
- Annual review records
Audit Verification:
- Verify relevance assessment conducted for all AI systems
- Confirm all relevance criteria assessed
- Check relevance justification documented
- Validate stakeholder approval obtained
Control DATA-005: Representativeness Analysis
Control ID: DATA-005
Control Name: Dataset Representativeness Assessment
Control Type: Preventive
Control Frequency: Before training, annually
Risk Level: High
Control Objective
Ensure datasets are sufficiently representative of all persons/situations AI system will encounter to prevent bias and ensure fair treatment across all user groups and scenarios, in compliance with EU AI Act Article 10(3).
Control Requirements
CR-005.1: Comprehensive Representativeness Analysis
Analyze dataset representativeness across all relevant subpopulations and protected characteristics.
Representativeness Metrics:
| Metric | Definition | Target |
|---|---|---|
| Demographic Parity | Equal representation across protected characteristics | Ratio 0.8-1.2 |
| Coverage | % of target population represented | ≥90% |
| Balance | Distribution similarity to real-world | Within 10% |
| Diversity | Variety of scenarios/cases included | Comprehensive |
Protected Characteristics to Assess:
- Gender
- Age groups
- Ethnicity/race (where legally permissible)
- Disability status
- Geographic location
- Socioeconomic status
Mandatory Actions:
- Identify all relevant subpopulations
- Analyze dataset distribution across subpopulations
- Calculate representativeness metrics
- Compare to target population distribution
- Address underrepresentation
- Document representativeness analysis
Evidence Required:
- Representativeness Analysis Report (RPT-AI-DATA-002)
- Distribution comparisons
- Gap analysis
- Mitigation plans (if underrepresentation identified)
Audit Verification:
- Verify representativeness analysis conducted before training
- Confirm all protected characteristics assessed
- Check representativeness metrics calculated
- Validate underrepresentation addressed
- Verify mitigation plans created if needed
Control DATA-006: Dataset Appropriateness Evaluation
Control ID: DATA-006
Control Name: Dataset Selection and Appropriateness Assessment
Control Type: Preventive
Control Frequency: Per AI system, when changing datasets
Risk Level: Medium
Control Objective
Ensure datasets are appropriate considering state of the art and available alternatives to ensure optimal dataset selection and justify dataset choices for AI system development.
Control Requirements
CR-006.1: Dataset Appropriateness Justification
Evaluate and justify dataset selection against available alternatives and state of the art.
Appropriateness Factors:
| Factor | Evaluation Criteria | Documentation Required |
|---|---|---|
| Quality vs. Alternatives | How does quality compare to other datasets? | Quality comparison analysis |
| Representativeness vs. Alternatives | How does representativeness compare? | Representativeness comparison |
| Cost vs. Value | Is cost justified by value? | Cost-benefit analysis |
| Privacy Implications | What are privacy risks? | Privacy impact assessment |
| Bias Risks | What are bias risks compared to alternatives? | Bias risk comparison |
| State of the Art | Does dataset align with industry best practices? | State of the art review |
Mandatory Actions:
- Research available datasets
- Compare dataset options
- Justify dataset selection
- Document why chosen dataset is appropriate
- Consider synthetic data alternatives
- Obtain approval for dataset selection
Evidence Required:
- Dataset Selection Justification (DOC-AI-DATA-002)
- Alternative analysis
- State of the art review
- Approval records
Audit Verification:
- Verify dataset appropriateness evaluated
- Confirm alternatives considered
- Check justification documented
- Validate approval obtained
Control DATA-007: Bias Examination
Control ID: DATA-007
Control Name: Bias Detection and Assessment
Control Type: Preventive
Control Frequency: Before training, after dataset updates
Risk Level: High
Control Objective
Examine training, validation, and testing datasets for possible biases to identify discriminatory patterns before they are learned by AI models, preventing bias propagation to AI system outputs, in compliance with EU AI Act Article 10(2)(f) and (g).
Control Requirements
CR-007.1: Comprehensive Bias Assessment
Conduct bias assessment across all bias types and protected characteristics.
Bias Types to Assess:
| Bias Type | Description | Detection Method |
|---|---|---|
| Historical Bias | Data reflects past discrimination | Compare to equitable baseline |
| Representation Bias | Underrepresentation of groups | Distribution analysis |
| Measurement Bias | Biased labels or features | Label quality review |
| Aggregation Bias | Inappropriate grouping | Subgroup analysis |
| Evaluation Bias | Biased test data | Test set representativeness |
Mandatory Actions:
- Conduct bias assessment before training
- Use statistical methods to detect bias
- Use bias detection tools (e.g., IBM AI Fairness 360, Aequitas)
- Document bias findings
- Assess bias impact on fairness
- Assess all protected characteristics
Evidence Required:
- Bias Assessment Report (RPT-AI-DATA-003)
- Bias metrics by protected characteristic
- Bias detection tool outputs
- Impact assessment
Audit Verification:
- Verify bias assessment conducted before training
- Confirm all bias types assessed
- Check bias detection tools used
- Validate bias findings documented
- Verify protected characteristics assessed
Control DATA-008: Bias Mitigation
Control ID: DATA-008
Control Name: Bias Mitigation Implementation
Control Type: Corrective
Control Frequency: After bias detection
Risk Level: High
Control Objective
Implement appropriate measures to mitigate detected biases in datasets to reduce discriminatory outcomes and improve fairness across all protected characteristics.
Control Requirements
CR-008.1: Bias Mitigation Strategy Selection and Implementation
Select and implement appropriate bias mitigation strategies based on bias type and context.
Mitigation Strategies:
| Strategy | When to Use | Implementation | Effectiveness Validation |
|---|---|---|---|
| Data Rebalancing | Representation bias | Oversample underrepresented groups | Post-mitigation distribution analysis |
| Data Augmentation | Insufficient diversity | Generate synthetic samples | Diversity metrics |
| Feature Engineering | Biased features | Remove or transform biased features | Feature importance analysis |
| Reweighting | Imbalanced classes | Assign weights to balance influence | Weighted performance metrics |
| Relabeling | Label bias | Review and correct biased labels | Label quality review |
| Fairness Constraints | Model-level bias | Apply fairness constraints during training | Fairness metrics validation |
Mandatory Actions:
- Select appropriate mitigation strategies
- Implement mitigation measures
- Validate mitigation effectiveness
- Document mitigation actions
- Monitor for residual bias
- Obtain approval for mitigation approach
Evidence Required:
- Bias Mitigation Plan (PLAN-AI-DATA-001)
- Mitigation implementation records
- Post-mitigation bias assessment
- Effectiveness validation records
- Approval records
Audit Verification:
- Verify mitigation strategies selected for all identified biases
- Confirm mitigation measures implemented
- Check mitigation effectiveness validated
- Validate mitigation actions documented
- Verify residual bias monitored
Control DATA-009: Fairness Validation
Control ID: DATA-009
Control Name: Fairness Metrics Validation
Control Type: Detective
Control Frequency: After mitigation, before deployment
Risk Level: High
Control Objective
Validate that bias mitigation achieves fairness objectives by calculating and verifying fairness metrics meet defined thresholds, ensuring AI systems treat all groups fairly across protected characteristics.
Control Requirements
CR-009.1: Fairness Metrics Calculation and Validation
Calculate fairness metrics and validate they meet defined thresholds.
Fairness Metrics:
| Metric | Definition | Target | Measurement |
|---|---|---|---|
| Demographic Parity | Equal positive prediction rate across groups | Ratio 0.8-1.2 | (Positive rate group A) / (Positive rate group B) |
| Equal Opportunity | Equal true positive rate across groups | Ratio 0.8-1.2 | (TPR group A) / (TPR group B) |
| Equalized Odds | Equal TPR and FPR across groups | Ratio 0.8-1.2 | (TPR/FPR group A) / (TPR/FPR group B) |
| Predictive Parity | Equal precision across groups | Ratio 0.8-1.2 | (Precision group A) / (Precision group B) |
| Calibration | Equal calibration across groups | Within 5% | Calibration difference between groups |
Mandatory Actions:
- Calculate fairness metrics for all protected characteristics
- Compare to fairness thresholds
- Assess trade-offs (accuracy vs. fairness)
- Document fairness validation
- Obtain approval for fairness-accuracy balance
- Block deployment if fairness thresholds not met
Evidence Required:
- Fairness Validation Report (RPT-AI-DATA-004)
- Fairness metrics dashboard
- Trade-off analysis
- Approval records
Audit Verification:
- Verify fairness metrics calculated for all protected characteristics
- Confirm metrics meet defined thresholds
- Check trade-off analysis documented
- Validate approval obtained
- Verify deployment blocked if thresholds not met
Control DATA-010: Data Lineage Documentation
Control ID: DATA-010
Control Name: Data Lineage Tracking and Documentation
Control Type: Detective
Control Frequency: Continuous updates
Risk Level: Medium
Control Objective
Document complete data lineage from source to AI model to enable traceability, support audits, facilitate troubleshooting, and demonstrate data governance compliance.
Control Requirements
CR-010.1: Complete Lineage Documentation
Document all lineage elements for every dataset used in AI systems.
Lineage Elements:
| Element | Description | Required Info |
|---|---|---|
| Data Source | Original data origin | Source system, provider, date, contact |
| Collection Method | How data was collected | Method, tools, responsible party, date |
| Transformations | All data processing steps | Transformation type, date, person, tools |
| Quality Checks | Quality assessments performed | Check type, results, date, person |
| Versions | Dataset versions | Version number, changes, date, changelog |
| Usage | Which AI systems use data | AI system ID, purpose, date, status |
Mandatory Actions:
- Document data lineage for all datasets
- Maintain lineage in data catalog
- Update lineage with each transformation
- Enable lineage traceability (source to model)
- Provide lineage reports on demand
- Maintain lineage for 10 years
Evidence Required:
- Data Lineage Documentation (DOC-AI-DATA-003)
- Data catalog with lineage
- Lineage diagrams
- Transformation logs
- Lineage reports
Audit Verification:
- Verify lineage documented for all datasets
- Confirm all lineage elements captured
- Check lineage updated with transformations
- Validate traceability enabled
- Verify 10-year retention
Control DATA-011: Data Provenance Verification
Control ID: DATA-011
Control Name: Data Provenance and Legal Compliance
Control Type: Preventive
Control Frequency: Per dataset acquisition
Risk Level: High
Control Objective
Establish and verify data provenance for all AI datasets to ensure legal compliance, protect intellectual property, and enable regulatory compliance, in compliance with GDPR and data protection requirements.
Control Requirements
CR-011.1: Comprehensive Provenance Verification
Verify and document all provenance requirements for each dataset.
Provenance Requirements:
| Requirement | Verification Method | Documentation Required |
|---|---|---|
| Source Verification | Confirm data source authenticity | Source verification records |
| Licensing | Document data usage rights | License agreements, terms |
| Consent | Verify consent for data use (if personal data) | Consent records, consent management logs |
| Legal Basis | Document legal basis for processing (GDPR) | Legal basis documentation |
| Third-Party Agreements | Maintain data sharing agreements | Data sharing agreements, contracts |
Mandatory Actions:
- Verify data source for all datasets
- Document licensing terms
- Obtain and document consent (where required)
- Establish legal basis for processing
- Maintain third-party agreements
- Conduct provenance audits annually
Evidence Required:
- Data Provenance Records (REC-AI-DATA-001)
- Licenses and agreements
- Consent records
- Legal basis documentation
- Third-party agreement records
- Annual provenance audit reports
Audit Verification:
- Verify provenance verified for all datasets
- Confirm licensing terms documented
- Check consent obtained where required
- Validate legal basis established
- Verify third-party agreements maintained
- Check annual provenance audits completed
Control DATA-012: Data Catalog Management
Control ID: DATA-012
Control Name: Data Catalog Maintenance and Discovery
Control Type: Detective
Control Frequency: Continuous
Risk Level: Medium
Control Objective
Maintain comprehensive data catalog for all AI datasets to enable data discovery, support data governance, facilitate compliance, and enable efficient data management.
Control Requirements
CR-012.1: Data Catalog Completeness and Maintenance
Maintain complete, current, and searchable data catalog for all AI datasets.
Catalog Contents:
| Field | Description | Mandatory |
|---|---|---|
| Dataset name and ID | Unique identifier and name | YES |
| Description and purpose | What data is and why it exists | YES |
| Data schema | Structure and format | YES |
| Data quality metrics | Current quality scores | YES |
| Lineage information | Complete lineage | YES |
| Provenance records | Source, licensing, consent | YES |
| Access controls | Who can access and how | YES |
| Usage tracking | Which AI systems use it | YES |
| Retention period | How long data is kept | YES |
| Related AI systems | Links to AI systems | YES |
Mandatory Actions:
- Create catalog entries for all datasets
- Update catalog continuously
- Enable search and discovery
- Provide metadata management
- Integrate with data governance tools
- Maintain catalog accuracy
Evidence Required:
- Data Catalog (CATALOG-AI-DATA-001)
- Catalog completeness metrics
- Usage reports
- Catalog update logs
Audit Verification:
- Verify catalog entries for all datasets
- Confirm all mandatory fields populated
- Check catalog updated continuously
- Validate search and discovery enabled
- Verify catalog accuracy
Control DATA-013: Privacy Impact Assessment
Control ID: DATA-013
Control Name: Data Protection Impact Assessment (DPIA)
Control Type: Preventive
Control Frequency: Per high-risk AI system, after substantial modifications
Risk Level: High
Control Objective
Conduct Data Protection Impact Assessment (DPIA) for high-risk AI systems processing personal data to identify and mitigate privacy risks, ensure GDPR compliance, and protect data subject rights.
Control Requirements
CR-013.1: DPIA Execution and Documentation
Conduct comprehensive DPIA per GDPR Article 35 for all high-risk AI systems processing personal data.
DPIA Triggers:
- Systematic and extensive profiling
- Large-scale processing of special category data
- Systematic monitoring of publicly accessible areas
- High-risk AI system processing personal data
DPIA Contents (per GDPR Article 35):
- Systematic description of processing operations
- Assessment of necessity and proportionality
- Assessment of risks to rights and freedoms
- Measures to address risks
- Safeguards, security measures, and mechanisms
Mandatory Actions:
- Determine if DPIA required
- Conduct DPIA per GDPR Article 35
- Identify privacy risks
- Implement privacy controls
- Obtain DPO approval
- Consult supervisory authority if high risk
- Review DPIA after substantial modifications
Evidence Required:
- Data Protection Impact Assessment (DPIA-AI-XXX)
- Privacy risk assessment
- DPO approval records
- Supervisory authority consultation records (if applicable)
- Privacy control implementation records
Audit Verification:
- Verify DPIA conducted for all high-risk AI processing personal data
- Confirm DPIA contents complete per GDPR Article 35
- Check DPO approval obtained
- Validate privacy controls implemented
- Verify supervisory authority consulted if high risk
Control DATA-014: Data Minimization
Control ID: DATA-014
Control Name: Data Minimization and Purpose Limitation
Control Type: Preventive
Control Frequency: Per AI system, annually
Risk Level: Medium
Control Objective
Collect and process only data necessary for AI system purpose to comply with GDPR Article 5(1)(c) data minimization principle and reduce privacy risks.
Control Requirements
CR-014.1: Data Minimization Assessment and Implementation
Assess and implement data minimization for all AI systems processing personal data.
Mandatory Actions:
- Define minimum data requirements
- Justify each data element
- Remove unnecessary data
- Implement data minimization in pipelines
- Review data scope regularly (annually)
- Document minimization decisions
Data Minimization Assessment:
| Data Element | Purpose | Necessity Justification | Can Be Removed? | Action |
|---|---|---|---|---|
| [Example] | [Purpose] | [Justification] | Yes/No | Remove/Keep |
Evidence Required:
- Data Minimization Assessment (ASSESS-AI-DATA-002)
- Justification for each data element
- Data reduction records
- Annual review records
Audit Verification:
- Verify data minimization assessed for all AI systems
- Confirm each data element justified
- Check unnecessary data removed
- Validate data minimization implemented in pipelines
- Verify annual review completed
Control DATA-015: Data Anonymization and Pseudonymization
Control ID: DATA-015
Control Name: Privacy Protection Techniques
Control Type: Preventive
Control Frequency: Before training, for personal data
Risk Level: High
Control Objective
Apply appropriate anonymization or pseudonymization techniques to protect privacy while enabling AI system development, balancing privacy protection with data utility.
Control Requirements
CR-015.1: Privacy Protection Technique Selection and Implementation
Select and implement appropriate privacy protection techniques based on use case and privacy requirements.
Techniques:
| Technique | Use Case | Reversibility | Privacy Level | Data Utility |
|---|---|---|---|---|
| Anonymization | Remove all identifiers permanently | Irreversible | Highest | Medium |
| Pseudonymization | Replace identifiers with pseudonyms | Reversible with key | High | High |
| Aggregation | Group data to prevent identification | Irreversible | High | Low-Medium |
| Masking | Hide sensitive data elements | Varies | Medium-High | High |
| Synthetic Data | Generate artificial data | N/A | Highest | Medium-High |
Mandatory Actions:
- Assess need for anonymization/pseudonymization
- Select appropriate technique
- Implement technique
- Validate effectiveness
- Document approach
- Assess re-identification risk
Re-identification Risk Assessment:
- Assess risk of re-identifying individuals
- Consider available auxiliary information
- Evaluate attack scenarios
- Document risk level
- Implement additional protections if needed
Evidence Required:
- Anonymization/Pseudonymization Plan (PLAN-AI-DATA-002)
- Implementation records
- Effectiveness validation
- Re-identification risk assessment
- Privacy protection verification
Audit Verification:
- Verify privacy protection technique selected
- Confirm technique implemented
- Check effectiveness validated
- Validate re-identification risk assessed
- Verify approach documented
SUPPORTING PROCEDURES
This standard is implemented through the following detailed procedures:
Procedure PROC-AI-DATA-001: Data Quality Assessment Procedure
Purpose: Define step-by-step process for assessing and ensuring data quality
Owner: Chief Data Officer
Implements: Controls DATA-001, DATA-002, DATA-003
Procedure Steps:
- Define data quality requirements - Control DATA-001
- Profile datasets
- Calculate quality metrics - Control DATA-002
- Compare to thresholds
- Document assessment results
- Remediate quality issues
- Obtain quality approval
- Set up continuous monitoring - Control DATA-003
Outputs:
- Data Quality Requirements Document
- Data Quality Assessment Report
- Quality approval records
- Monitoring configuration
Procedure PROC-AI-DATA-002: Bias Detection and Mitigation Procedure
Purpose: Define process for detecting and mitigating data bias
Owner: Chief Data Officer
Implements: Controls DATA-007, DATA-008, DATA-009
Procedure Steps:
- Conduct bias examination - Control DATA-007
- Document bias findings
- Select mitigation strategies - Control DATA-008
- Implement mitigation measures
- Validate mitigation effectiveness
- Calculate fairness metrics - Control DATA-009
- Verify fairness thresholds met
- Obtain fairness approval
Outputs:
- Bias Assessment Report
- Bias Mitigation Plan
- Fairness Validation Report
- Approval records
Procedure PROC-AI-DATA-003: Data Lineage Documentation Procedure
Purpose: Define process for documenting data lineage
Owner: Chief Data Officer
Implements: Controls DATA-010, DATA-011, DATA-012
Procedure Steps:
- Document data source - Control DATA-011
- Document collection method
- Track all transformations - Control DATA-010
- Document quality checks
- Maintain version history
- Update data catalog - Control DATA-012
- Enable lineage traceability
- Generate lineage reports
Outputs:
- Data Lineage Documentation
- Data Catalog entries
- Lineage reports
Procedure PROC-AI-DATA-004: Training Data Preparation Procedure
Purpose: Define process for preparing datasets for AI training
Owner: Chief Data Officer
Implements: Controls DATA-004, DATA-005, DATA-006, DATA-013, DATA-014, DATA-015
Procedure Steps:
- Assess data relevance - Control DATA-004
- Analyze representativeness - Control DATA-005
- Evaluate dataset appropriateness - Control DATA-006
- Conduct privacy impact assessment - Control DATA-013
- Implement data minimization - Control DATA-014
- Apply privacy protection techniques - Control DATA-015
- Obtain all required approvals
- Prepare dataset for training
Outputs:
- Data Relevance Assessment
- Representativeness Analysis
- Dataset Appropriateness Justification
- DPIA
- Privacy protection verification
- Training dataset ready
COMPLIANCE
5.1 Compliance Monitoring
Monitoring Approach: Continuous automated monitoring supplemented by monthly manual reviews and quarterly comprehensive audits.
Compliance Metrics:
| Metric | Target | Measurement Method | Frequency | Owner |
|---|---|---|---|---|
| Data Quality Requirements Coverage | 100% | % of AI systems with documented quality requirements | Monthly | Chief Data Officer |
| Data Quality Threshold Compliance | ≥95% | % of datasets meeting quality thresholds | Monthly | Chief Data Officer |
| Bias Assessment Coverage | 100% | % of high-risk AI with bias assessment | Quarterly | Chief Data Officer |
| Fairness Metric Compliance | 100% | % of AI systems meeting fairness thresholds | Quarterly | Chief Data Officer |
| Lineage Completeness | 100% | % of datasets with complete lineage | Monthly | Chief Data Officer |
| DPIA Completion | 100% | % of required DPIAs completed | Quarterly | Data Protection Officer |
| Data Minimization Compliance | 100% | % of AI systems with minimization assessment | Quarterly | Data Protection Officer |
| Privacy Protection Implementation | 100% | % of personal data with privacy protection | Monthly | Data Protection Officer |
Monitoring Tools:
- Data Quality Dashboard
- Bias Detection Dashboard
- Data Catalog
- Privacy Compliance Dashboard
- Monthly compliance reports
- Quarterly AI Governance Committee reviews
5.2 Internal Audit Requirements
Audit Frequency: Annually (minimum)
Audit Scope:
- Data quality requirements completeness
- Data quality assessment quality
- Bias assessment completeness and quality
- Fairness validation accuracy
- Data lineage completeness
- Data provenance verification
- Data catalog completeness
- DPIA completeness and quality
- Data minimization implementation
- Privacy protection effectiveness
- Controls effectiveness (DATA-001 through DATA-015)
Audit Activities:
- Review 100% of data quality requirements
- Sample 20% of datasets for detailed quality review
- Review 100% of bias assessments
- Test fairness metric calculations
- Verify lineage completeness
- Check provenance records
- Review data catalog completeness
- Test privacy protection effectiveness
- Interview key personnel
Audit Outputs:
- Annual Data Governance Audit Report
- Findings and recommendations
- Corrective action plans for deficiencies
5.3 External Audit / Regulatory Inspection
Preparation:
- Maintain audit-ready data governance documentation at all times
- Designate Chief Data Officer and Data Protection Officer as regulatory liaisons
- Prepare standard response procedures for authority requests
Provide to Auditors/Regulators:
- Data Quality Requirements Documents
- Data Quality Assessment Reports
- Bias Assessment Reports
- Fairness Validation Reports
- Data Lineage Documentation
- Data Provenance Records
- Data Catalog
- DPIAs
- Data minimization assessments
- Privacy protection documentation
- Data governance procedures
- Internal audit reports
- Evidence of controls execution
Authority Request Response:
- Acknowledge request within 1 business day
- Provide requested documentation within 5 business days
- Coordinate through Legal, Chief Data Officer, and Data Protection Officer
- Document all interactions with authorities
ROLES AND RESPONSIBILITIES
6.1 RACI Matrix
| Activity | Chief Data Officer | Data Scientist | Data Engineer | Data Protection Officer | AI System Owner | AI Risk Manager | Legal |
|---|---|---|---|---|---|---|---|
| Data Quality Requirements | R/A | C | C | I | A | I | I |
| Data Quality Assessment | R | R | C | I | A | I | I |
| Data Quality Monitoring | R | C | R | I | A | I | I |
| Data Relevance Assessment | R | R | I | I | A | C | I |
| Representativeness Analysis | R | R | I | I | A | R | I |
| Dataset Appropriateness | R | R | I | C | A | I | C |
| Bias Examination | R | R | I | I | A | R | I |
| Bias Mitigation | R | R | C | I | A | C | I |
| Fairness Validation | R | R | I | I | A | R/A | I |
| Data Lineage Documentation | R | C | R | I | A | I | I |
| Data Provenance | R | I | I | R | A | I | R |
| Data Catalog Management | R/A | C | C | I | I | I | I |
| DPIA | C | I | I | R/A | A | C | C |
| Data Minimization | R | C | C | R | A | I | C |
| Privacy Protection | R | C | C | R/A | A | I | C |
RACI Legend:
- R = Responsible (does the work)
- A = Accountable (ultimately answerable)
- C = Consulted (provides input)
- I = Informed (kept up-to-date)
6.2 Role Descriptions
Chief Data Officer
- Primary Responsibility: Oversees data governance framework, ensures compliance
- Key Activities:
- Establishes data governance framework
- Maintains data catalog
- Coordinates data quality management
- Oversees bias detection and mitigation
- Reports data governance metrics
- Required Competencies: Data governance expertise, EU AI Act Article 10, GDPR, data quality management
Data Scientist
- Primary Responsibility: Assesses data quality, bias, representativeness; prepares datasets
- Key Activities:
- Conducts data quality assessments
- Performs bias examinations
- Analyzes representativeness
- Implements bias mitigation
- Calculates fairness metrics
- Required Competencies: Data science, statistics, bias detection, fairness metrics
Data Engineer
- Primary Responsibility: Implements data pipelines; maintains lineage
- Key Activities:
- Implements data quality checks
- Maintains data lineage
- Updates data catalog
- Implements privacy protection techniques
- Required Competencies: Data engineering, ETL, data pipeline management
Data Protection Officer (DPO)
- Primary Responsibility: Ensures GDPR compliance; conducts DPIAs
- Key Activities:
- Conducts DPIAs
- Verifies data provenance
- Ensures data minimization
- Validates privacy protection
- Manages consent
- Required Competencies: GDPR expertise, privacy impact assessment, data protection
AI System Owner
- Primary Responsibility: Accountable for data governance of their AI system
- Key Activities:
- Defines data requirements
- Approves datasets
- Approves data quality requirements
- Participates in data assessments
- Required Competencies: AI system knowledge, data requirements definition
AI Risk Manager
- Primary Responsibility: Assesses bias risks; validates fairness
- Key Activities:
- Assesses bias risks
- Validates fairness metrics
- Reviews representativeness
- Integrates with risk management
- Required Competencies: Risk management, bias assessment, fairness validation
Legal
- Primary Responsibility: Advises on legal compliance; reviews data agreements
- Key Activities:
- Reviews data licensing
- Verifies legal basis
- Reviews data sharing agreements
- Advises on GDPR compliance
- Required Competencies: GDPR legal expertise, data licensing, contract law
EXCEPTIONS
7.1 Exception Philosophy
Data governance is a critical regulatory compliance activity for high-risk AI systems. Exceptions are granted restrictively and only where compensating controls adequately mitigate risks.
7.2 Allowed Exceptions
The following exceptions may be granted with proper justification and approval:
| Exception Type | Justification Required | Maximum Duration | Approval Authority | Compensating Controls |
|---|---|---|---|---|
| Simplified Quality Assessment (Minimal-Risk AI) | AI system clearly minimal-risk; no significant quality concerns | Permanent | Chief Data Officer | Document simplified rationale; Annual re-confirmation |
| Extended Lineage Documentation Timeline | Technical constraints prevent immediate lineage documentation | 30 days | Chief Data Officer | Interim documentation; Accelerated implementation plan |
| Deferred Bias Assessment (Non-Personal Data) | Dataset contains no personal data; no protected characteristics | Until personal data added | Chief Data Officer + AI Risk Manager | Document rationale; Re-assess if personal data added |
7.3 Prohibited Exceptions
The following exceptions cannot be granted under any circumstances:
❌ Skipping data quality assessment for high-risk AI - Mandatory per Article 10(2), no exceptions
❌ Skipping bias assessment for high-risk AI - Mandatory per Article 10(2)(f) and (g), no exceptions
❌ Using datasets below quality thresholds - Creates compliance gaps, performance risk
❌ Skipping DPIA for high-risk AI processing personal data - Mandatory per GDPR Article 35
❌ Operating without data lineage - Required for traceability and compliance
7.4 Exception Request Process
Step 1: Submit Exception Request
- Complete Exception Request Form (FORM-AI-EXCEPTION-001)
- Include business justification
- Propose compensating controls
- Specify duration requested
- Attach risk assessment
Step 2: Risk Assessment
- Chief Data Officer assesses risk of granting exception
- Evaluates adequacy of compensating controls
- Documents residual risk
Step 3: Approval
- Route to appropriate approval authority based on exception type
- Chief Data Officer approval: Minor exceptions
- Chief Data Officer + AI Risk Manager: Significant exceptions
- AI Governance Committee: Critical exceptions
Step 4: Documentation and Monitoring
- Document exception in Exception Register
- Assign exception owner
- Set review date
- Monitor compensating controls
- Report exceptions quarterly to AI Governance Committee
Step 5: Exception Review and Closure
- Review exception at specified review date
- Assess if exception still needed
- Close exception when normal data governance completed
- Document lessons learned
ENFORCEMENT
8.1 Non-Compliance Consequences
| Violation | Severity | Consequence | Remediation Required |
|---|---|---|---|
| High-risk AI without data quality assessment | Critical | Immediate suspension until assessment completed | Complete assessment within 10 business days; Root cause analysis |
| High-risk AI without bias assessment | Critical | Immediate suspension until assessment completed | Complete bias assessment within 10 business days |
| Using datasets below quality thresholds | High | Halt training until quality issues resolved | Remediate quality issues; Re-assess; Re-approve |
| Missing DPIA for high-risk AI | Critical | Immediate suspension; GDPR violation | Complete DPIA within 15 business days; Implement privacy controls |
| Missing data lineage | Medium | Written warning; Escalation to management | Complete lineage documentation within 10 business days |
| Fairness thresholds not met | High | Block deployment until fairness achieved | Implement additional mitigation; Re-validate fairness |
| Missing data minimization assessment | Medium | Written warning | Complete assessment within 10 business days |
8.2 Escalation Procedures
Level 1: Chief Data Officer
- Minor procedural violations
- Documentation deficiencies
- Timeline delays < 5 days
- Action: Written warning, corrective action required
Level 2: Chief Data Officer + AI Risk Manager
- Repeated violations
- Missing bias assessments
- Quality threshold breaches
- Action: Formal review, corrective action plan, management notification
Level 3: AI Governance Committee
- High-risk AI without data quality/bias assessment
- Missing DPIA
- GDPR violations
- Action: Immediate AI system suspension, investigation, disciplinary action
Level 4: Executive Management + Legal + DPO
- Potential regulatory enforcement action
- Significant GDPR violations
- Reputational risk
- Action: Executive crisis management, legal strategy, regulatory engagement
8.3 Immediate Escalation Triggers
Escalate immediately to AI Governance Committee + Legal + DPO if:
- ⚠️ High-risk AI system operating without data quality/bias assessment
- ⚠️ GDPR violation identified (missing DPIA, unauthorized processing)
- ⚠️ Data breach or privacy incident
- ⚠️ Regulatory inquiry or inspection related to data governance
- ⚠️ Significant fairness violations identified in production
8.4 Disciplinary Actions
Individuals responsible for data governance violations may be subject to:
- Verbal or written warning
- Mandatory retraining
- Performance improvement plan
- Reassignment of responsibilities
- Suspension (with pay during investigation)
- Termination (for egregious violations, e.g., knowingly using biased data)
Factors Considered:
- Intent (knowing violation vs. honest mistake)
- Severity of violation
- Impact (actual or potential)
- Cooperation with remediation
- Prior violation history
KEY PERFORMANCE INDICATORS (KPIs)
9.1 Data Governance KPIs
| KPI ID | KPI Name | Definition | Target | Measurement Method | Frequency | Owner | Reporting To |
|---|---|---|---|---|---|---|---|
| KPI-DATA-001 | Data Quality Requirements Coverage | % of AI systems with documented quality requirements | 100% | (# AI systems with requirements / # total AI systems) × 100 | Monthly | Chief Data Officer | AI Governance Committee |
| KPI-DATA-002 | Data Quality Threshold Compliance | % of datasets meeting quality thresholds | ≥95% | (# datasets meeting thresholds / # total datasets) × 100 | Monthly | Chief Data Officer | Management |
| KPI-DATA-003 | Bias Assessment Coverage | % of high-risk AI with bias assessment | 100% | (# high-risk AI with bias assessment / # high-risk AI) × 100 | Quarterly | Chief Data Officer | AI Governance Committee |
| KPI-DATA-004 | Fairness Metric Compliance | % of AI systems meeting fairness thresholds | 100% | (# AI systems meeting thresholds / # total AI systems) × 100 | Quarterly | Chief Data Officer | AI Governance Committee |
| KPI-DATA-005 | Lineage Completeness | % of datasets with complete lineage | 100% | (# datasets with complete lineage / # total datasets) × 100 | Monthly | Chief Data Officer | Management |
| KPI-DATA-006 | DPIA Completion | % of required DPIAs completed | 100% | (# DPIAs completed / # required DPIAs) × 100 | Quarterly | Data Protection Officer | AI Governance Committee |
| KPI-DATA-007 | Data Minimization Compliance | % of AI systems with minimization assessment | 100% | (# AI systems with assessment / # total AI systems) × 100 | Quarterly | Data Protection Officer | AI Governance Committee |
| KPI-DATA-008 | Privacy Protection Coverage | % of personal data with privacy protection | 100% | (# datasets with protection / # datasets with personal data) × 100 | Monthly | Data Protection Officer | Management |
| KPI-DATA-009 | Data Catalog Completeness | % of datasets in catalog with all mandatory fields | 100% | (# complete catalog entries / # total datasets) × 100 | Monthly | Chief Data Officer | Management |
| KPI-DATA-010 | Quality Issue Resolution Time | Average days to resolve quality issues | < 5 days | Σ (resolution date - issue date) / # issues | Monthly | Chief Data Officer | Management |
9.2 KPI Dashboards and Reporting
Real-Time Dashboard (Chief Data Officer access)
- Current data quality scores
- Bias assessment status
- Fairness metrics
- Lineage completeness
- DPIA status
- Privacy protection coverage
Monthly Management Report
- KPI-DATA-001, 002, 005, 008, 009, 010
- Trend analysis (vs. previous month)
- Issues and risks
- Planned actions
Quarterly AI Governance Committee Report
- All KPIs
- Bias assessment completion status
- Fairness compliance status
- DPIA completion status
- Internal audit findings (if conducted)
- Exception register review
Annual Executive Report
- Full-year KPI performance
- Data governance maturity assessment
- Strategic recommendations
- Regulatory outlook
9.3 KPI Thresholds and Alerts
| KPI | Green (Good) | Yellow (Warning) | Red (Critical) | Alert Action |
|---|---|---|---|---|
| Data Quality Requirements Coverage | 100% | 95-99% | < 95% | Red: Immediate escalation to AI Governance Committee Chair |
| Data Quality Threshold Compliance | ≥95% | 90-94% | < 90% | Red: Escalate to AI Governance Committee |
| Bias Assessment Coverage | 100% | 90-99% | < 90% | Red: Halt high-risk AI deployments until assessed |
| Fairness Metric Compliance | 100% | 90-99% | < 90% | Red: Block deployments until fairness achieved |
| DPIA Completion | 100% | 90-99% | < 90% | Red: Immediate escalation to DPO + Legal |
TRAINING REQUIREMENTS
10.1 Training Program Overview
All personnel involved in AI data governance must complete role-specific training to ensure competency in data quality management, bias detection, GDPR compliance, and data governance procedures.
10.2 Role-Based Training Requirements
| Role | Training Course | Duration | Content | Frequency | Assessment Required |
|---|---|---|---|---|---|
| Chief Data Officer | Data Governance Expert Training | 20 hours | EU AI Act Article 10; Data quality management; Bias detection; GDPR; Data lineage; Data catalog management | Initial + annually | Yes - Written exam (≥90%) + Practical data governance exercise |
| Data Scientists | Data Quality and Bias Assessment | 16 hours | Data quality dimensions; Bias types; Fairness metrics; Bias mitigation; Representativeness analysis | Initial + annually | Yes - Practical bias assessment exercise |
| Data Engineers | Data Lineage and Pipeline Management | 12 hours | Data lineage documentation; Data catalog; Privacy protection techniques; Pipeline implementation | Initial + annually | Yes - Practical lineage documentation exercise |
| Data Protection Officer | GDPR and Privacy for AI Data | 16 hours | GDPR Article 35 (DPIA); Data minimization; Privacy protection techniques; Consent management; Legal basis | Initial + annually | Yes - Written exam (≥90%) |
| AI System Owners | Data Governance Overview | 6 hours | Data quality requirements; Bias awareness; Data relevance; Responsibilities | At onboarding + annually | Yes - Knowledge check (≥80%) |
| All AI Development Staff | Data Quality Awareness | 2 hours | Data quality basics; Bias awareness; Data minimization | At onboarding + annually | Yes - Knowledge check (≥80%) |
10.3 Training Content by Topic
Data Quality Management
- Data quality dimensions (accuracy, completeness, consistency, etc.)
- Quality requirements definition
- Quality assessment methods
- Quality monitoring
- Quality remediation
Bias Detection and Mitigation
- Bias types (historical, representation, measurement, etc.)
- Protected characteristics
- Bias detection methods
- Bias mitigation strategies
- Fairness metrics
Data Relevance and Representativeness
- Relevance criteria
- Representativeness metrics
- Protected characteristics
- Underrepresentation identification
- Dataset appropriateness evaluation
Data Lineage and Provenance
- Lineage documentation
- Provenance verification
- Data catalog management
- Traceability
GDPR and Privacy
- DPIA requirements
- Data minimization
- Privacy protection techniques
- Consent management
- Legal basis
10.4 Training Delivery Methods
Initial Training:
- Instructor-led classroom or virtual training
- Includes interactive exercises and case studies
- Hands-on practice with data governance tools
- Group discussions of complex scenarios
Annual Refresher:
- E-learning modules for core content review
- Live update sessions for regulatory changes
- Case study reviews of recent data governance activities
- Knowledge assessment
On-the-Job Training:
- Mentoring for new data governance staff
- Job shadowing during data assessments
- Supervised data governance activities for first 5 AI systems
Just-in-Time Training:
- Quick reference guides and job aids
- Video tutorials on specific topics
- Help desk support from experienced data governance staff
10.5 Training Effectiveness Measurement
Assessment Methods:
- Written exams for knowledge retention
- Practical exercises for skill application
- On-the-job observations for competency validation
- Feedback surveys for training quality
Competency Validation:
- Chief Data Officers: Must correctly assess 5 sample datasets with 100% accuracy before independent assessment
- Data Scientists: Must demonstrate understanding of bias detection and fairness metrics
- All staff: Must pass knowledge assessments with minimum required scores
Training Metrics:
| Metric | Target | Frequency |
|---|---|---|
| Training completion rate | 100% | Quarterly |
| Assessment pass rate (first attempt) | ≥ 90% | Per training |
| Training effectiveness score (survey) | ≥ 4.0/5.0 | Per training |
| Time to competency (Chief Data Officers) | < 45 days | Per person |
10.6 Training Records
Records Maintained:
- Training attendance records
- Assessment scores
- Competency validations
- Refresher training completion
- Individual training transcripts
Retention: 10 years (to align with EU AI Act documentation retention)
Access: HR, Chief Data Officer, Internal Audit, Competent Authorities (upon request)
DEFINITIONS
| Term | Definition | Source |
|---|---|---|
| Training Data | Data used to train AI model to learn patterns | This Standard |
| Validation Data | Data used to tune model hyperparameters during training | This Standard |
| Testing Data | Data used to evaluate final model performance | This Standard |
| Data Quality | Degree to which data meets requirements for intended use | ISO/IEC 5259 series |
| Representativeness | Extent to which data reflects real-world population/scenarios | EU AI Act Article 10(3) |
| Bias | Systematic error or unfairness in data or model outputs | EU AI Act Article 10(2)(f) and (g) |
| Data Lineage | Complete history of data from origin to current state | This Standard |
| Special Category Data | Personal data revealing racial/ethnic origin, political opinions, religious beliefs, health, sex life, etc. | GDPR Article 9 |
| Data Provenance | Information about the origin, ownership, and history of data | This Standard |
| Data Minimization | Principle of collecting and processing only data necessary for purpose | GDPR Article 5(1)(c) |
| Anonymization | Process of removing all identifiers from data permanently | GDPR |
| Pseudonymization | Process of replacing identifiers with pseudonyms (reversible with key) | GDPR Article 4(5) |
| Fairness Metric | Quantitative measure of fairness across protected characteristics | This Standard |
| Demographic Parity | Equal positive prediction rate across groups | This Standard |
| Equal Opportunity | Equal true positive rate across groups | This Standard |
| DPIA | Data Protection Impact Assessment per GDPR Article 35 | GDPR Article 35 |
| Data Catalog | Centralized repository of metadata about all datasets | This Standard |
LINK WITH AI ACT AND ISO42001
12.1 EU AI Act Regulatory Mapping
This standard implements the following EU AI Act requirements:
| EU AI Act Provision | Article | Requirement Summary | Implemented By (Controls) |
|---|---|---|---|
| Data and Data Governance | Article 10 | Requirements for training, validation, testing data | All controls (DATA-001 through DATA-015) |
| Training, Validation, Testing Data Quality | Article 10(2) | Quality requirements for datasets | DATA-001, DATA-002, DATA-003 |
| Data Relevance | Article 10(3) | Datasets relevant and representative | DATA-004, DATA-005, DATA-006 |
| Data Examination for Bias | Article 10(2)(f) and (g) | Examine datasets for bias and apply measures to detect, prevent and mitigate biases | DATA-007, DATA-008, DATA-009 |
| Contextual Relevance | Article 10(4) | Datasets must account for characteristics particular to the specific geographical, contextual, behavioural or functional setting | DATA-004 |
| Special Category Data Processing | Article 10(5) | Requirements for processing special category data | DATA-013, DATA-014, DATA-015 |
12.2 GDPR Alignment
This standard aligns with GDPR requirements:
| GDPR Provision | Requirement | Implementation in This Standard |
|---|---|---|
| Article 5(1)(c): Data Minimization | Collect only necessary data | DATA-014 |
| Article 9: Special Category Data | Requirements for processing special category data | DATA-013, DATA-015 |
| Article 25: Data Protection by Design | Privacy protection techniques | DATA-015 |
| Article 35: DPIA | Data Protection Impact Assessment | DATA-013 |
| Article 30: Records of Processing | Documentation of data processing | DATA-010, DATA-012 |
12.3 ISO/IEC 42001:2023 Alignment
This standard aligns with ISO/IEC 42001:2023 as follows:
| ISO 42001 Clause | Requirement | Implementation in This Standard |
|---|---|---|
| Clause 6.1.2: AI system impact assessment | Assess data-related impacts | DATA-004, DATA-005, DATA-007 |
| Clause 7.5: Documented information | Maintain data documentation | DATA-010, DATA-012 |
| Clause 8.2: AI system risk assessment | Data-related risk assessment | DATA-007, DATA-008, DATA-009 |
| Clause 9.1: Monitoring and measurement | Monitor data quality | DATA-003 |
12.4 ISO/IEC 5259 Series Alignment
This standard aligns with ISO/IEC 5259 series (Data Quality for AI) as follows:
| ISO 5259 Standard | Requirement | Implementation in This Standard |
|---|---|---|
| ISO/IEC 5259-1: Overview | Data quality framework | DATA-001, DATA-002 |
| ISO/IEC 5259-2: Data quality measures | Quality metrics and measurement | DATA-002, DATA-003 |
| ISO/IEC 5259-3: Data quality management | Quality management process | DATA-001, DATA-002, DATA-003 |
12.5 Relationship to Other Standards
This data governance standard integrates with other AI Act standards:
| Related Standard | Integration Point | Rationale |
|---|---|---|
| STD-AI-002: Risk Management | Bias risk assessment (RM-006) uses data governance outputs | Data bias analysis feeds into risk assessment |
| STD-AI-004: Technical Documentation | Data requirements documented in technical documentation | Data governance outputs feed into Annex IV documentation |
| STD-AI-006: Transparency | Data limitations communicated to users | Data quality and bias limitations in transparency notices |
12.6 References and Related Documents
EU AI Act (Regulation (EU) 2024/1689):
- Article 10: Data and Data Governance
- Article 10(2): Training, Validation, Testing Data
- Article 10(3): Data Relevance
- Article 10(2)(f) and (g): Data Examination for Bias
- Article 10(4): Contextual Relevance of Datasets
- Article 10(5): Exceptional Processing of Special Category Data for Bias Correction
GDPR (Regulation (EU) 2016/679):
- Article 5(1)(c): Data Minimization
- Article 9: Special Category Data
- Article 25: Data Protection by Design
- Article 35: Data Protection Impact Assessment
ISO/IEC Standards:
- ISO/IEC 42001:2023: Information technology — Artificial intelligence — Management system
- ISO/IEC 5259 series: Information technology — Data quality for AI and machine learning
- ISO/IEC 23894:2023: Information technology — Artificial intelligence — Guidance on risk management
Internal Documents:
- POL-AI-001: Artificial Intelligence Policy (parent policy)
- STD-AI-002: AI Risk Management Standard
- PROC-AI-DATA-001 through PROC-AI-DATA-004: Data governance procedures
APPROVAL AND AUTHORIZATION
| Role | Name | Title | Signature | Date |
|---|---|---|---|---|
| Prepared By | Emily White | Chief Data Officer | _________________ | ________ |
| Reviewed By | Michael Brown | Chief Legal Officer | _________________ | ________ |
| Reviewed By | Lisa Anderson | Data Protection Officer | _________________ | ________ |
| Reviewed By | Jane Doe | Chief Strategy & Risk Officer | _________________ | ________ |
| Approved By | Jane Doe | AI Governance Committee Chair | _________________ | ________ |
Effective Date: 2025-08-01
Next Review Date: 2026-08-01
Review Frequency: Annually or upon regulatory change
END OF STANDARD STD-AI-003
This standard is a living document. Feedback and improvement suggestions should be directed to the Chief Data Officer.
Standard ID
STD-AI-003
Version
1.0
Status
draftOwner
Chief Data Officer
Effective Date
2025-08-01
Applicability
All AI systems, mandatory for high-risk