HIP-230 Draft Meta
AI Transparency & Explainability Framework for ensuring transparency and explainability of Hanzo AI systems.
ai-ethics transparency explainability governance
Requires: HIP-200, HIP-201
HIP-230: AI Transparency & Explainability
Abstract
This HIP establishes the transparency and explainability framework for Hanzo AI systems. It defines requirements for communicating AI capabilities and limitations, providing explanations for AI outputs, and ensuring stakeholders can understand and audit AI behavior.
Transparency Framework
Transparency Principles
Clarity : Information is understandable to intended audience
Accessibility : Information is easy to find and access
Completeness : Material information is disclosed
Accuracy : Information is correct and up-to-date
Proportionality : Level of disclosure matches risk level
Transparency Levels
Level Audience Depth Examples Public General users High-level Product pages, blog posts User Active users Functional In-product disclosures Developer API users Technical API documentation Auditor Reviewers Detailed Model cards, audit reports Regulator Authorities Comprehensive Regulatory filings
AI Disclosure Requirements
System-Level Disclosure
AI Presence Disclosure
Requirement : Users must know when they're interacting with AI.
Context Disclosure Method Chat interface Clear "AI" label Voice interface Audio disclosure Generated content Watermark/label Automated decisions Explicit notice
Capability Disclosure
Requirement : Communicate what the AI can and cannot do.
Element Disclosure Intended use Primary use cases Limitations Known weaknesses Not suitable for Inappropriate uses Accuracy expectations Performance levels
Model-Level Disclosure
Model Card Requirements
Every model must have a model card containing:
Section Required Contents Model details Name, version, type, architecture Intended use Primary uses, users, out-of-scope uses Training data Data sources, composition, limitations Performance Benchmark results, evaluation methodology Limitations Known limitations, failure modes Ethical considerations Bias, risks, mitigations
Training Data Disclosure
Element Disclosure Level Data sources Named sources where possible Data composition Categories, proportions Data processing Filtering, cleaning methods Data limitations Known gaps, biases
Output-Level Disclosure
Generated Content Labeling
Content Type Labeling Requirement Text AI-generated indicator Images Watermark + metadata Audio Audio watermark + metadata Video Visual indicator + metadata
Confidence Disclosure
When appropriate, indicate confidence:
Uncertainty indicators for factual claims
Confidence scores for classifications
Probability ranges for predictions
Explainability Framework
Explainability Levels
Level Description Audience Functional What the AI does End users Behavioral Why the AI responded this way Users, developers Technical How the AI works internally Experts, auditors
Explanation Types
Contrastive Explanations
"Why X instead of Y?"
Use Case Approach Classification Why this class, not another Generation Why this response, not alternative Recommendation Why this item, not others
Counterfactual Explanations
"What would change the outcome?"
Use Case Approach Decisions What input changes would change result Refusals What would make request acceptable
Feature Attribution
"What influenced this output?"
Use Case Approach Text Highlight influential words/phrases Images Show attention/saliency maps Structured Show feature importance
Explanation Implementation
User-Facing Explanations
Context Explanation Type Content refusal Reason for refusal Uncertain response Confidence indication Sourced claims Citation/reference Recommendations Relevance factors
Developer-Facing Explanations
API Feature Purpose Logprobs Token probability information Confidence scores Output certainty Reasoning traces Chain-of-thought (where applicable)
Audit-Facing Explanations
Artifact Contents Training logs Training process documentation Evaluation results Detailed benchmark performance Decision logs Sample decision explanations Attention analysis Model attention patterns
Documentation Standards
Public Documentation
Document Contents Update Frequency Product page Capabilities, use cases As features change Help center How to use, limitations Continuous Blog/announcements Major updates, changes As needed Research papers Technical details On publication
Technical Documentation
Document Contents Audience API documentation Endpoints, parameters, examples Developers Model card Model details, performance, limitations All System card System-level information Auditors Safety documentation Safety measures, testing Regulators
Internal Documentation
Document Contents Access Training documentation Data, process, decisions Internal + audit Risk assessments Identified risks, mitigations Internal Incident reports Safety incidents, responses Internal + regulators
Audit & Verification
Internal Audit
Audit Type Frequency Scope Documentation review Monthly Accuracy, completeness Disclosure compliance Quarterly All disclosure requirements Explanation quality Quarterly User understanding
External Audit
Audit Type Frequency Auditor Model audit Annual Third-party ML experts Documentation audit Annual Compliance experts User understanding study Biennial Research partners
Verification Methods
Method Purpose User surveys Verify understanding of disclosures A/B testing Test explanation effectiveness Expert review Technical accuracy verification Red team Attempt to find undisclosed capabilities
Special Contexts
High-Stakes Decisions
When AI influences significant decisions:
Requirement Implementation Explicit AI role Clear statement of AI's role Human oversight Indication of human review Appeal process How to contest decisions Detailed explanation Factors that influenced outcome
Synthetic Content
For AI-generated content:
Requirement Implementation Labeling Clear AI-generated indicator Watermarking Technical watermark in media Provenance Metadata about creation Detection tools Tools to verify AI origin
Research & Development
During model development:
Requirement Implementation Internal documentation Track decisions, data, methods Reproducibility Enable result reproduction Version control Track model versions Change documentation Document capability changes
Governance
Transparency Oversight
Role Responsibility Transparency Lead Day-to-day compliance Communications Team Public-facing content Legal Team Regulatory compliance ESG Committee Policy oversight
Review Process
Activity Frequency Participants Disclosure review Monthly Transparency Lead Documentation audit Quarterly Cross-functional Policy review Annual ESG Committee
Escalation
Issue Escalation Path Disclosure gap Transparency Lead → Product Lead Misleading content Communications → Legal → ESG Committee Regulatory concern Legal → Board
Related HIPs
HIP-200 : Responsible AI Principles
HIP-201 : Model Risk Management
HIP-210 : Safety Evaluation Framework
HIP-220 : Bias Detection & Mitigation
HIP-240 : AI Incident Response
HIP-250 : Sustainability Standards Alignment
Changelog
Version Date Changes 1.0 2025-12-17 Initial draft
Copyright
Copyright and related rights waived via CC0 .