HIP-230 · Draft · Meta

AI Transparency & Explainability

Framework for ensuring transparency and explainability of Hanzo AI systems.

Hanzo AI Team (@hanzoai)
Created: 2025-12-17
Tags: ai-ethics, transparency, explainability, governance
Requires: HIP-200, HIP-201

HIP-230: AI Transparency & Explainability

Abstract

This HIP establishes the transparency and explainability framework for Hanzo AI systems. It defines requirements for communicating AI capabilities and limitations, providing explanations for AI outputs, and ensuring stakeholders can understand and audit AI behavior.

Transparency Framework

Transparency Principles

  1. Clarity: Information is understandable to the intended audience
  2. Accessibility: Information is easy to find and access
  3. Completeness: All material information is disclosed
  4. Accuracy: Information is correct and up to date
  5. Proportionality: The level of disclosure matches the level of risk

Transparency Levels

| Level | Audience | Depth | Examples |
|---|---|---|---|
| Public | General users | High-level | Product pages, blog posts |
| User | Active users | Functional | In-product disclosures |
| Developer | API users | Technical | API documentation |
| Auditor | Reviewers | Detailed | Model cards, audit reports |
| Regulator | Authorities | Comprehensive | Regulatory filings |

AI Disclosure Requirements

System-Level Disclosure

AI Presence Disclosure

Requirement: Users must know when they're interacting with AI.

| Context | Disclosure Method |
|---|---|
| Chat interface | Clear "AI" label |
| Voice interface | Audio disclosure |
| Generated content | Watermark/label |
| Automated decisions | Explicit notice |

Capability Disclosure

Requirement: Communicate what the AI can and cannot do.

| Element | Disclosure |
|---|---|
| Intended use | Primary use cases |
| Limitations | Known weaknesses |
| Not suitable for | Inappropriate uses |
| Accuracy expectations | Performance levels |

Model-Level Disclosure

Model Card Requirements

Every model must have a model card containing:

| Section | Required Contents |
|---|---|
| Model details | Name, version, type, architecture |
| Intended use | Primary uses, users, out-of-scope uses |
| Training data | Data sources, composition, limitations |
| Performance | Benchmark results, evaluation methodology |
| Limitations | Known limitations, failure modes |
| Ethical considerations | Bias, risks, mitigations |
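
The required sections above lend themselves to an automated completeness check. The following is a minimal sketch, assuming a model card is stored as a plain dictionary keyed by section name. The key names and the `missing_sections` helper are hypothetical illustrations and are not defined by this HIP.

```python
# Hypothetical section keys mirroring the model card table in this HIP.
REQUIRED_SECTIONS = (
    "model_details",
    "intended_use",
    "training_data",
    "performance",
    "limitations",
    "ethical_considerations",
)


def missing_sections(card: dict) -> list[str]:
    """Return the required sections that are absent or empty in the card."""
    return [name for name in REQUIRED_SECTIONS if not card.get(name)]


# Illustrative draft card with placeholder values only.
draft_card = {
    "model_details": {"name": "example-model", "version": "0.1"},
    "intended_use": "Illustrative placeholder text",
    "training_data": "",  # present but empty, so it is reported as missing
}

print(missing_sections(draft_card))
```

A check like this could run in a release pipeline so that a model without a complete card is flagged before publication.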

Training Data Disclosure

| Element | Disclosure Level |
|---|---|
| Data sources | Named sources where possible |
| Data composition | Categories, proportions |
| Data processing | Filtering, cleaning methods |
| Data limitations | Known gaps, biases |

Output-Level Disclosure

Generated Content Labeling

| Content Type | Labeling Requirement |
|---|---|
| Text | AI-generated indicator |
| Images | Watermark + metadata |
| Audio | Audio watermark + metadata |
| Video | Visual indicator + metadata |
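
One way to apply the labeling table is to attach a small metadata record to each generated artifact. The sketch below assumes a JSON-style record; the field names, the `LABELING` mapping keys, and the `build_label` helper are hypothetical, and the HIP does not prescribe a metadata schema or a watermarking technique.

```python
import json
from datetime import datetime, timezone

# Labeling requirements per content type, transcribed from the table above.
LABELING = {
    "text": ["ai_generated_indicator"],
    "image": ["watermark", "metadata"],
    "audio": ["audio_watermark", "metadata"],
    "video": ["visual_indicator", "metadata"],
}


def build_label(content_type: str, model_name: str) -> dict:
    """Build a hypothetical disclosure record for one generated artifact."""
    if content_type not in LABELING:
        raise ValueError(f"unknown content type: {content_type}")
    return {
        "ai_generated": True,
        "content_type": content_type,
        "required_labels": list(LABELING[content_type]),
        "model": model_name,
        "created_at": datetime.now(timezone.utc).isoformat(),
    }


print(json.dumps(build_label("image", "example-model"), indent=2))
```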

Confidence Disclosure

When appropriate, indicate confidence:

  • Uncertainty indicators for factual claims
  • Confidence scores for classifications
  • Probability ranges for predictions
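
As an illustration of an uncertainty indicator, a numeric confidence score can be mapped to a short verbal label shown next to the output. This is a sketch only: the thresholds (0.9 and 0.6) and the label wording are assumptions chosen for the example, not values set by this HIP.

```python
def uncertainty_label(confidence: float) -> str:
    """Map a confidence score in [0, 1] to a user-facing label.

    The cut-offs below are illustrative assumptions, not HIP requirements.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence >= 0.9:
        return "high confidence"
    if confidence >= 0.6:
        return "moderate confidence"
    return "low confidence"


for score in (0.95, 0.7, 0.2):
    print(score, "->", uncertainty_label(score))
```

In practice the thresholds would need to be calibrated against measured accuracy so that the label reflects how often the system is actually right.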

Explainability Framework

Explainability Levels

| Level | Description | Audience |
|---|---|---|
| Functional | What the AI does | End users |
| Behavioral | Why the AI responded this way | Users, developers |
| Technical | How the AI works internally | Experts, auditors |

Explanation Types

Contrastive Explanations

"Why X instead of Y?"

| Use Case | Approach |
|---|---|
| Classification | Why this class, not another |
| Generation | Why this response, not alternative |
| Recommendation | Why this item, not others |
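
For the classification case, a contrastive explanation can be sketched with a toy linear classifier: the gap between two class scores decomposes into per-feature terms, which shows what favored class X over class Y. The weights, feature names, and the `contrastive_contributions` helper below are invented for illustration and do not describe any Hanzo model.

```python
def contrastive_contributions(weights, features, chosen, alternative):
    """Per-feature contribution to score(chosen) - score(alternative).

    Assumes a linear model where each class score is the sum of
    weight * feature value. Results are sorted by magnitude, largest first.
    """
    diffs = {
        name: (weights[chosen][name] - weights[alternative][name]) * value
        for name, value in features.items()
    }
    return sorted(diffs.items(), key=lambda item: abs(item[1]), reverse=True)


# Toy weights and input, chosen only to make the arithmetic easy to follow.
weights = {
    "spam": {"links": 2.0, "known_sender": -3.0},
    "ham": {"links": -1.0, "known_sender": 2.0},
}
message = {"links": 3.0, "known_sender": 1.0}

for name, contribution in contrastive_contributions(weights, message, "spam", "ham"):
    print(f"{name}: {contribution:+.1f}")
```

Positive values pushed the result toward the chosen class and negative values toward the alternative; the contributions sum to the difference between the two class scores.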

Counterfactual Explanations

"What would change the outcome?"

| Use Case | Approach |
|---|---|
| Decisions | What input changes would change result |
| Refusals | What would make request acceptable |
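
A counterfactual for a decision can be found by searching for the smallest input change that flips the outcome. The sketch below uses a toy threshold rule and a single-feature search; the `decide` rule, the feature names, and the step sizes are hypothetical and stand in for whatever decision system is being explained.

```python
def decide(applicant: dict) -> bool:
    """Toy decision rule (an assumption for this example): approve when
    income minus debt reaches at least 50."""
    return applicant["income"] - applicant["debt"] >= 50


def counterfactual(applicant, feature, step, decide_fn=decide, max_steps=100):
    """Return the nearest value of `feature`, moving in multiples of `step`,
    that flips the decision, or None if no flip occurs within max_steps."""
    original = decide_fn(applicant)
    for k in range(1, max_steps + 1):
        candidate = dict(applicant)
        candidate[feature] = applicant[feature] + k * step
        if decide_fn(candidate) != original:
            return candidate[feature]
    return None


applicant = {"income": 60, "debt": 30}  # currently not approved (60 - 30 < 50)
print("income needed:", counterfactual(applicant, "income", step=5))
print("debt needed:", counterfactual(applicant, "debt", step=-5))
```

The result can be phrased for the user as "the outcome would change if income were 80" or "if debt were 10", which is the form of explanation the table describes.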

Feature Attribution

"What influenced this output?"

| Use Case | Approach |
|---|---|
| Text | Highlight influential words/phrases |
| Images | Show attention/saliency maps |
| Structured | Show feature importance |
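
For text, one simple attribution method is leave-one-out (occlusion): remove each word in turn and record how much the score changes. The sketch below assumes a toy lexicon-based scorer; `toy_score` and its word lists are hypothetical, and this HIP does not mandate any particular attribution method.

```python
POSITIVE = {"great", "excellent"}
NEGATIVE = {"poor"}


def toy_score(words):
    """Toy sentiment score: +1 per positive word, -1 per negative word."""
    return sum((w in POSITIVE) - (w in NEGATIVE) for w in words)


def leave_one_out(words, score_fn):
    """Attribution of each word = full score minus score with that word removed."""
    base = score_fn(words)
    return [
        (word, base - score_fn(words[:i] + words[i + 1:]))
        for i, word in enumerate(words)
    ]


sentence = ["great", "service", "poor", "parking"]
for word, attribution in leave_one_out(sentence, toy_score):
    print(f"{word}: {attribution:+d}")
```

Words with non-zero attribution are the ones a user interface would highlight as influential.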

Explanation Implementation

User-Facing Explanations

| Context | Explanation Type |
|---|---|
| Content refusal | Reason for refusal |
| Uncertain response | Confidence indication |
| Sourced claims | Citation/reference |
| Recommendations | Relevance factors |

Developer-Facing Explanations

| API Feature | Purpose |
|---|---|
| Logprobs | Token probability information |
| Confidence scores | Output certainty |
| Reasoning traces | Chain-of-thought (where applicable) |
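
To show how logprobs convey token probability information, the sketch below converts natural-log probabilities into probabilities and combines them into a sequence-level value. The `(token, logprob)` pair format is an assumption made for this example and is not a description of the Hanzo API response shape.

```python
import math

# Hypothetical response fragment: (token, natural-log probability) pairs.
token_logprobs = [("Paris", math.log(0.9)), (" is", math.log(0.5))]


def token_probabilities(pairs):
    """Convert each token's log probability into a probability in [0, 1]."""
    return [(token, math.exp(logprob)) for token, logprob in pairs]


def sequence_probability(pairs):
    """Joint probability of the sequence: exp of the summed log probabilities."""
    return math.exp(sum(logprob for _, logprob in pairs))


for token, prob in token_probabilities(token_logprobs):
    print(f"{token!r}: {prob:.2f}")
print(f"sequence: {sequence_probability(token_logprobs):.2f}")
```

Summing log probabilities before exponentiating avoids the numerical underflow that multiplying many small probabilities directly would cause on long outputs.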

Audit-Facing Explanations

| Artifact | Contents |
|---|---|
| Training logs | Training process documentation |
| Evaluation results | Detailed benchmark performance |
| Decision logs | Sample decision explanations |
| Attention analysis | Model attention patterns |

Documentation Standards

Public Documentation

| Document | Contents | Update Frequency |
|---|---|---|
| Product page | Capabilities, use cases | As features change |
| Help center | How to use, limitations | Continuous |
| Blog/announcements | Major updates, changes | As needed |
| Research papers | Technical details | On publication |

Technical Documentation

| Document | Contents | Audience |
|---|---|---|
| API documentation | Endpoints, parameters, examples | Developers |
| Model card | Model details, performance, limitations | All |
| System card | System-level information | Auditors |
| Safety documentation | Safety measures, testing | Regulators |

Internal Documentation

| Document | Contents | Access |
|---|---|---|
| Training documentation | Data, process, decisions | Internal + audit |
| Risk assessments | Identified risks, mitigations | Internal |
| Incident reports | Safety incidents, responses | Internal + regulators |

Audit & Verification

Internal Audit

| Audit Type | Frequency | Scope |
|---|---|---|
| Documentation review | Monthly | Accuracy, completeness |
| Disclosure compliance | Quarterly | All disclosure requirements |
| Explanation quality | Quarterly | User understanding |

External Audit

| Audit Type | Frequency | Auditor |
|---|---|---|
| Model audit | Annual | Third-party ML experts |
| Documentation audit | Annual | Compliance experts |
| User understanding study | Biennial | Research partners |

Verification Methods

| Method | Purpose |
|---|---|
| User surveys | Verify understanding of disclosures |
| A/B testing | Test explanation effectiveness |
| Expert review | Technical accuracy verification |
| Red team | Attempt to find undisclosed capabilities |

Special Contexts

High-Stakes Decisions

When AI influences significant decisions:

| Requirement | Implementation |
|---|---|
| Explicit AI role | Clear statement of AI's role |
| Human oversight | Indication of human review |
| Appeal process | How to contest decisions |
| Detailed explanation | Factors that influenced outcome |

Synthetic Content

For AI-generated content:

| Requirement | Implementation |
|---|---|
| Labeling | Clear AI-generated indicator |
| Watermarking | Technical watermark in media |
| Provenance | Metadata about creation |
| Detection tools | Tools to verify AI origin |

Research & Development

During model development:

| Requirement | Implementation |
|---|---|
| Internal documentation | Track decisions, data, methods |
| Reproducibility | Enable result reproduction |
| Version control | Track model versions |
| Change documentation | Document capability changes |

Governance

Transparency Oversight

| Role | Responsibility |
|---|---|
| Transparency Lead | Day-to-day compliance |
| Communications Team | Public-facing content |
| Legal Team | Regulatory compliance |
| ESG Committee | Policy oversight |

Review Process

| Activity | Frequency | Participants |
|---|---|---|
| Disclosure review | Monthly | Transparency Lead |
| Documentation audit | Quarterly | Cross-functional |
| Policy review | Annual | ESG Committee |

Escalation

| Issue | Escalation Path |
|---|---|
| Disclosure gap | Transparency Lead → Product Lead |
| Misleading content | Communications → Legal → ESG Committee |
| Regulatory concern | Legal → Board |
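
The escalation paths above can be encoded as ordered lists so that tooling can look up who handles an issue next. This is a minimal sketch: the issue keys and the `next_owner` helper are hypothetical, and only the role names and their order come from the table.

```python
# Escalation paths transcribed from the table above; dictionary keys are assumed names.
ESCALATION_PATHS = {
    "disclosure_gap": ["Transparency Lead", "Product Lead"],
    "misleading_content": ["Communications", "Legal", "ESG Committee"],
    "regulatory_concern": ["Legal", "Board"],
}


def next_owner(issue, current=None):
    """Return the first owner for an issue, or the owner after `current`.

    Returns None when `current` is already the end of the path.
    """
    path = ESCALATION_PATHS[issue]
    if current is None:
        return path[0]
    index = path.index(current)
    return path[index + 1] if index + 1 < len(path) else None


print(next_owner("misleading_content"))
print(next_owner("misleading_content", "Communications"))
print(next_owner("regulatory_concern", "Board"))
```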

Related HIPs

  • HIP-200: Responsible AI Principles
  • HIP-201: Model Risk Management
  • HIP-210: Safety Evaluation Framework
  • HIP-220: Bias Detection & Mitigation
  • HIP-240: AI Incident Response
  • HIP-250: Sustainability Standards Alignment

Changelog

| Version | Date | Changes |
|---|---|---|
| 1.0 | 2025-12-17 | Initial draft |

Copyright

Copyright and related rights waived via CC0.