HIP-201: Model Risk Management
Abstract
This HIP establishes the Model Risk Management (MRM) framework for Hanzo AI systems. It defines processes for identifying, assessing, monitoring, and mitigating risks throughout the AI model lifecycle, aligned with NIST AI RMF and SR 11-7 principles.
Scope
Covered Models
Model Type Risk Tier MRM Requirements Foundation models High Full MRM Fine-tuned models Medium-High Standard MRM Customer-deployed models Varies Risk-based MRM Internal tools Low Simplified MRM
Model Lifecycle Coverage
Design → Development → Validation → Deployment → Monitoring → Retirement
↑ ↑ ↑ ↑ ↑ ↑
MRM MRM MRM MRM MRM MRM
Risk Categories
Performance Risk
Risk Description Impact Accuracy degradation Model performance below thresholds User harm, business impact Distribution shift Training/production data mismatch Unreliable outputs Capability limitations Model cannot perform required tasks Unmet expectations
Safety Risk
Risk Description Impact Harmful outputs Generation of dangerous content User harm, reputation Jailbreaking Circumvention of safety measures Policy violations Misuse potential Use for malicious purposes Societal harm
Fairness Risk
Risk Description Impact Demographic bias Disparate performance across groups Discrimination Representation bias Underrepresentation in training data Exclusion Outcome bias Systematically unfair results Legal, ethical issues
Security Risk
Risk Description Impact Adversarial attacks Manipulated inputs causing errors Reliability Data poisoning Corrupted training data Model integrity Model theft Unauthorized model extraction IP loss Privacy leakage Training data exposure Data breach
Operational Risk
Risk Description Impact Availability failures Model downtime Service disruption Latency issues Response time degradation User experience Scalability limits Cannot handle load Capacity constraints Integration failures API/system incompatibility Technical debt
Risk Assessment
Risk Scoring
Likelihood Scale
Score Likelihood Description 1 Rare <5% probability 2 Unlikely 5-20% probability 3 Possible 20-50% probability 4 Likely 50-80% probability 5 Almost certain >80% probability
Impact Scale
Score Impact Description 1 Minimal Negligible harm, easily corrected 2 Minor Limited harm, manageable impact 3 Moderate Significant harm, notable impact 4 Major Serious harm, substantial impact 5 Severe Catastrophic harm, irreversible
Risk Matrix
Minimal Minor Moderate Major Severe Almost certain Medium High High Critical Critical Likely Low Medium High High Critical Possible Low Medium Medium High High Unlikely Low Low Medium Medium High Rare Low Low Low Medium Medium
Risk Tiering
Tier Criteria Requirements Critical Life-safety, large-scale harm Board approval, external review High Significant harm potential ESG Committee review Medium Moderate harm potential MRM Team review Low Limited harm potential Standard processes
MRM Processes
Pre-Development
Risk Assessment
Before model development:
Define intended use cases
Identify potential misuse scenarios
Assess risk tier
Document risk appetite
Define success criteria
Documentation Requirements
Document Contents Model proposal Use case, architecture, data sources Risk assessment Initial risk identification and scoring Validation plan Testing approach and criteria Monitoring plan Ongoing oversight requirements
Development Phase
Risk Controls
Control Implementation Data governance Quality checks, bias audits Training safeguards Alignment techniques, guardrails Version control Model versioning, reproducibility Access controls Role-based development access
Checkpoints
Checkpoint Timing Requirements Design review Before training Architecture approval Data review Before training Data quality sign-off Training review During training Progress monitoring Pre-validation After training Initial quality check
Validation Phase
Independent Validation
Requirements :
Validation team independent from development
Documented test methodology
Representative test data
Clear pass/fail criteria
Validation Scope :
Area Tests Performance Accuracy, robustness, calibration Safety Red teaming, harm testing Fairness Bias audits, demographic analysis Security Adversarial testing, privacy checks
Validation Report
Contents :
Methodology description
Test results by category
Identified issues and severity
Recommendations
Approval decision
Deployment Phase
Pre-Deployment Checklist
Item Verification ☐ Validation complete Validation report approved ☐ Documentation complete Model card, API docs ☐ Monitoring configured Dashboards, alerts ☐ Rollback plan Tested rollback procedure ☐ Approval obtained Appropriate sign-off
Staged Rollout
Stage Exposure Duration Exit Criteria Alpha Internal only 1 week No critical issues Beta 1% traffic 1 week Metrics stable Gradual 10% → 50% → 100% 2 weeks Full monitoring
Monitoring Phase
Continuous Monitoring
Metric Frequency Threshold Performance Real-time <X% degradation Safety Daily <Y incidents Fairness Weekly <Z% disparity Availability Real-time >99.9% uptime
Periodic Review
Review Frequency Scope Performance review Monthly Metrics, trends Risk review Quarterly Risk reassessment Full validation Annual Comprehensive revalidation
Retirement Phase
Retirement Triggers
Performance below acceptable thresholds
Superseded by better model
Unmitigable risk identified
Business decision
Retirement Process
Notification to stakeholders
Migration plan for users
Grace period (typically 90 days)
Archive model and documentation
Update model inventory
Governance
MRM Organization
Role Responsibility MRM Team Day-to-day risk management Model Validators Independent validation AI Safety Team Safety-specific risks ESG Committee Oversight, policy
Approval Authority
Risk Tier Approver Critical Board High ESG Committee Medium MRM Lead Low Team Lead
Model Inventory
Maintain comprehensive inventory:
Model identifier and version
Risk tier and assessment
Deployment status
Validation status
Owner and contacts
Documentation Standards
Model Card (Required)
Section Contents Model details Architecture, training data Intended use Use cases, out-of-scope uses Performance Metrics, benchmarks Limitations Known limitations, risks Ethical considerations Bias, fairness, safety
Risk Register
For each model:
Identified risks
Risk scores (likelihood × impact)
Control measures
Residual risk
Risk owner
Audit Trail
Maintain records of:
All risk assessments
Validation results
Approval decisions
Incidents and responses
Changes and rationale
Related HIPs
HIP-200 : Responsible AI Principles
HIP-210 : Safety Evaluation Framework
HIP-220 : Bias Detection & Mitigation
HIP-230 : AI Transparency & Explainability
HIP-240 : AI Incident Response
HIP-250 : Sustainability Standards Alignment
Changelog
Version Date Changes 1.0 2025-12-17 Initial draft
Copyright
Copyright and related rights waived via CC0 .