---
name: it-risk-management
description: Manage IT risks with risk assessments, risk registers, threat modeling, business impact analysis, risk treatment plans, risk reporting, and compliance alignment. Use when conducting IT risk assessments, maintaining risk registers, performing threat modeling, analyzing business impact, creating risk treatment plans, reporting risk posture, or aligning risks with compliance requirements. Triggers on phrases like "IT risk management", "risk assessment", "risk register", "threat modeling", "business impact analysis", "BIA", "risk treatment", "risk reporting", "risk posture", "enterprise risk", "IT risk", "risk framework", "risk quantification", "risk appetite", "residual risk", "inherent risk".
---

# IT Risk Management

Systematically identify, assess, treat, and monitor IT risks across the organization.

## Workflow

1. Establish risk framework: risk taxonomy, assessment methodology, risk appetite statement, reporting cadence.
2. Identify risks: asset inventory, threat modeling, vulnerability assessment, dependency mapping, threat intelligence.
3. Assess risks: likelihood × impact scoring, qualitative and quantitative analysis, business impact analysis.
4. Treat risks: accept, mitigate, transfer, or avoid — with treatment plans, owners, timelines.
5. Monitor risks: continuous monitoring, key risk indicators, risk trend analysis, control effectiveness testing.
6. Report risks: executive dashboards, board reports, audit findings, regulatory disclosures.
7. Review and improve: quarterly risk review, annual risk assessment, lessons learned from incidents.
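
As a minimal sketch of what steps 4 and 5 need to track, a register entry could be modeled as below. The four treatment options come from the workflow; the status values and treatment windows mirror the register summary later in this document. The class name, field names, and the zero-day window for critical risks (the document only says "immediate") are assumptions for illustration, not a prescribed schema.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum


class TreatmentOption(Enum):
    ACCEPT = "accept"
    MITIGATE = "mitigate"
    TRANSFER = "transfer"
    AVOID = "avoid"


class Status(Enum):
    OPEN = "open"
    IN_PROGRESS = "in_progress"
    MONITORING = "monitoring"
    ACCEPTED = "accepted"
    CLOSED = "closed"


# Days allowed to start treatment, by residual severity.
# CRITICAL = 0 is an assumption standing in for "immediate".
TREATMENT_WINDOW_DAYS = {"CRITICAL": 0, "HIGH": 30, "MEDIUM": 90, "LOW": 180}


@dataclass
class RiskEntry:
    risk_id: str
    title: str
    category: str
    owner: str
    severity: str  # one of the TREATMENT_WINDOW_DAYS keys
    treatment: TreatmentOption
    status: Status
    identified_on: date

    def treatment_due(self) -> date:
        """Latest date by which treatment should be under way."""
        return self.identified_on + timedelta(days=TREATMENT_WINDOW_DAYS[self.severity])
```

Keeping the owner and due date on the entry itself makes overdue treatments easy to surface during the quarterly review.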

## Risk Framework

### Enterprise IT Risk Architecture

```
IT RISK MANAGEMENT — ENTERPRISE FRAMEWORK
===========================================

Risk Framework: ISO 31000 + NIST RMF (SP 800-37) + FAIR (Factor Analysis of Information Risk)
Risk Governance: CISO (ownership) + Risk Committee (quarterly review) + Board (annual report)
Risk Assessment Frequency: Continuous monitoring + Quarterly assessment + Annual comprehensive review
Risk Register: 148 active risks (across all IT domains)

RISK TAXONOMY:
  ┌───────────────────────────────────────┬────────────┬────────────────────┬──────────────────────┐
  │ Risk Category                        │ Risks      │ % of Total         │ Avg Severity         │
  ├───────────────────────────────────────┼────────────┼────────────────────┼──────────────────────┤
  │ Cybersecurity Threats                │ 34         │ 23.0%              │ HIGH                 │
  │ Data Privacy & Compliance            │ 22         │ 14.9%              │ HIGH                 │
  │ Infrastructure & Operations          │ 28         │ 18.9%              │ MEDIUM               │
  │ Application Security                 │ 18         │ 12.2%              │ HIGH                 │
  │ Cloud & Third-Party Risk             │ 16         │ 10.8%              │ MEDIUM               │
  │ Business Continuity & DR             │ 12         │ 8.1%               │ MEDIUM               │
  │ Identity & Access Management         │ 8          │ 5.4%               │ HIGH                 │
  │ Supply Chain & Vendor Risk           │ 6          │ 4.1%               │ MEDIUM               │
  │ Emerging Technology Risk             │ 4          │ 2.7%               │ LOW                  │
  │ People & Culture                     │ 4          │ 2.7%               │ LOW                  │
  │ Financial & Cost Risk                │ 2          │ 1.4%               │ LOW                  │
  │ Regulatory & Legal                   │ 4          │ 2.7%               │ MEDIUM               │
  └───────────────────────────────────────┴────────────┴────────────────────┴──────────────────────┘
  Total active risks: 148 | New risks added (last quarter): 12 | Risks closed: 8

RISK SCORING METHODOLOGY:
  Inherent Risk = Likelihood × Impact (before controls)
  Residual Risk = Inherent Risk × (1 - Control Effectiveness) (after controls)
  
  Likelihood Scale (1-5):
    1 — Rare (< 1% annual probability, less than once every 100 years)
    2 — Unlikely (1-5% annual probability, once every 20-100 years)
    3 — Possible (5-20% annual probability, once every 5-20 years)
    4 — Likely (20-50% annual probability, once every 2-5 years)
    5 — Almost Certain (> 50% annual probability, at least once every 2 years)
  
  Impact Scale (1-5):
    1 — Negligible (< $10K financial, < 1 hour downtime, no customer impact)
    2 — Minor ($10K-$50K, 1-4 hours downtime, limited customer impact)
    3 — Moderate ($50K-$250K, 4-12 hours downtime, significant customer impact)
    4 — Major ($250K-$1M, 12-48 hours downtime, widespread customer impact)
    5 — Catastrophic (> $1M, > 48 hours downtime, regulatory/legal consequences)
  
  Risk Matrix:
                              IMPACT
                     1     2     3     4     5
              5 |    L     M     H     CH    CH   |
              4 |    L     M     M     H     CH   |
  LIKELIHOOD  3 |    L     L     M     H     H    |
              2 |    L     L     L     M     H    |
              1 |    L     L     L     L     M    |
  
      L = LOW | M = MEDIUM | H = HIGH | CH = CRITICAL

RISK REGISTER SUMMARY:
  By Severity (Residual Risk):
    Critical: 4 risks (2.7%) — require immediate treatment, executive attention
    High: 18 risks (12.2%) — treatment within 30 days
    Medium: 56 risks (37.8%) — treatment within 90 days
    Low: 70 risks (47.3%) — treatment within 180 days or accept
  
  By Treatment Status:
    Open (no treatment started): 22 risks (14.9%) — new or inherited
    In Progress (treatment underway): 48 risks (32.4%) — active remediation
    Monitoring (controls in place, monitoring effectiveness): 58 risks (39.2%)
    Accepted (risk accepted by management): 14 risks (9.5%) — within risk appetite
    Closed (treatment completed, verified): 6 risks (4.1%) — resolved this quarter

RISK APPETITE STATEMENT:
  The organization's risk appetite for IT risks:
    Cybersecurity: LOW appetite (zero tolerance for data breaches, unauthorized access)
    Compliance: LOW appetite (zero tolerance for regulatory violations)
    Availability: MEDIUM appetite (accept brief, infrequent outages if they enable innovation)
    Innovation: HIGH appetite (willing to accept moderate risks for competitive advantage)
    Financial: LOW appetite (strict budget controls, no unexpected costs > $50K)
    Reputation: LOW appetite (protect brand, minimize customer-facing incidents)
```
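
The scoring methodology above can be sketched in a few lines. This is an illustrative sketch rather than a reference implementation: the matrix is transcribed from the Risk Matrix above, the residual formula is the one stated in the methodology, and the function names are assumptions.

```python
# Severity matrix transcribed from the Risk Matrix above.
# Keys are likelihood (1-5); list positions are impact 1 through 5.
RISK_MATRIX = {
    5: ["L", "M", "H", "CH", "CH"],
    4: ["L", "M", "M", "H", "CH"],
    3: ["L", "L", "M", "H", "H"],
    2: ["L", "L", "L", "M", "H"],
    1: ["L", "L", "L", "L", "M"],
}
LABELS = {"L": "LOW", "M": "MEDIUM", "H": "HIGH", "CH": "CRITICAL"}


def _validate(likelihood: int, impact: int) -> None:
    if likelihood not in RISK_MATRIX or impact not in range(1, 6):
        raise ValueError("likelihood and impact must be integers from 1 to 5")


def inherent_score(likelihood: int, impact: int) -> int:
    """Inherent Risk = Likelihood x Impact (before controls)."""
    _validate(likelihood, impact)
    return likelihood * impact


def severity(likelihood: int, impact: int) -> str:
    """Look the severity label up in the matrix instead of deriving it from the product."""
    _validate(likelihood, impact)
    return LABELS[RISK_MATRIX[likelihood][impact - 1]]


def residual_score(inherent: float, control_effectiveness: float) -> float:
    """Residual Risk = Inherent Risk x (1 - Control Effectiveness)."""
    if not 0.0 <= control_effectiveness <= 1.0:
        raise ValueError("control_effectiveness must be between 0 and 1")
    return inherent * (1.0 - control_effectiveness)
```

The lookup matters because the matrix weights impact more heavily than likelihood: `severity(4, 3)` and `severity(3, 4)` both have an inherent score of 12, yet the matrix rates the first MEDIUM and the second HIGH.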

### Critical Risk Details

```
CRITICAL RISKS — DETAILED ANALYSIS
====================================

RISK #1: Ransomware Attack on Production Environment
  ┌────────────────────────────────────────────────────────────────────────────────┐
  │ INHERENT RISK: CRITICAL (Likelihood: 3 × Impact: 5 = 15)                       │
  │ RESIDUAL RISK: HIGH (Inherent: 15 × (1 - Control Effectiveness: 60%) = 6.0)    │
  │ ANNUALIZED LOSS EXPOSURE (ALE): $420,000                                       │
  │ RISK OWNER: CISO                                                               │
  │ REVIEW CADENCE: Monthly                                                        │
  └────────────────────────────────────────────────────────────────────────────────┘
  
  Threat Description:
    Sophisticated ransomware attack targeting production infrastructure, encrypting
    databases, file servers, and backup systems. Attacker demands $2M+ ransom within 72 hours.
  
  Attack Vectors:
    1. Phishing email with malicious attachment (most common — 45% of breaches)
    2. Exploited vulnerability in internet-facing application (30%)
    3. Compromised vendor credentials (15%)
    4. Insider threat (malicious or negligent) (5%)
    5. Supply chain compromise (5%)
  
  Impact Analysis:
    Financial: $420,000 annualized loss exposure (single event: $85K/day × 5 days = $425K downtime, plus $2M ransom if paid)
    Operational: Complete production outage (all services down, estimated 5-10 days to recover from backup)
    Customer: 100% of customers affected (no access to services, potential data loss)
    Regulatory: GDPR notification required within 72 hours of becoming aware of the breach (if personal data is affected)
    Reputation: Severe brand damage (media coverage, customer trust loss, stock impact)
  
  Existing Controls:
    1. Endpoint Detection & Response (EDR): CrowdStrike Falcon — 95% effective
    2. Network segmentation: Production isolated from user network — 80% effective
    3. Backup strategy: 3-2-1 rule (3 copies, 2 media types, 1 offsite) — 70% effective
       (offsite backup tested quarterly, last test: September 2024, successful)
    4. Email filtering: Proofpoint — 90% effective (blocks phishing before endpoint)
    5. Patch management: Critical patches within 7 days — 85% effective
    6. User awareness training: Quarterly phishing simulations — 75% effective
  
  Treatment Plan:
    1. IMMEDIATE (0-30 days):
       [✓] Deploy EDR to all endpoints (completed — 100% coverage)
       [✓] Enable immutable backups (WORM storage for critical data)
       [ ] Test ransomware recovery drill (scheduled January 2025)
       [ ] Enable network micro-segmentation (production database tier)
    
    2. SHORT-TERM (30-90 days):
       [ ] Deploy SPF, DKIM, and DMARC enforcement (reduce spoofed-sender phishing)
       [ ] Implement privileged access management (PAM) — just-in-time access
       [ ] Enable backup encryption (protects backup contents if exfiltrated; immutability is what stops attackers from encrypting backups)
       [ ] Conduct red team exercise (simulated ransomware attack)
    
    3. LONG-TERM (90-180 days):
       [ ] Deploy zero-trust network architecture (eliminate lateral movement)
       [ ] Implement data loss prevention (DLP) — prevent data exfiltration
       [ ] Establish cyber insurance coverage ($10M policy)
       [ ] Deploy AI-based threat hunting (proactive detection)
    
    Estimated treatment cost: $280,000 (reduces ALE from $420K to $120K — 71% reduction)

RISK #2: Cloud Provider Regional Outage
  ┌────────────────────────────────────────────────────────────────────────────────┐
  │ INHERENT RISK: HIGH (Likelihood: 2 × Impact: 5 = 10)                           │
  │ RESIDUAL RISK: MEDIUM (Inherent: 10 × (1 - Control Effectiveness: 50%) = 5.0)  │
  │ ANNUALIZED LOSS EXPOSURE (ALE): $150,000                                       │
  │ RISK OWNER: VP of Infrastructure                                               │
  │ REVIEW CADENCE: Quarterly                                                      │
  └────────────────────────────────────────────────────────────────────────────────┘
  
  Threat Description:
    AWS us-east-1 regional outage (complete or partial) affecting all services
    deployed in single region. Duration: 2-24 hours (based on AWS historical outages).
  
  Impact Analysis:
    Financial: $150,000 estimated loss ($85K/day × partial day + SLA credits)
    Operational: 100% service unavailability (single region, no failover)
    Customer: All customers affected during outage window
    SLA: Breach of 99.9% availability commitment (if outage > 43 minutes/month)
  
  Existing Controls:
    1. Multi-AZ deployment: Services deployed across 3 availability zones — 80% effective
       (protects against AZ failure, not regional failure)
    2. CDN caching: CloudFront caches static content at edge — 60% effective
       (keeps read-only content available during origin outage)
    3. Auto-scaling: Handle partial capacity loss within region — 70% effective
    4. AWS Service Health Dashboard monitoring: 50% effective (awareness only, no mitigation)
  
  Treatment Plan:
    1. IMMEDIATE (0-30 days):
       [ ] Document multi-region failover runbook (step-by-step, tested)
       [ ] Configure Route 53 failover routing (DNS-based failover)
    
    2. SHORT-TERM (30-90 days):
       [ ] Deploy active-passive in secondary region (us-west-2)
       [ ] Test cross-region database replication (Aurora Global Database or RDS cross-Region read replicas)
       [ ] Conduct multi-region failover drill (simulate regional outage)
    
    3. LONG-TERM (90-180 days):
       [ ] Deploy active-active multi-region architecture
       [ ] Implement global load balancing (latency-based routing)
       [ ] Cross-region data synchronization (real-time, not async)
    
    Estimated treatment cost: $450,000 (active-active architecture)
    Expected benefit: Removes the single-region dependency and reduces ALE from $150K to $15K (90% reduction)

RISK #3: Critical Data Breach (PII Exposure)
  ┌────────────────────────────────────────────────────────────────────────────────┐
  │ INHERENT RISK: CRITICAL (Likelihood: 3 × Impact: 5 = 15)                       │
  │ RESIDUAL RISK: HIGH (Inherent: 15 × (1 - Control Effectiveness: 55%) = 6.75)   │
  │ ANNUALIZED LOSS EXPOSURE (ALE): $580,000                                       │
  │ RISK OWNER: CISO + Data Protection Officer                                     │
  │ REVIEW CADENCE: Monthly                                                        │
  └────────────────────────────────────────────────────────────────────────────────┘
  
  Threat Description:
    Unauthorized access to customer database containing PII (names, emails, addresses,
    payment tokens). Data exfiltrated and published on dark web. 500,000+ customer records exposed.
  
  Impact Analysis:
    Financial: $580,000 estimated loss
      - Regulatory fines: $200,000 (GDPR: up to €20M or 4% of global annual turnover, whichever is higher)
      - Customer notification: $150,000 (credit monitoring, legal, PR, call center)
      - Business loss: $180,000 (churn, reputational damage, litigation)
      - Forensic investigation: $50,000 (third-party incident response)
    Regulatory: GDPR notification within 72 hours, potential ICO investigation
    Customer: 500,000+ customers notified, mandatory credit monitoring for 2 years
    Reputation: Severe — media coverage, customer trust erosion, potential class-action lawsuit
  
  Existing Controls:
    1. Database encryption: AES-256 at rest — 85% effective
    2. Network segmentation: Database tier isolated — 80% effective
    3. Access controls: Role-based access, MFA required — 75% effective
    4. DLP: Data Loss Prevention monitoring — 60% effective
    5. Audit logging: All database access logged — 70% effective (detection, not prevention)
    6. WAF: Web Application Firewall — 65% effective (blocks common injection attacks)
  
  Treatment Plan:
    1. IMMEDIATE (0-30 days):
       [ ] Enable database activity monitoring (real-time alerting on unusual queries)
       [ ] Implement tokenization for sensitive fields (payment data, SSN)
    
    2. SHORT-TERM (30-90 days):
       [ ] Deploy data classification and labeling (identify all PII locations)
       [ ] Implement database firewall (block unauthorized query patterns)
       [ ] Enable query result size limits (prevent bulk data exfiltration)
    
    3. LONG-TERM (90-180 days):
       [ ] Deploy data loss prevention (DLP) with machine learning
       [ ] Implement zero-trust database access (just-in-time, just-enough privileges)
       [ ] Conduct data breach simulation (tabletop exercise with leadership)
       [ ] Establish data breach response team (pre-designated roles, contact list)
    
    Estimated treatment cost: $320,000 (reduces ALE from $580K to $180K — 69% reduction)
```
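
The cost and ALE figures in the three treatment plans above can be compared with a small calculation. The sketch below assumes the treatment cost is a one-time spend and the ALE reduction recurs every year; the document does not state either, so treat the payback figure as a rough indicator only. The class and property names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TreatmentPlan:
    risk: str
    cost: float        # estimated treatment cost in USD (assumed one-time)
    ale_before: float  # annualized loss exposure before treatment
    ale_after: float   # annualized loss exposure after treatment

    @property
    def ale_reduction(self) -> float:
        return self.ale_before - self.ale_after

    @property
    def reduction_pct(self) -> float:
        return 100.0 * self.ale_reduction / self.ale_before

    @property
    def payback_years(self) -> float:
        """Years of avoided loss needed to cover the treatment cost."""
        if self.ale_reduction <= 0:
            return float("inf")
        return self.cost / self.ale_reduction


# Figures taken from the treatment plans for Risks #1 to #3 above.
PLANS = [
    TreatmentPlan("Ransomware attack", 280_000, 420_000, 120_000),
    TreatmentPlan("Cloud regional outage", 450_000, 150_000, 15_000),
    TreatmentPlan("PII data breach", 320_000, 580_000, 180_000),
]

for plan in sorted(PLANS, key=lambda p: p.payback_years):
    print(
        f"{plan.risk}: ALE reduced by ${plan.ale_reduction:,.0f} "
        f"({plan.reduction_pct:.0f}%), payback in {plan.payback_years:.1f} years"
    )
```

Under these assumptions the ransomware and data breach treatments pay back in under a year, while the multi-region build takes more than three years, which is worth stating explicitly when the plans compete for the same budget.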

### Business Impact Analysis

```
BUSINESS IMPACT ANALYSIS (BIA) — CRITICAL SYSTEMS
===================================================

BIA Methodology: Qualitative + Quantitative analysis, stakeholder interviews, historical data review
Last BIA Completed: October 2024 | Next BIA: April 2025 | BIA Owner: IT Risk Manager

CRITICAL BUSINESS FUNCTIONS:
  ┌────────────────────────────────────────┬───────────┬───────────┬────────────────┬──────────────────┐
  │ Business Function                      │ RTO (hrs) │ RPO (hrs) │ Max Tolerable  │ Financial Impact │
  │                                        │           │           │ Downtime (MTD) │ Per Hour         │
  ├────────────────────────────────────────┼───────────┼───────────┼────────────────┼──────────────────┤
  │ E-commerce platform (order processing) │ 1         │ 0.5       │ 4 hours        │ $25,000          │
  │ Payment processing                     │ 0.5       │ 0         │ 2 hours        │ $40,000          │
  │ Customer authentication                │ 1         │ 0.5       │ 2 hours        │ $15,000          │
  │ Inventory management                   │ 4         │ 1         │ 8 hours        │ $8,000           │
  │ Customer data (CRM)                    │ 4         │ 1         │ 8 hours        │ $5,000           │
  │ Email/notification service             │ 8         │ 4         │ 24 hours       │ $3,000           │
  │ Analytics/reporting                    │ 24        │ 12        │ 48 hours       │ $2,000           │
  │ Internal tools (HR, finance)           │ 8         │ 4         │ 24 hours       │ $4,000           │
  │ Developer environment                  │ 24        │ 24        │ 72 hours       │ $1,000           │
  │ Documentation/wiki                     │ 48        │ 24        │ 72 hours       │ $500             │
  └────────────────────────────────────────┴───────────┴───────────┴────────────────┴──────────────────┘

  RTO = Recovery Time Objective (how quickly must system be restored)
  RPO = Recovery Point Objective (how much data loss is acceptable)
  MTD = Maximum Tolerable Downtime (absolute maximum before irreversible damage)

DEPENDENCY MAPPING (Critical Systems):
  E-commerce Platform depends on:
    → Payment Processing (hard dependency — cannot process orders without payment)
    → Customer Authentication (hard dependency — cannot identify customer)
    → Inventory Management (soft dependency — can process with cached inventory, refresh later)
    → Email Service (soft dependency — can queue emails, send when available)
  
  Payment Processing depends on:
    → Customer Authentication (hard dependency — must verify user)
    → Database (hard dependency — transaction records)
    → External Payment Gateway (hard dependency — Stripe/Adyen, out of our control)
  
  Customer Authentication depends on:
    → Database (hard dependency — user credentials)
    → Redis Cache (soft dependency — session store, can fall back to DB)
    → MFA Provider (soft dependency — can allow password-only if MFA down, with risk)

BUSINESS CONTINUITY PRIORITY:
  Tier 1 (Restore within 1 hour): Payment Processing, Customer Authentication, E-commerce Platform
  Tier 2 (Restore within 4 hours): Inventory Management, CRM
  Tier 3 (Restore within 8 hours): Email/Notification, Internal Tools
  Tier 4 (Restore within 24 hours): Analytics, Developer Environment
  Tier 5 (Restore within 48 hours): Documentation, Non-critical systems

DISASTER RECOVERY ALIGNMENT:
  Systems meeting RTO/RPO targets: 7/10 (70%)
  Systems NOT meeting targets:
    1. Payment Processing: Current RTO = 2 hours (target: 0.5 hours)
       Gap: 1.5 hours | Action: Implement active-active failover (ETA: 3 months)
    
    2. Customer Authentication: Current RPO = 1 hour (target: 0.5 hours)
       Gap: 0.5 hours | Action: Implement real-time database replication (ETA: 2 months)
    
    3. Email Service: Current RTO = 24 hours (target: 8 hours)
       Gap: 16 hours | Action: Deploy redundant email service in second region (ETA: 1 month)
```
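
One check the dependency mapping above makes possible is whether any system has a tighter RTO than something it hard-depends on, since a system cannot be restored before its hard dependencies are. The sketch below uses the RTO targets and hard dependencies from the BIA; the function name is an assumption, and Database and the external payment gateway are omitted because the BIA table lists no RTO for them.

```python
# RTO targets in hours, from the BIA table above.
RTO_HOURS = {
    "E-commerce Platform": 1.0,
    "Payment Processing": 0.5,
    "Customer Authentication": 1.0,
}

# Hard dependencies from the dependency mapping above (soft dependencies
# are excluded because the system can run degraded without them).
HARD_DEPENDENCIES = {
    "E-commerce Platform": ["Payment Processing", "Customer Authentication"],
    "Payment Processing": ["Customer Authentication"],
    "Customer Authentication": [],
}


def rto_conflicts(rto, dependencies):
    """Return (system, dependency, system_rto, dependency_rto) for each
    hard dependency whose RTO is longer than the dependent system's RTO."""
    conflicts = []
    for system, upstream in dependencies.items():
        for dependency in upstream:
            if rto[dependency] > rto[system]:
                conflicts.append((system, dependency, rto[system], rto[dependency]))
    return conflicts


for system, dependency, target, dep_target in rto_conflicts(RTO_HOURS, HARD_DEPENDENCIES):
    print(f"{system} (RTO {target}h) hard-depends on {dependency} (RTO {dep_target}h)")
```

With the table values as given, this flags Payment Processing (0.5-hour RTO) against Customer Authentication (1-hour RTO), a gap worth resolving in the next BIA either by tightening the authentication target or relaxing the payment one.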

## Integration Points

- Risk management: RiskLens (FAIR), MetricStream, LogicManager, ServiceNow GRC, RSA Archer, Diligent
- Threat modeling: OWASP Threat Dragon, Microsoft Threat Modeling Tool, IriusRisk
- Vulnerability management: Qualys, Tenable, Rapid7, Nessus, OpenVAS
- Business impact analysis: RiskMethods, Master the Metrics, custom BIA tools
- Incident response: TheHive, Cortex XSOAR, Splunk SOAR (formerly Phantom), PagerDuty, Jira Service Management
- Compliance: Vanta, Drata, Secureframe, Sprinto, Laika
- Asset management: ServiceNow CMDB, Lansweeper, Snipe-IT, SolarWinds IPAM
- SIEM: Splunk, Microsoft Sentinel (formerly Azure Sentinel), IBM QRadar, Sumo Logic, Datadog Cloud SIEM
- Threat intelligence: MISP, AlienVault OTX, VirusTotal, Recorded Future, CrowdStrike Intel
- Reporting: Tableau, Power BI, Grafana, custom dashboards
- Communication: Slack, Microsoft Teams, email (risk notifications, board reports)
- Insurance: Cyber insurance providers (Aon, Chubb, Axis, Hiscox) — for risk transfer

## Edge Cases

- **Risk scoring subjectivity**: Two risk assessors score the same risk differently (one says HIGH, another says MEDIUM). Resolution: (1) standardized scoring rubric (clear definitions for each likelihood/impact level), (2) calibration sessions (assessors score same sample risks, compare, align), (3) quantitative analysis where possible (FAIR model — dollar-based, not subjective), (4) consensus process (disputed scores reviewed by Risk Committee), (5) historical validation (compare risk scores to actual incidents, adjust scoring).

- **Risk treatment budget exhausted**: 148 risks identified, but budget only covers top 20 treatments. Resolution: (1) prioritize by ALE (Annualized Loss Exposure — treat highest financial exposure first), (2) group treatments (one control may address multiple risks — efficiency), (3) risk acceptance (formal acceptance for low-priority risks, documented by management), (4) phased treatment plan (year 1: critical/high, year 2: medium, year 3: low), (5) risk transfer (cyber insurance for residual risk, reduces need for treatment).

- **Third-party risk discovery**: Critical vendor has poor security posture discovered after contract signed. Resolution: (1) immediate assessment (scope of risk — what data do they access, what services depend on them), (2) risk mitigation request (vendor must improve security within 90 days, or contract terminated), (3) compensating controls (reduce data shared, add monitoring, limit access), (4) contingency planning (identify alternative vendor, prepare migration plan), (5) contract amendment (add security requirements, audit rights, termination clauses).

- **Risk acceptance challenged by audit**: Management accepted a risk, but external auditor disagrees (considers it unacceptable). Resolution: (1) documented risk acceptance (management signed off, rationale documented), (2) compensating controls evidence (show alternative controls address risk), (3) risk treatment timeline (show plan to address risk within defined period), (4) auditor discussion (explain risk appetite, provide context), (5) escalate if unresolved (Risk Committee decides, Board if needed).

- **Emerging risk not in taxonomy**: New threat vector (e.g., AI-powered attacks, quantum computing risk) doesn't fit existing risk categories. Resolution: (1) risk taxonomy review (quarterly, add new categories as needed), (2) threat intelligence feed (subscribe to emerging threat reports), (3) red team assessment (proactively test for new attack vectors), (4) scenario planning (tabletop exercises for emerging threats), (5) expert consultation (external security advisors, industry forums).

- **Risk fatigue**: Too many risks in register, teams overwhelmed, risk management becomes checkbox exercise. Resolution: (1) risk consolidation (combine related risks, reduce register size), (2) risk aging (close risks that are no longer relevant, reduce clutter), (3) risk prioritization (focus on top 20% of risks that drive 80% of exposure), (4) risk automation (automated risk scoring, continuous monitoring — reduce manual effort), (5) risk culture (embed risk thinking in daily work, not separate process).

- **Inherited risk from acquisition**: Acquired company has different (poorer) security posture, introducing new risks. Resolution: (1) pre-acquisition due diligence (security assessment before signing), (2) post-acquisition assessment (comprehensive risk inventory within 30 days), (3) risk remediation plan (90-day plan to bring acquired systems to corporate standard), (4) interim controls (network segmentation, access review, monitoring), (5) integration timeline (6-month plan for full security integration).

- **Regulatory change creating new risks**: New regulation (e.g., EU AI Act, state privacy laws) creates compliance risk for existing systems. Resolution: (1) regulatory monitoring (subscribe to regulatory change alerts), (2) impact assessment (identify which systems affected, what changes needed), (3) gap analysis (current state vs regulatory requirement), (4) remediation plan (prioritized, resourced, timeline), (5) compliance verification (test before enforcement date, third-party audit if needed).

- **Risk reporting to non-technical board**: Board doesn't understand technical risk details (CVSS scores, CVEs, technical controls). Resolution: (1) executive summary (one-page, risk heat map, financial impact, key decisions needed), (2) business language (translate technical risks to business impact — revenue, customers, reputation), (3) visual dashboards (traffic light system: red/yellow/green, trend arrows), (4) benchmarking (compare to industry peers, show relative posture), (5) decision-focused (clearly state what board needs to decide, with recommendations).

- **False sense of security from controls**: Controls are in place and effective, but new attack vector bypasses them entirely. Resolution: (1) assume breach (design security assuming attacker is already inside), (2) defense in depth (multiple layers of controls, no single point of failure), (3) red team exercises (test controls against real attackers, not just compliance checklists), (4) threat-led penetration testing (simulate specific threat actors, not generic tests), (5) continuous improvement (controls reviewed and updated quarterly, not set-and-forget).
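
For the budget-exhausted case above, the "highest ALE first" rule can be sketched as a simple greedy selection. The candidate figures are the three treatment plans from the Critical Risk Details section; the $600,000 budget and the function name are hypothetical, chosen only to show the mechanics.

```python
def select_treatments(candidates, budget):
    """Fund treatments in descending ALE order, skipping any that no longer fit.

    candidates: iterable of (name, ale, cost) tuples.
    Returns (funded_names, remaining_budget).
    """
    funded = []
    remaining = budget
    for name, ale, cost in sorted(candidates, key=lambda c: c[1], reverse=True):
        if cost <= remaining:
            funded.append(name)
            remaining -= cost
    return funded, remaining


# (name, current ALE, estimated treatment cost) from the critical risk details.
CANDIDATES = [
    ("Ransomware attack", 420_000, 280_000),
    ("Cloud regional outage", 150_000, 450_000),
    ("PII data breach", 580_000, 320_000),
]

HYPOTHETICAL_BUDGET = 600_000  # illustrative figure, not from the register

funded, remaining = select_treatments(CANDIDATES, HYPOTHETICAL_BUDGET)
print("Funded:", funded)
print("Remaining budget:", remaining)
```

Risks left unfunded by a pass like this are the ones that need a documented acceptance, a later phase, or a transfer decision. Ranking by ALE alone ignores how much each treatment actually reduces exposure, so ranking by ALE reduction per dollar is a reasonable variation.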
