IT Risk Management

Manage IT risks with risk assessments, risk registers, threat modeling, business impact analysis, risk treatment plans, risk reporting, and compliance alignment. Use when conducting IT risk assessments, maintaining risk registers, performing threat modeling...

IT Risk Management

Systematically identify, assess, treat, and monitor IT risks across the organization.

Workflow

  1. Establish risk framework: risk taxonomy, assessment methodology, risk appetite statement, reporting cadence.
  2. Identify risks: asset inventory, threat modeling, vulnerability assessment, dependency mapping, threat intelligence.
  3. Assess risks: likelihood × impact scoring, qualitative and quantitative analysis, business impact analysis.
  4. Treat risks: accept, mitigate, transfer, or avoid — with treatment plans, owners, timelines.
  5. Monitor risks: continuous monitoring, key risk indicators, risk trend analysis, control effectiveness testing.
  6. Report risks: executive dashboards, board reports, audit findings, regulatory disclosures.
  7. Review and improve: quarterly risk review, annual risk assessment, lessons learned from incidents.

Risk Framework

Enterprise IT Risk Architecture

IT RISK MANAGEMENT — ENTERPRISE FRAMEWORK
===========================================

Risk Framework: ISO 31000 + NIST RMF (SP 800-37) + FAIR (Factor Analysis of Information Risk)
Risk Governance: CISO (ownership) + Risk Committee (quarterly review) + Board (annual report)
Risk Assessment Frequency: Continuous monitoring + Quarterly assessment + Annual comprehensive review
Risk Register: 148 active risks (across all IT domains)

RISK TAXONOMY:
  ┌───────────────────────────────────────┬────────────┬────────────────────┬──────────────────────┐
  │ Risk Category                        │ Risks      │ % of Total         │ Avg Severity         │
  ├───────────────────────────────────────┼────────────┼────────────────────┼──────────────────────┤
  │ Cybersecurity Threats                │ 34         │ 23.0%              │ HIGH                 │
  │ Data Privacy & Compliance            │ 22         │ 14.9%              │ HIGH                 │
  │ Infrastructure & Operations          │ 28         │ 18.9%              │ MEDIUM               │
  │ Application Security                 │ 18         │ 12.2%              │ HIGH                 │
  │ Cloud & Third-Party Risk             │ 16         │ 10.8%              │ MEDIUM               │
  │ Business Continuity & DR             │ 12         │ 8.1%               │ MEDIUM               │
  │ Identity & Access Management         │ 8          │ 5.4%               │ HIGH                 │
  │ Supply Chain & Vendor Risk           │ 6          │ 4.1%               │ MEDIUM               │
  │ Emerging Technology Risk             │ 4          │ 2.7%               │ LOW                  │
  │ People & Culture                     │ 4          │ 2.7%               │ LOW                  │
  │ Financial & Cost Risk                │ 2          │ 1.4%               │ LOW                  │
  │ Regulatory & Legal                   │ 4          │ 2.7%               │ MEDIUM               │
  └───────────────────────────────────────┴────────────┴────────────────────┴──────────────────────┘
  Total active risks: 148 | New risks added (last quarter): 12 | Risks closed: 8
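The percentages in the table are each category's count divided by the 148-risk register total. The sketch below reproduces that calculation and adds a reconciliation check. The check, and the names CATEGORY_COUNTS, category_shares, and reconciliation_gap, are illustrative assumptions rather than part of the framework. With the counts as printed, the categories sum to 158, so the check reports a gap of 10 that would need review (for example, risks tagged under more than one category).

```python
# Category counts copied from the taxonomy table above.
CATEGORY_COUNTS = {
    "Cybersecurity Threats": 34,
    "Data Privacy & Compliance": 22,
    "Infrastructure & Operations": 28,
    "Application Security": 18,
    "Cloud & Third-Party Risk": 16,
    "Business Continuity & DR": 12,
    "Identity & Access Management": 8,
    "Supply Chain & Vendor Risk": 6,
    "Emerging Technology Risk": 4,
    "People & Culture": 4,
    "Financial & Cost Risk": 2,
    "Regulatory & Legal": 4,
}
REGISTER_TOTAL = 148  # total active risks stated in the register


def category_shares(counts, total):
    """Return each category's share of the register total, in percent (1 decimal)."""
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


def reconciliation_gap(counts, total):
    """Return sum(counts) - total; a non-zero result means the table needs review."""
    return sum(counts.values()) - total


if __name__ == "__main__":
    for name, pct in category_shares(CATEGORY_COUNTS, REGISTER_TOTAL).items():
        print(f"{name:<32}{pct:>5.1f}%")
    print("gap vs register total:", reconciliation_gap(CATEGORY_COUNTS, REGISTER_TOTAL))
```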

RISK SCORING METHODOLOGY:
  Inherent Risk = Likelihood × Impact (before controls)
  Residual Risk = Inherent Risk × (1 - Control Effectiveness) (after controls)
  
  Likelihood Scale (1-5):
    1 — Rare (< 1% annual probability, less than once in 100 years)
    2 — Unlikely (1-5% annual probability, about once every 20-100 years)
    3 — Possible (5-20% annual probability, about once every 5-20 years)
    4 — Likely (20-50% annual probability, about once every 2-5 years)
    5 — Almost Certain (> 50% annual probability, about once every 2 years or more often)
  
  Impact Scale (1-5):
    1 — Negligible (< $10K financial, < 1 hour downtime, no customer impact)
    2 — Minor ($10K-$50K, 1-4 hours downtime, limited customer impact)
    3 — Moderate ($50K-$250K, 4-12 hours downtime, significant customer impact)
    4 — Major ($250K-$1M, 12-48 hours downtime, widespread customer impact)
    5 — Catastrophic (> $1M, > 48 hours downtime, regulatory/legal consequences)
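The two scales above can be applied mechanically once an annual probability and a financial loss estimate exist. The following is a minimal sketch, assuming that a value sitting exactly on a shared boundary (for example 5% or $50K) goes to the higher level, which the scales leave open. It covers only the probability and dollar criteria; the downtime and customer-impact criteria would need a separate judgment. Function and constant names are illustrative.

```python
from bisect import bisect_right

# Upper bounds of likelihood levels 1-4 as annual probabilities; level 5 is above 0.50.
LIKELIHOOD_BOUNDS = [0.01, 0.05, 0.20, 0.50]
# Upper bounds of impact levels 1-4 in USD; level 5 is above $1M.
IMPACT_BOUNDS_USD = [10_000, 50_000, 250_000, 1_000_000]


def _to_level(value, bounds):
    """Map a value onto levels 1-5 using four ascending upper bounds."""
    if value < 0:
        raise ValueError("value must be non-negative")
    if value > bounds[-1]:
        return 5
    # Shared boundaries go to the higher level (assumption); the top bound stays at 4.
    return min(bisect_right(bounds, value) + 1, 4)


def likelihood_level(annual_probability):
    """Likelihood score (1-5) for an annual probability between 0 and 1."""
    if annual_probability > 1:
        raise ValueError("annual_probability must be between 0 and 1")
    return _to_level(annual_probability, LIKELIHOOD_BOUNDS)


def impact_level(financial_loss_usd):
    """Impact score (1-5) for an estimated financial loss in USD."""
    return _to_level(financial_loss_usd, IMPACT_BOUNDS_USD)


if __name__ == "__main__":
    print(likelihood_level(0.10), impact_level(425_000))
```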
  
  Risk Matrix:
                      IMPACT
                      1     2     3     4     5
              5  |    L     M     H     CH    CH   |
              4  |    L     M     M     H     CH   |
  LIKELIHOOD  3  |    L     L     M     H     H    |
              2  |    L     L     L     M     H    |
              1  |    L     L     L     L     M    |
  
      L = LOW | M = MEDIUM | H = HIGH | CH = CRITICAL
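The scoring formulas and the matrix above can be sketched as a small lookup. The matrix is transcribed cell by cell rather than derived from score thresholds, because it is not a pure function of the product: likelihood 4 × impact 3 and likelihood 3 × impact 4 both score 12 but rate MEDIUM and HIGH respectively. The function names are illustrative assumptions.

```python
# Rows are likelihood 1-5, columns are impact 1-5, transcribed from the matrix above.
RISK_MATRIX = {
    5: ["L", "M", "H", "CH", "CH"],
    4: ["L", "M", "M", "H", "CH"],
    3: ["L", "L", "M", "H", "H"],
    2: ["L", "L", "L", "M", "H"],
    1: ["L", "L", "L", "L", "M"],
}
LABELS = {"L": "LOW", "M": "MEDIUM", "H": "HIGH", "CH": "CRITICAL"}


def _check_scale(value, name):
    if value not in (1, 2, 3, 4, 5):
        raise ValueError(f"{name} must be an integer from 1 to 5")


def inherent_score(likelihood, impact):
    """Inherent Risk = Likelihood x Impact (before controls)."""
    _check_scale(likelihood, "likelihood")
    _check_scale(impact, "impact")
    return likelihood * impact


def residual_score(inherent, control_effectiveness):
    """Residual Risk = Inherent Risk x (1 - Control Effectiveness)."""
    if not 0 <= control_effectiveness <= 1:
        raise ValueError("control_effectiveness must be between 0 and 1")
    return inherent * (1 - control_effectiveness)


def rating(likelihood, impact):
    """Look up the matrix cell for a likelihood/impact pair."""
    _check_scale(likelihood, "likelihood")
    _check_scale(impact, "impact")
    return LABELS[RISK_MATRIX[likelihood][impact - 1]]


if __name__ == "__main__":
    score = inherent_score(2, 5)
    print(score, rating(2, 5), residual_score(score, 0.5))
```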

RISK REGISTER SUMMARY:
  By Severity (Residual Risk):
    Critical: 4 risks (2.7%) — require immediate treatment, executive attention
    High: 18 risks (12.2%) — treatment within 30 days
    Medium: 56 risks (37.8%) — treatment within 90 days
    Low: 70 risks (47.3%) — treatment within 180 days or accept
  
  By Treatment Status:
    Open (no treatment started): 22 risks (14.9%) — new or inherited
    In Progress (treatment underway): 48 risks (32.4%) — active remediation
    Monitoring (controls in place, monitoring effectiveness): 58 risks (39.2%)
    Accepted (risk accepted by management): 14 risks (9.5%) — within risk appetite
    Closed (treatment completed, verified): 6 risks (4.1%) — resolved this quarter
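The severity bands and treatment statuses above map naturally onto a register record. The sketch below is one possible shape, not the register's actual schema: the RiskEntry fields, the risk ID format, and the choice to treat "immediate" as a 0-day window for Critical risks are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum


class Severity(Enum):
    CRITICAL = "Critical"
    HIGH = "High"
    MEDIUM = "Medium"
    LOW = "Low"


# Treatment windows in days, from the severity summary above.
# CRITICAL is "immediate" in the text; 0 days is an assumption.
TREATMENT_WINDOW_DAYS = {
    Severity.CRITICAL: 0,
    Severity.HIGH: 30,
    Severity.MEDIUM: 90,
    Severity.LOW: 180,
}

# Statuses that still require treatment work; Monitoring, Accepted, and Closed do not.
ACTIVE_TREATMENT = {"Open", "In Progress"}


@dataclass
class RiskEntry:
    risk_id: str          # hypothetical identifier format
    title: str
    severity: Severity
    status: str           # "Open", "In Progress", "Monitoring", "Accepted", "Closed"
    identified_on: date

    def treatment_due(self) -> date:
        """Date by which treatment should be complete for this severity."""
        return self.identified_on + timedelta(days=TREATMENT_WINDOW_DAYS[self.severity])

    def is_overdue(self, today: date) -> bool:
        """True when treatment is still active and the window has passed."""
        return self.status in ACTIVE_TREATMENT and today > self.treatment_due()


if __name__ == "__main__":
    entry = RiskEntry("R-001", "Example risk", Severity.HIGH, "Open", date(2024, 10, 1))
    print(entry.treatment_due(), entry.is_overdue(date(2024, 11, 15)))
```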

RISK APPETITE STATEMENT:
  The organization's risk appetite for IT risks:
    Cybersecurity: LOW appetite (zero tolerance for data breaches, unauthorized access)
    Compliance: LOW appetite (zero tolerance for regulatory violations)
    Availability: MEDIUM appetite (accept brief, infrequent outages if they enable innovation)
    Innovation: HIGH appetite (willing to accept moderate risks for competitive advantage)
    Financial: LOW appetite (strict budget controls, no unexpected costs > $50K)
    Reputation: LOW appetite (protect brand, minimize customer-facing incidents)

Critical Risk Details

CRITICAL RISKS — DETAILED ANALYSIS
====================================

RISK #1: Ransomware Attack on Production Environment
  ┌────────────────────────────────────────────────────────────────────────────┐
  │ INHERENT RISK: CRITICAL (Likelihood: 3 × Impact: 5 = 15)                  │
  │ RESIDUAL RISK: HIGH (Inherent 15 × (1 - Control Effectiveness 60%) = 6)   │
  │ ANNUALIZED LOSS EXPOSURE (ALE): $420,000                                  │
  │ RISK OWNER: CISO                                                          │
  │ REVIEW DATE: Monthly                                                      │
  └────────────────────────────────────────────────────────────────────────────┘
  
  Threat Description:
    Sophisticated ransomware attack targeting production infrastructure, encrypting
    databases, file servers, and backup systems. Attacker demands $2M+ ransom within 72 hours.
  
  Attack Vectors:
    1. Phishing email with malicious attachment (most common — 45% of breaches)
    2. Exploited vulnerability in internet-facing application (30%)
    3. Compromised vendor credentials (15%)
    4. Insider threat (malicious or negligent) (5%)
    5. Supply chain compromise (5%)
  
  Impact Analysis:
    Financial: ALE of $420,000; a single event is estimated at $425K in downtime ($85K/day × 5 days), plus a $2M ransom if paid
    Operational: Complete production outage (all services down, estimated 5-10 days to recover from backup)
    Customer: 100% of customers affected (no access to services, potential data loss)
    Regulatory: GDPR breach notification required within 72 hours (if personal data is affected)
    Reputation: Severe brand damage (media coverage, customer trust loss, stock impact)
  
  Existing Controls:
    1. Endpoint Detection & Response (EDR): CrowdStrike Falcon — 95% effective
    2. Network segmentation: Production isolated from user network — 80% effective
    3. Backup strategy: 3-2-1 rule (3 copies, 2 media types, 1 offsite) — 70% effective
       (offsite backup tested quarterly, last test: September 2024, successful)
    4. Email filtering: Proofpoint — 90% effective (blocks phishing before endpoint)
    5. Patch management: Critical patches within 7 days — 85% effective
    6. User awareness training: Quarterly phishing simulations — 75% effective
  
  Treatment Plan:
    1. IMMEDIATE (0-30 days):
       [✓] Deploy EDR to all endpoints (completed — 100% coverage)
       [✓] Enable immutable backups (WORM storage for critical data)
       [ ] Test ransomware recovery drill (scheduled January 2025)
       [ ] Enable network micro-segmentation (production database tier)
    
    2. SHORT-TERM (30-90 days):
       [ ] Enforce SPF, DKIM, and DMARC on email domains (reduce spoofed-sender phishing)
       [ ] Implement privileged access management (PAM) — just-in-time access
       [ ] Enable backup encryption (protect backup contents if copies are stolen)
       [ ] Conduct red team exercise (simulated ransomware attack)
    
    3. LONG-TERM (90-180 days):
       [ ] Deploy zero-trust network architecture (eliminate lateral movement)
       [ ] Implement data loss prevention (DLP) — prevent data exfiltration
       [ ] Establish cyber insurance coverage ($10M policy)
       [ ] Deploy AI-based threat hunting (proactive detection)
    
    Estimated treatment cost: $280,000 (reduces ALE from $420K to $120K — 71% reduction)

RISK #2: Cloud Provider Regional Outage
  ┌────────────────────────────────────────────────────────────────────────────┐
  │ INHERENT RISK: HIGH (Likelihood: 2 × Impact: 5 = 10)                      │
  │ RESIDUAL RISK: MEDIUM (Inherent 10 × (1 - Control Effectiveness 50%) = 5) │
  │ ANNUALIZED LOSS EXPOSURE (ALE): $150,000                                  │
  │ RISK OWNER: VP of Infrastructure                                            │
  │ REVIEW DATE: Quarterly                                                      │
  └────────────────────────────────────────────────────────────────────────────┘
  
  Threat Description:
    AWS us-east-1 regional outage (complete or partial) affecting all services
    deployed in single region. Duration: 2-24 hours (based on AWS historical outages).
  
  Impact Analysis:
    Financial: $150,000 estimated loss ($85K/day × partial day + SLA credits)
    Operational: 100% service unavailability (single region, no failover)
    Customer: All customers affected during outage window
    SLA: Breach of 99.9% availability commitment (if outage > 43 minutes/month)
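The 43-minute figure in the SLA line follows from the availability target: a 30-day month has 43,200 minutes, and 0.1% of that is 43.2 minutes. A short sketch of the arithmetic, assuming a 30-day measurement window (the function names are illustrative):

```python
def downtime_budget_minutes(availability_pct, days=30):
    """Minutes of downtime allowed in the window at the given availability target."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - availability_pct / 100)


def breaches_sla(outage_minutes, availability_pct, days=30):
    """True when cumulative outage time in the window exceeds the downtime budget."""
    return outage_minutes > downtime_budget_minutes(availability_pct, days)


if __name__ == "__main__":
    print(round(downtime_budget_minutes(99.9), 1))  # budget for a 99.9% target
    print(breaches_sla(120, 99.9))                  # a 2-hour regional outage
```

Even the shortest outage in the 2-24 hour range described above exceeds this budget.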
  
  Existing Controls:
    1. Multi-AZ deployment: Services deployed across 3 availability zones — 80% effective
       (protects against AZ failure, not regional failure)
    2. CDN caching: CloudFront caches static content at edge — 60% effective
       (keeps read-only content available during origin outage)
    3. Auto-scaling: Handle partial capacity loss within region — 70% effective
    4. AWS Service Health Dashboard monitoring: 50% effective (awareness only, no mitigation)
  
  Treatment Plan:
    1. IMMEDIATE (0-30 days):
       [ ] Document multi-region failover runbook (step-by-step, tested)
       [ ] Configure Route 53 failover routing (DNS-based failover)
    
    2. SHORT-TERM (30-90 days):
       [ ] Deploy active-passive in secondary region (us-west-2)
       [ ] Test cross-region database replication (Aurora Global Database or RDS cross-Region read replicas)
       [ ] Conduct multi-region failover drill (simulate regional outage)
    
    3. LONG-TERM (90-180 days):
       [ ] Deploy active-active multi-region architecture
       [ ] Implement global load balancing (latency-based routing)
       [ ] Cross-region data synchronization (real-time, not async)
    
    Estimated treatment cost: $450,000 (active-active architecture)
    Expected benefit: mitigates most regional outage exposure, reducing ALE from $150K to $15K (90% reduction)
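The Route 53 failover routing step in the plan above amounts to a pair of record sets, one PRIMARY and one SECONDARY, with a health check on the primary. The sketch below only builds the change batch as a plain dictionary in the shape the Route 53 ChangeResourceRecordSets API accepts. The hostname, IP addresses (documentation ranges), and health check ID are hypothetical placeholders, and a real deployment would more likely use alias records pointing at load balancers.

```python
def failover_record(name, target_ip, role, health_check_id=None, ttl=60):
    """Build one failover record set; role is "PRIMARY" or "SECONDARY"."""
    if role not in ("PRIMARY", "SECONDARY"):
        raise ValueError("role must be PRIMARY or SECONDARY")
    record = {
        "Name": name,
        "Type": "A",
        "SetIdentifier": f"{name}-{role.lower()}",
        "Failover": role,
        "TTL": ttl,  # a short TTL keeps failover time low
        "ResourceRecords": [{"Value": target_ip}],
    }
    if health_check_id:
        record["HealthCheckId"] = health_check_id
    return record


def failover_change_batch(name, primary_ip, secondary_ip, primary_health_check_id):
    """Build the change batch holding the primary and secondary records."""
    return {
        "Comment": "Active-passive DNS failover (illustrative)",
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": failover_record(
                    name, primary_ip, "PRIMARY", primary_health_check_id
                ),
            },
            {
                "Action": "UPSERT",
                "ResourceRecordSet": failover_record(name, secondary_ip, "SECONDARY"),
            },
        ],
    }


if __name__ == "__main__":
    batch = failover_change_batch(
        "app.example.com", "192.0.2.10", "198.51.100.10", "hc-primary-example"
    )
    print(batch["Changes"][0]["ResourceRecordSet"]["Failover"])
```

With boto3, a batch like this would be passed as the ChangeBatch argument of change_resource_record_sets on a Route 53 client, together with the hosted zone ID.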

RISK #3: Critical Data Breach (PII Exposure)
  ┌────────────────────────────────────────────────────────────────────────────┐
  │ INHERENT RISK: CRITICAL (Likelihood: 3 × Impact: 5 = 15)                  │
  │ RESIDUAL RISK: HIGH (Inherent 15 × (1 - Control Effectiveness 55%) = 6.75)│
  │ ANNUALIZED LOSS EXPOSURE (ALE): $580,000                                  │
  │ RISK OWNER: CISO + Data Protection Officer                                 │
  │ REVIEW DATE: Monthly                                                      │
  └────────────────────────────────────────────────────────────────────────────┘
  
  Threat Description:
    Unauthorized access to customer database containing PII (names, emails, addresses,
    payment tokens). Data exfiltrated and published on dark web. 500,000+ customer records exposed.
  
  Impact Analysis:
    Financial: $580,000 estimated loss
      - Regulatory fines: $200,000 (GDPR maximum: €20M or 4% of global annual turnover, whichever is higher)
      - Customer notification: $150,000 (credit monitoring, legal, PR, call center)
      - Business loss: $180,000 (churn, reputational damage, litigation)
      - Forensic investigation: $50,000 (third-party incident response)
    Regulatory: GDPR notification within 72 hours, potential ICO investigation
    Customer: 500,000+ customers notified, mandatory credit monitoring for 2 years
    Reputation: Severe — media coverage, customer trust erosion, potential class-action lawsuit
  
  Existing Controls:
    1. Database encryption: AES-256 at rest — 85% effective
    2. Network segmentation: Database tier isolated — 80% effective
    3. Access controls: Role-based access, MFA required — 75% effective
    4. DLP: Data Loss Prevention monitoring — 60% effective
    5. Audit logging: All database access logged — 70% effective (detection, not prevention)
    6. WAF: Web Application Firewall — 65% effective (blocks common injection attacks)
  
  Treatment Plan:
    1. IMMEDIATE (0-30 days):
       [ ] Enable database activity monitoring (real-time alerting on unusual queries)
       [ ] Implement tokenization for sensitive fields (payment data, SSN)
    
    2. SHORT-TERM (30-90 days):
       [ ] Deploy data classification and labeling (identify all PII locations)
       [ ] Implement database firewall (block unauthorized query patterns)
       [ ] Enable query result size limits (prevent bulk data exfiltration)
    
    3. LONG-TERM (90-180 days):
       [ ] Deploy data loss prevention (DLP) with machine learning
       [ ] Implement zero-trust database access (just-in-time, just-enough privileges)
       [ ] Conduct data breach simulation (tabletop exercise with leadership)
       [ ] Establish data breach response team (pre-designated roles, contact list)
    
    Estimated treatment cost: $320,000 (reduces ALE from $580K to $180K — 69% reduction)
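The three treatment plans above each state a cost and a before/after ALE, which is enough to compare them on ALE reduction and simple payback. The sketch below assumes the treatment cost is a one-time spend and the ALE reduction recurs every year; neither is stated in the plans, and ongoing operating costs are ignored.

```python
# (risk, treatment cost, ALE before, ALE after) in USD, from the three plans above.
PLANS = [
    ("Ransomware attack", 280_000, 420_000, 120_000),
    ("Cloud regional outage", 450_000, 150_000, 15_000),
    ("PII data breach", 320_000, 580_000, 180_000),
]


def ale_reduction_pct(ale_before, ale_after):
    """Percentage reduction in annualized loss exposure."""
    return 100 * (ale_before - ale_after) / ale_before


def payback_years(cost, ale_before, ale_after):
    """Years for the annual ALE reduction to cover a one-time treatment cost."""
    annual_saving = ale_before - ale_after
    if annual_saving <= 0:
        raise ValueError("treatment does not reduce ALE")
    return cost / annual_saving


if __name__ == "__main__":
    for name, cost, before, after in PLANS:
        print(
            f"{name:<24}"
            f"reduction {ale_reduction_pct(before, after):5.1f}%  "
            f"payback {payback_years(cost, before, after):4.2f} years"
        )
```

On these figures the ransomware and data breach plans pay back in under a year, while the multi-region plan takes a little over three years, which is worth weighing against the availability risk appetite.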

Business Impact Analysis

BUSINESS IMPACT ANALYSIS (BIA) — CRITICAL SYSTEMS
===================================================

BIA Methodology: Qualitative + Quantitative analysis, stakeholder interviews, historical data review
Last BIA Completed: October 2024 | Next BIA: April 2025 | BIA Owner: IT Risk Manager

CRITICAL BUSINESS FUNCTIONS:
  ┌────────────────────────────────────────┬───────────┬───────────┬────────────────┬──────────────────┐
  │ Business Function                      │ RTO (hrs) │ RPO (hrs) │ Max Tolerable  │ Financial Impact │
  │                                        │           │           │ Downtime (MTD) │ Per Hour         │
  ├────────────────────────────────────────┼───────────┼───────────┼────────────────┼──────────────────┤
  │ E-commerce platform (order processing) │ 1         │ 0.5       │ 4 hours        │ $25,000          │
  │ Payment processing                     │ 0.5       │ 0         │ 2 hours        │ $40,000          │
  │ Customer authentication                │ 1         │ 0.5       │ 2 hours        │ $15,000          │
  │ Inventory management                   │ 4         │ 1         │ 8 hours        │ $8,000           │
  │ Customer data (CRM)                    │ 4         │ 1         │ 8 hours        │ $5,000           │
  │ Email/notification service             │ 8         │ 4         │ 24 hours       │ $3,000           │
  │ Analytics/reporting                    │ 24        │ 12        │ 48 hours       │ $2,000           │
  │ Internal tools (HR, finance)           │ 8         │ 4         │ 24 hours       │ $4,000           │
  │ Developer environment                  │ 24        │ 24        │ 72 hours       │ $1,000           │
  │ Documentation/wiki                     │ 48        │ 24        │ 72 hours       │ $500             │
  └────────────────────────────────────────┴───────────┴───────────┴────────────────┴──────────────────┘

  RTO = Recovery Time Objective (how quickly the system must be restored)
  RPO = Recovery Point Objective (how much data loss is acceptable)
  MTD = Maximum Tolerable Downtime (absolute maximum before irreversible damage)

DEPENDENCY MAPPING (Critical Systems):
  E-commerce Platform depends on:
    → Payment Processing (hard dependency — cannot process orders without payment)
    → Customer Authentication (hard dependency — cannot identify customer)
    → Inventory Management (soft dependency — can process with cached inventory, refresh later)
    → Email Service (soft dependency — can queue emails, send when available)
  
  Payment Processing depends on:
    → Customer Authentication (hard dependency — must verify user)
    → Database (hard dependency — transaction records)
    → External Payment Gateway (hard dependency — Stripe/Adyen, out of our control)
  
  Customer Authentication depends on:
    → Database (hard dependency — user credentials)
    → Redis Cache (soft dependency — session store, can fall back to DB)
    → MFA Provider (soft dependency — can allow password-only if MFA down, with risk)
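A dependency map like the one above can be checked against the RTO targets: a service cannot be restored faster than the hard dependencies it needs. The sketch below flags any hard dependency whose RTO is longer than that of the service relying on it. Dependencies without an RTO in the BIA table (Database, External Payment Gateway) are skipped, which is an assumption of this sketch. On the table's figures it reports Payment Processing (0.5 h) depending on Customer Authentication (1 h).

```python
# RTO targets in hours, from the BIA table.
RTO_HOURS = {
    "E-commerce Platform": 1,
    "Payment Processing": 0.5,
    "Customer Authentication": 1,
    "Inventory Management": 4,
    "Email Service": 8,
}

# Hard dependencies only, from the dependency mapping above.
HARD_DEPENDENCIES = {
    "E-commerce Platform": ["Payment Processing", "Customer Authentication"],
    "Payment Processing": [
        "Customer Authentication",
        "Database",
        "External Payment Gateway",
    ],
    "Customer Authentication": ["Database"],
}


def rto_conflicts(rto_hours, hard_dependencies):
    """Return (service, dependency) pairs where the dependency's RTO exceeds the service's."""
    conflicts = []
    for service, dependencies in hard_dependencies.items():
        for dependency in dependencies:
            if service not in rto_hours or dependency not in rto_hours:
                continue  # no RTO on record for one side; cannot compare
            if rto_hours[dependency] > rto_hours[service]:
                conflicts.append((service, dependency))
    return conflicts


if __name__ == "__main__":
    for service, dependency in rto_conflicts(RTO_HOURS, HARD_DEPENDENCIES):
        print(f"{service} needs {dependency}, which has a longer RTO")
```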

BUSINESS CONTINUITY PRIORITY:
  Tier 1 (Restore within 1 hour): Payment Processing, Customer Authentication, E-commerce Platform
  Tier 2 (Restore within 4 hours): Inventory Management, CRM
  Tier 3 (Restore within 8 hours): Email/Notification, Internal Tools
  Tier 4 (Restore within 24 hours): Analytics, Developer Environment
  Tier 5 (Restore within 48 hours): Documentation, Non-critical systems

DISASTER RECOVERY ALIGNMENT:
  Systems meeting RTO/RPO targets: 7/10 (70%)
  Systems NOT meeting targets:
    1. Payment Processing: Current RTO = 2 hours (target: 0.5 hours)
       Gap: 1.5 hours | Action: Implement active-active failover (ETA: 3 months)
    
    2. Customer Authentication: Current RPO = 1 hour (target: 0.5 hours)
       Gap: 0.5 hours | Action: Implement real-time database replication (ETA: 2 months)
    
    3. Email Service: Current RTO = 24 hours (target: 8 hours)
       Gap: 16 hours | Action: Deploy redundant email service in second region (ETA: 1 month)
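The gaps listed above are the measured recovery value minus the target, reported only where the measurement exceeds the target. A minimal sketch of that comparison follows, using just the three measured values the section gives; the data layout and function names are assumptions.

```python
# Targets in hours, from the BIA table.
TARGETS = {
    "Payment Processing": {"rto": 0.5, "rpo": 0},
    "Customer Authentication": {"rto": 1, "rpo": 0.5},
    "Email Service": {"rto": 8, "rpo": 4},
}

# Measured values in hours; only the metrics reported above are included.
MEASURED = {
    "Payment Processing": {"rto": 2},
    "Customer Authentication": {"rpo": 1},
    "Email Service": {"rto": 24},
}


def recovery_gaps(targets, measured):
    """Return (system, metric, gap_hours) for each measured value above its target."""
    gaps = []
    for system, metrics in measured.items():
        for metric, current in metrics.items():
            target = targets[system][metric]
            if current > target:
                gaps.append((system, metric, current - target))
    return gaps


def compliance_pct(total_systems, systems_with_gaps):
    """Share of systems meeting both targets, in percent."""
    return 100 * (total_systems - systems_with_gaps) / total_systems


if __name__ == "__main__":
    gaps = recovery_gaps(TARGETS, MEASURED)
    for system, metric, gap in gaps:
        print(f"{system}: {metric.upper()} gap {gap} hours")
    failing = len({system for system, _, _ in gaps})
    print(f"{compliance_pct(10, failing):.0f}% of systems meet targets")
```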

Integration Points

Edge Cases