---
name: threat-detection-response
description: Manage real-time threat detection and incident response including SIEM operations, threat intelligence integration, AI-powered anomaly detection, automated incident response playbooks, forensic analysis, and post-incident reporting. Use when detecting security threats, managing security incidents, analyzing attack patterns, executing response playbooks, or conducting post-incident analysis. Triggers on phrases like "threat detection", "incident response", "SIEM", "SOC", "threat intelligence", "anomaly detection", "incident playbook", "forensic analysis", "IOCs", "attack detection", "security alert", "breach response", "ransomware detection".
---

# Threat Detection & Incident Response

Detect and respond to security threats in real time using AI-powered analysis, automated playbooks, and coordinated incident response.

## SIEM Operations & Threat Detection

### Security Operations Center (SOC) Framework

```
SOC OPERATIONS FRAMEWORK:
══════════════════════════

SIEM PLATFORM: Microsoft Sentinel (cloud-native)
  Data connectors: 35 active
  Log volume: 2.5 TB/day (avg.)
  Alert volume: 450-600/day (raw) → 15-25/day (correlated incidents)
  Retention: 90 days (hot) + 1 year (cold archive)
  Analytics rules: 128 active
  Workbooks: 15 custom dashboards
  Playbooks (Logic Apps): 22 automated responses

DATA SOURCES:
  ┌──────────────────────────┬───────────────────────────────────┬──────────┐
  │ Source                   │ Data Collected                    │ Volume   │
  ├──────────────────────────┼───────────────────────────────────┼──────────┤
  │ Azure AD / Entra ID      │ Sign-in logs, audit logs, risk    │ 850K/day │
  │ AWS CloudTrail           │ API calls, config changes, auth   │ 420K/day │
  │ AWS VPC Flow Logs        │ Network traffic metadata          │ 1.2M/day │
  │ CrowdStrike (EDR)        │ Endpoint events, malware, process │ 650K/day │
  │ Palo Alto (firewall)     │ Network traffic, threat logs      │ 980K/day │
  │ Okta (IAM)               │ Authentication, session, MFA      │ 280K/day │
  │ Kubernetes audit         │ Pod events, RBAC, network policy  │ 180K/day │
  │ GitHub/GitLab            │ Repository access, commits, PR    │ 45K/day  │
  │ SaaS apps (Salesforce,   │ User activity, API access, export │ 120K/day │
  │  Slack, Google Workspace)│                                   │          │
  │ Email (Microsoft 365)    │ Phishing, malware, DLP            │ 95K/day  │
  │ WAF (CloudFront/WAF)     │ Web attack attempts, DDoS         │ 75K/day  │
  │ DNS logs                 │ DNS queries, suspicious domains   │ 500K/day │
  ├──────────────────────────┼───────────────────────────────────┼──────────┤
  │ TOTAL                    │ Multiple sources                  │ ~5.4M/day│
  └──────────────────────────┴───────────────────────────────────┴──────────┘

THREAT DETECTION RULES:
  ┌──────────────────────────────────┬──────────┬──────────┬────────────┐
  │ Rule Category                    │ Count    │ Triggers │ FPR        │
  ├──────────────────────────────────┼──────────┼──────────┼────────────┤
  │ Intrusion detection              │ 18       │ 45/day   │ 12%        │
  │ Malware detection                │ 12       │ 28/day   │ 5%         │
  │ Phishing/social engineering      │ 8        │ 120/day  │ 35%        │
  │ Data exfiltration                │ 6        │ 15/day   │ 8%         │
  │ Privilege escalation             │ 10       │ 35/day   │ 15%        │
  │ Brute force / credential stuffing│ 8        │ 250/day  │ 45%        │
  │ Lateral movement                 │ 7        │ 22/day   │ 10%        │
  │ Command & control                │ 5        │ 8/day    │ 3%         │
  │ Insider threat                   │ 6        │ 18/day   │ 20%        │
  │ DDoS / availability              │ 4        │ 5/day    │ 2%         │
  │ Cloud misconfiguration           │ 12       │ 40/day   │ 18%        │
  │ Container security               │ 8        │ 12/day   │ 7%         │
  │ API abuse                        │ 6        │ 25/day   │ 22%        │
  │ Configuration drift              │ 15       │ 38/day   │ 25%        │
  │ Compliance violation             │ 8        │ 20/day   │ 5%         │
  ├──────────────────────────────────┼──────────┼──────────┼────────────┤
  │ TOTAL                            │ 128      │ ~641/day │ Avg: 15%   │
  └──────────────────────────────────┴──────────┴──────────┴────────────┘

  Notes:
    - FPR = False Positive Rate (lower is better)
    - Rules with >30% FPR: Tuning in progress
    - AI/ML rules: 22 rules (adaptive thresholding)

THREAT INTEL INTEGRATION:
  Feed sources:
    - Microsoft Threat Intelligence (native)
    - MITRE ATT&CK framework mapping
    - AlienVault OTX (open source)
    - Recorded Future (commercial)
    - Industry ISAC (financial sector)
    - Internal IOCs (custom, from incidents)
  
  IOCs tracked:
    - Malicious IPs: 125K+ (updated hourly)
    - Malicious domains: 85K+ (updated hourly)
    - Malicious hashes (file): 45K+ (updated daily)
    - Suspicious URLs: 32K+ (updated hourly)
    - Email indicators: 8K+ (updated daily)

DETECTION RESULTS (January 2025):
  Raw alerts: 580/day (avg.)
  Correlated incidents: 18/day (avg.)
  False positives (estimated): 3/day (17%)
  True security incidents: 15/day (avg.)
  
  Incident severity breakdown:
    Critical: 0/day (avg.) ✓
    High: 0.5/day (avg.)
    Medium: 2.5/day (avg.)
    Low: 12/day (avg.)
  
  Mean Time to Detect (MTTD): 4.2 minutes (target: <5 min) ✓
  Mean Time to Triage (MTTT): 8.5 minutes (target: <10 min) ✓
```
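
The tuning note in the rules table (categories above 30% FPR) can be checked mechanically. The following Python sketch is illustrative only: it copies a subset of the per-category trigger counts and false positive rates from the table above, and the names `RuleCategory`, `TUNING_THRESHOLD`, and `tuning_candidates` are hypothetical, not part of Microsoft Sentinel or any SIEM API.

```python
from dataclasses import dataclass

# Threshold taken from the note "Rules with >30% FPR: Tuning in progress".
TUNING_THRESHOLD = 0.30


@dataclass(frozen=True)
class RuleCategory:
    name: str
    triggers_per_day: int
    fpr: float  # false positive rate as a fraction, e.g. 0.12 for 12%

    @property
    def expected_true_positives(self) -> float:
        """Triggers per day expected to be real, given the FPR."""
        return self.triggers_per_day * (1.0 - self.fpr)


# A subset of the categories from the table above (values copied as-is).
CATEGORIES = [
    RuleCategory("Intrusion detection", 45, 0.12),
    RuleCategory("Phishing/social engineering", 120, 0.35),
    RuleCategory("Brute force / credential stuffing", 250, 0.45),
    RuleCategory("Command & control", 8, 0.03),
    RuleCategory("Configuration drift", 38, 0.25),
]


def tuning_candidates(categories, threshold=TUNING_THRESHOLD):
    """Return categories whose FPR is strictly above the threshold, worst first."""
    flagged = [c for c in categories if c.fpr > threshold]
    return sorted(flagged, key=lambda c: c.fpr, reverse=True)


if __name__ == "__main__":
    for cat in tuning_candidates(CATEGORIES):
        print(f"{cat.name}: FPR {cat.fpr:.0%}, "
              f"~{cat.expected_true_positives:.0f} true positives/day")
```

Sorting worst-first is a design choice for this sketch: it puts the noisiest category at the top of the tuning queue. A real tuning process would likely also weigh trigger volume and severity.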

## Automated Incident Response

### Response Playbooks

```
INCIDENT RESPONSE PLAYBOOKS:
═════════════════════════════

PLAYBOOK INVENTORY (22 Active):
  ┌──────────────────────────────────┬──────────┬────────────┬──────────────┐
  │ Playbook                         │ Severity │ Auto/Manual│ Avg Duration │
  ├──────────────────────────────────┼──────────┼────────────┼──────────────┤
  │ PB-01: Malware Detection         │ High     │ 80% auto   │ 15 min       │
  │ PB-02: Phishing Response         │ Medium   │ 60% auto   │ 25 min       │
  │ PB-03: Brute Force / Account     │ Medium   │ 70% auto   │ 10 min       │
  │   Compromise                     │          │            │              │
  │ PB-04: Data Exfiltration         │ Critical │ 50% auto   │ 45 min       │
  │ PB-05: Ransomware Detection      │ Critical │ 90% auto   │ 5 min        │
  │ PB-06: Privilege Escalation      │ High     │ 60% auto   │ 30 min       │
  │ PB-07: Insider Threat            │ High     │ 40% auto   │ 60 min       │
  │ PB-08: DDoS Mitigation           │ High     │ 95% auto   │ 3 min        │
  │ PB-09: Cloud Misconfig           │ Medium   │ 70% auto   │ 20 min       │
  │ PB-10: Container Compromise      │ High     │ 75% auto   │ 20 min       │
  │ PB-11: API Abuse                 │ Medium   │ 60% auto   │ 15 min       │
  │ PB-12: DNS Tunneling             │ High     │ 70% auto   │ 25 min       │
  │ PB-13: Credential Stuffing       │ Medium   │ 80% auto   │ 8 min        │
  │ PB-14: Crypto Mining             │ Medium   │ 75% auto   │ 10 min       │
  │ PB-15: Shadow IT Detection       │ Low      │ 50% auto   │ 30 min       │
  │ PB-16: Compliance Violation      │ Low      │ 60% auto   │ 20 min       │
  │ PB-17: Configuration Drift       │ Low      │ 70% auto   │ 15 min       │
  │ PB-18: Sensitive Data Exposure   │ High     │ 50% auto   │ 40 min       │
  │ PB-19: Supply Chain Attack       │ Critical │ 30% auto   │ 90 min       │
  │ PB-20: Zero-day Exploit          │ Critical │ 25% auto   │ 120 min      │
  │ PB-21: Business Email Compromise │ High     │ 40% auto   │ 45 min       │
  │ PB-22: System Compromise         │ Critical │ 35% auto   │ 180 min      │
  └──────────────────────────────────┴──────────┴────────────┴──────────────┘

DETAILED PLAYBOOK: PB-01 (Malware Detection):
  Trigger: EDR detects malware on endpoint
  
  Automated actions (first 5 minutes):
    1. Quarantine infected endpoint (network isolation)
    2. Collect forensic evidence (memory dump, disk snapshot)
    3. Extract malware sample (hash, IOCs)
    4. Identify lateral movement indicators (same user, network)
    5. Check for persistence mechanisms (scheduled tasks, services)
    6. Update threat intelligence (new IOCs)
    7. Block related IOCs (firewall, proxy, email)
    8. Create incident ticket (ITSM integration)
    9. Notify SOC analyst (Teams alert, severity: HIGH)
  
  Manual actions (analyst):
    10. Validate detection (true positive confirmation)
    11. Scope assessment (affected users, systems, data)
    12. Determine malware family and TTPs (MITRE mapping)
    13. Identify initial access vector
    14. Determine data exposure (if any)
    15. Execute cleanup (malware removal, system rebuild)
    16. Verify remediation (re-scan, monitoring)
    17. Update incident report
    18. Conduct lessons learned

INCIDENT SEVERITY CLASSIFICATION:
  Critical (P1):
    - Active data breach or exfiltration
    - Ransomware or destructive malware
    - Complete system compromise (C2 established)
    - Executive-level account compromise
    - SLA: Immediate response, 1-hour update, 4-hour resolution
  
  High (P2):
    - Malware detected and contained
    - Privilege escalation (contained)
    - Multiple account compromise indicators
    - SLA: 15-minute response, 2-hour update, 8-hour resolution
  
  Medium (P3):
    - Phishing attempt (not clicked)
    - Policy violation detected
    - Suspicious activity (investigation needed)
    - SLA: 1-hour response, 4-hour update, 24-hour resolution
  
  Low (P4):
    - Informational alert
    - Configuration issue (security)
    - Compliance advisory
    - SLA: 4-hour response, next business day resolution

INCIDENT STATISTICS (January 2025):
  Total incidents: 480 (15/day avg.)
  ┌─────────────────────────┬──────────┬──────────┐
  │ Severity                │ Count    │ %        │
  ├─────────────────────────┼──────────┼──────────┤
  │ Critical                │ 0        │ 0%       │
  │ High                    │ 15       │ 3.1%     │
  │ Medium                  │ 120      │ 25.0%    │
  │ Low                     │ 345      │ 71.9%    │
  ├─────────────────────────┼──────────┼──────────┤
  │ TOTAL                   │ 480      │ 100%     │
  └─────────────────────────┴──────────┴──────────┘

  Resolution times:
    Critical: N/A (0 incidents)
    High: Avg. 4.2 hours (target: <8 hours) ✓
    Medium: Avg. 6.5 hours (target: <24 hours) ✓
    Low: Avg. 3.1 hours (target: <24 hours) ✓
  
  MTTR (Mean Time to Resolve): 5.8 hours (all severities)
  Trend: Improving (March: 8.2h → Sep: 7.1h → Jan: 5.8h)

ESCALATION MATRIX:
  Level 1 (SOC Analyst):
    - Triage and initial investigation
    - Run standard playbooks (automated)
    - Resolve low/medium incidents
    - Escalate to Level 2 if needed
  
  Level 2 (SOC Senior Analyst):
    - Complex investigation (forensic)
    - Custom playbook execution
    - Escalate to Level 3 if critical
    - Coordinate with IT, HR, Legal as needed
  
  Level 3 (SOC Manager / Security Lead):
    - Critical incident management
    - Cross-team coordination
    - Executive communication
    - Regulatory notification (if breach)
    - External forensics engagement (if needed)
  
  Level 4 (CISO / Executive):
    - Data breach notification
    - Board/audit committee notification
    - Customer/partner notification
    - Regulatory reporting
    - Media/PR coordination
```
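
The SLA targets in the severity classification above can be turned into concrete deadlines per incident. The sketch below is a minimal Python illustration under stated assumptions: "immediate response" is modeled as a zero offset, the Low tier's "next business day" resolution is left as `None` because it depends on a business calendar the source does not define, and the names `Sla`, `SLA_BY_SEVERITY`, `sla_deadlines`, and `is_breached` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class Sla:
    response: timedelta
    update: Optional[timedelta]      # None where the source lists no update interval
    resolution: Optional[timedelta]  # None for "next business day" (calendar-dependent)


# Values transcribed from the severity classification above.
SLA_BY_SEVERITY = {
    "Critical": Sla(timedelta(0), timedelta(hours=1), timedelta(hours=4)),
    "High": Sla(timedelta(minutes=15), timedelta(hours=2), timedelta(hours=8)),
    "Medium": Sla(timedelta(hours=1), timedelta(hours=4), timedelta(hours=24)),
    "Low": Sla(timedelta(hours=4), None, None),
}


def sla_deadlines(severity: str, detected_at: datetime) -> dict:
    """Return absolute deadlines for an incident detected at `detected_at`."""
    sla = SLA_BY_SEVERITY[severity]  # raises KeyError on an unknown severity

    def offset(delta: Optional[timedelta]) -> Optional[datetime]:
        return None if delta is None else detected_at + delta

    return {
        "respond_by": offset(sla.response),
        "update_by": offset(sla.update),
        "resolve_by": offset(sla.resolution),
    }


def is_breached(severity: str, detected_at: datetime, resolved_at: datetime) -> bool:
    """True if resolution took longer than the SLA allows (fixed-duration SLAs only)."""
    resolve_by = sla_deadlines(severity, detected_at)["resolve_by"]
    return resolve_by is not None and resolved_at > resolve_by
```

For example, a High incident detected at 09:00 would get a respond-by time of 09:15 and a resolve-by time of 17:00 the same day. Keeping the SLA table as data rather than branching logic makes it easier to adjust targets without touching the deadline code.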

## Forensic Analysis & Investigation

### Deep Investigation Framework

```
FORENSIC INVESTIGATION FRAMEWORK:
══════════════════════════════════

INVESTIGATION METHODOLOGY (NIST SP 800-61):
  Phase 1: Preparation
    - Forensic tools ready (disk imaging, memory capture)
    - Chain of custody procedures
    - Legal hold framework
    - Evidence storage (encrypted, access-controlled)
    - Investigator training (certified)
  
  Phase 2: Detection & Analysis
    - Alert triage and validation
    - Initial scope assessment
    - Evidence preservation (legal hold)
    - Timeline creation (attack progression)
  
  Phase 3: Containment, Eradication, Recovery
    - Containment (prevent spread)
    - Eradication (remove threat)
    - Recovery (restore systems)
    - Validation (verify clean)
  
  Phase 4: Post-Incident Activity
    - Root cause analysis
    - Lessons learned
    - Process improvement
    - Evidence retention
    - Metrics update

EVIDENCE COLLECTION PROTOCOL:
  Digital evidence types:
    - Disk images (full or targeted)
    - Memory dumps (volatile data)
    - Network captures (PCAP)
    - Log files (SIEM, system, application)
    - Email archives
    - Cloud audit logs (e.g., AWS CloudTrail)
    - DNS query logs
    - Browser history
    - Registry hives (Windows)
    - File system metadata
  
  Chain of custody:
    1. Evidence identified and tagged
    2. Hash calculated (SHA-256)
    3. Evidence copied (write-blocked)
    4. Original preserved (untouched)
    5. Copy used for analysis
    6. Hash verified (integrity)
    7. Analysis documented (timeline)
    8. Evidence stored (encrypted, access-controlled)
    9. Retention per policy (7 years minimum)

FORENSIC TOOLS:
  Commercial:
    - Magnet AXIOM (digital forensics)
    - EnCase (disk analysis)
    - Splunk (log analysis)
  Open source:
    - Autopsy (disk analysis)
    - Wireshark (network analysis)
    - Volatility (memory forensics)
    - YARA (malware detection rules)
    - Velociraptor (endpoint forensics)
    - log2timeline/Plaso (timeline analysis)
  
  Cloud forensics:
    - AWS: CloudTrail, VPC Flow Logs, GuardDuty
    - Azure: Azure Monitor, Network Watcher, Defender
    - GCP: Cloud Audit Logs, VPC Flow Logs

ATTACK ANALYSIS (MITRE ATT&CK):
  Framework: Map incident TTPs to MITRE ATT&CK
  Purpose: Understand attack sophistication, identify gaps
  
  Common TTPs observed (January 2025):
    - T1566: Phishing (22 incidents)
    - T1190: Exploit Public-Facing Application (3 incidents)
    - T1078: Valid Accounts (8 incidents)
    - T1059: Command and Scripting Interpreter (5 incidents)
    - T1041: Exfiltration Over C2 Channel (0 incidents) ✓
    - T1486: Data Encrypted for Impact (0 incidents) ✓
    - T1053: Scheduled Task/Job (4 incidents)
    - T1021: Remote Services (2 incidents)
    - T1071: Application Layer Protocol (1 incident)
    - T1560: Archive Collected Data (0 incidents) ✓

  Coverage gap analysis:
    - Detection gaps identified: 3 (TTPs not covered)
    - New analytics rules created: 3
    - Testing: Purple team exercise (quarterly)
    - Status: All gaps addressed (January)

POST-INCIDENT REPORT TEMPLATE:
  1. Executive summary (1 page)
  2. Timeline of events (detailed)
  3. Root cause analysis (5-why method)
  4. Impact assessment (scope, severity)
  5. Containment actions (what was done)
  6. Remediation steps (fix applied)
  7. Lessons learned (process improvements)
  8. Recommendations (preventive measures)
  9. MITRE ATT&CK mapping (TTPs identified)
  10. Appendices (evidence, technical details)
  
  Distribution:
    Internal: Security team, IT, Legal, Executive
    External: Regulators (if required), customers (if breach)
    Retention: 7 years minimum (encrypted)
```
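
Steps 2 through 6 of the chain of custody (hash, copy, preserve the original, analyze the copy, verify the hash) can be sketched with the Python standard library. This is an illustration, not a forensic tool: a software copy does not replace a hardware write blocker, and the function names `sha256_of` and `make_working_copy` are hypothetical.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large disk images do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_working_copy(original: Path, workdir: Path) -> dict:
    """Hash the original, copy it, and confirm the copy's hash matches."""
    original_hash = sha256_of(original)
    copy_path = workdir / original.name
    shutil.copy2(original, copy_path)  # the original file is only read, never modified
    copy_hash = sha256_of(copy_path)
    if copy_hash != original_hash:
        raise ValueError(f"integrity check failed for {copy_path}")
    return {"original": str(original), "copy": str(copy_path), "sha256": original_hash}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src_dir = Path(tmp) / "evidence"
        work_dir = Path(tmp) / "analysis"
        src_dir.mkdir()
        work_dir.mkdir()
        sample = src_dir / "sample.bin"
        sample.write_bytes(b"example evidence bytes")
        print(make_working_copy(sample, work_dir))
```

The returned record (paths plus digest) is the kind of entry that could be appended to a custody log; how that log is stored and signed is outside the scope of this sketch.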

## Output

### SOC Operations Dashboard

```
SOC OPERATIONS DASHBOARD — Jan 2025
══════════════════════════════════

Detection:
  SIEM platform: Microsoft Sentinel
  Data sources: 35 connectors
  Log volume: 2.5 TB/day
  Raw alerts: 580/day → 18 incidents/day
  MTTD: 4.2 min (target: <5 min) ✓
  MTTT: 8.5 min (target: <10 min) ✓

Incidents (January):
  Total: 480 incidents
  Critical: 0 ✓
  High: 15 (3.1%)
  Medium: 120 (25.0%)
  Low: 345 (71.9%)
  MTTR: 5.8 hours (improving trend)
  Resolution rate: 100% (all closed)

Response:
  Playbooks active: 22
  Automation rate: 65% avg. (varies by playbook)
  Escalation rate: 8% (38 incidents to L2/L3)
  False positive rate: 17% (target: <20%) ✓

Threat Intel:
  IOCs tracked: 295K+ (updated hourly/daily)
  Feed sources: 6 (commercial + open source)
  Custom IOCs: 145 (from internal incidents)
  MITRE ATT&CK mapping: Complete

Forensics:
  Investigations (January): 8
  Chain of custody: 100% maintained
  Evidence retention: 7 years (encrypted)
  Tool coverage: Commercial + open source + cloud

Compliance:
  Breach notification: 0 required (no breach)
  Regulatory reporting: 0 (no incident)
  Audit evidence: Complete (SOC 2, ISO 27001)
  Data retention: 90 days hot + 1 year cold

Actions:
  1. Rule tuning (5 rules with >30% FPR)
  2. MITRE gap closure (3 new rules — testing)
  3. Purple team exercise (quarterly — Feb)
  4. Playbook update (incident feedback)
  5. SOC analyst training (annual — Q1)
```
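
The severity percentages and escalation rate on the dashboard follow directly from the incident counts. As a small worked example, the Python sketch below recomputes them from the January figures; the names `SEVERITY_COUNTS`, `share`, and `summarize` are hypothetical. Note that 38 of 480 incidents is 7.9%, which the dashboard reports rounded to 8%.

```python
# January 2025 counts copied from the dashboard above.
SEVERITY_COUNTS = {"Critical": 0, "High": 15, "Medium": 120, "Low": 345}
ESCALATED = 38  # incidents escalated to L2/L3


def share(count: int, total: int) -> float:
    """Percentage of `total`, rounded to one decimal place; 0.0 if total is 0."""
    return round(100 * count / total, 1) if total else 0.0


def summarize(counts: dict, escalated: int) -> dict:
    """Build the derived dashboard figures from raw incident counts."""
    total = sum(counts.values())
    return {
        "total": total,
        "severity_pct": {name: share(n, total) for name, n in counts.items()},
        "escalation_pct": share(escalated, total),
    }


if __name__ == "__main__":
    summary = summarize(SEVERITY_COUNTS, ESCALATED)
    print(summary["total"])
    for name, pct in summary["severity_pct"].items():
        print(f"{name}: {pct}%")
    print(f"Escalation rate: {summary['escalation_pct']}%")
```

Deriving the percentages from counts, rather than entering them by hand, keeps the dashboard rows consistent with each other when the underlying numbers change.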

## Integration Points

- SIEM platforms (Sentinel, Splunk, QRadar): Log aggregation, correlation, analytics
- EDR/XDR platforms (CrowdStrike, SentinelOne): Endpoint detection, response
- Firewall/NGFW (Palo Alto, Fortinet, Zscaler): Network security, threat blocking
- Threat intelligence (Recorded Future, MISP, OTX): IOCs, TTPs, threat feeds
- ITSM platforms (ServiceNow, Jira): Incident tracking, change management
- Identity platforms (Okta, Azure AD, CyberArk): Account disablement, MFA reset
- SOAR platforms (Cortex XSOAR, Splunk SOAR): Automated playbooks
- Email security (Proofpoint, M365 Defender): Phishing detection, takedown
- Cloud security (GuardDuty, Defender for Cloud): Cloud-specific detection
- Communication (Teams, Slack, PagerDuty): Alerting, escalation, coordination
- Forensic platforms (Magnet AXIOM, EnCase): Deep investigation, evidence
- CMDB: Asset context, ownership, criticality
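
As one illustration of how the threat intelligence integration might feed detection, the Python sketch below checks an event against sets of known-bad IPs, domains, and file hashes. It is an assumption-laden toy: the `IocStore` class, its methods, and the event field names (`src_ip`, `dst_ip`, `domain`, `sha256`) are hypothetical and do not correspond to any vendor's feed format or API. The sample values use reserved documentation addresses and domains.

```python
import ipaddress
from dataclasses import dataclass, field


def _normalize_ip(value):
    """Return the canonical string form of an IP, or None if it does not parse."""
    try:
        return str(ipaddress.ip_address(str(value).strip()))
    except ValueError:
        return None


@dataclass
class IocStore:
    ips: set = field(default_factory=set)
    domains: set = field(default_factory=set)
    hashes: set = field(default_factory=set)

    def add_ip(self, value: str) -> None:
        # Raises ValueError on a malformed indicator so bad feed data is noticed.
        self.ips.add(str(ipaddress.ip_address(value.strip())))

    def add_domain(self, value: str) -> None:
        self.domains.add(value.strip().rstrip(".").lower())

    def add_hash(self, value: str) -> None:
        self.hashes.add(value.strip().lower())

    def match(self, event: dict) -> list:
        """Return (field, value) pairs in `event` that hit a known indicator."""
        hits = []
        for key in ("src_ip", "dst_ip"):
            raw = event.get(key)
            norm = _normalize_ip(raw) if raw else None
            if norm and norm in self.ips:
                hits.append((key, raw))
        # Match the queried domain or any listed parent domain (never the bare TLD).
        domain = event.get("domain", "").rstrip(".").lower()
        labels = domain.split(".") if domain else []
        for i in range(len(labels) - 1):
            candidate = ".".join(labels[i:])
            if candidate in self.domains:
                hits.append(("domain", candidate))
                break
        file_hash = event.get("sha256", "").lower()
        if file_hash and file_hash in self.hashes:
            hits.append(("sha256", file_hash))
        return hits


if __name__ == "__main__":
    store = IocStore()
    store.add_ip("203.0.113.7")
    store.add_domain("bad.example")
    print(store.match({"dst_ip": "203.0.113.7", "domain": "cdn.bad.example"}))
```

In-memory sets keep lookups constant-time, which is adequate for an illustration; indicator volumes in the hundreds of thousands with hourly updates would more plausibly live in the SIEM's own watchlist or indicator tables.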

## Edge Cases

- **Zero-day exploit**: No existing detection rule; behavioral analysis; vendor coordination; emergency patch; compensating control
- **Advanced persistent threat (APT)**: Long-duration investigation; multi-vector; threat hunting; external forensics engagement
- **Ransomware**: Immediate isolation; backup verification; decryption assessment; business continuity activation
- **Insider threat**: Sensitive investigation; HR coordination; legal hold; evidence preservation; termination protocol
- **Data breach**: Regulatory notification (GDPR, CCPA); customer notification; PR coordination; legal engagement; forensics
- **Supply chain attack**: Vendor coordination; scope assessment; dependency review; emergency update
- **False positive storm**: Rule tuning; alert fatigue prevention; threshold adjustment; analyst support
- **SIEM outage**: Backup SIEM; log forwarding to secondary; manual monitoring; escalation process
- **Cloud provider outage**: Cross-region failover; manual detection; alternate monitoring; provider communication
- **Insufficient evidence**: Extended investigation; data retention check; legal guidance; case documentation
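
For the false positive storm case, one possible early-warning mechanism is a per-rule sliding-window counter that flags a rule firing far more often than usual. The Python sketch below is a hypothetical illustration: the `StormDetector` class and its thresholds are assumptions, not values from this document, and it assumes timestamps arrive in non-decreasing order for each rule.

```python
from collections import defaultdict, deque


class StormDetector:
    """Flag rules that fire more than `max_alerts` times within `window_s` seconds."""

    def __init__(self, max_alerts: int, window_s: float):
        self.max_alerts = max_alerts
        self.window_s = window_s
        self._seen = defaultdict(deque)  # rule_id -> timestamps inside the window

    def observe(self, rule_id: str, ts: float) -> bool:
        """Record one alert; return True if the rule is currently storming."""
        window = self._seen[rule_id]
        window.append(ts)
        # Drop timestamps that have aged out of the sliding window.
        while window and ts - window[0] > self.window_s:
            window.popleft()
        return len(window) > self.max_alerts


if __name__ == "__main__":
    detector = StormDetector(max_alerts=3, window_s=60)
    for t in (0, 10, 20, 30):
        print(t, detector.observe("brute-force-rule", t))
```

A rule flagged this way could be routed to a tuning queue or temporarily grouped into a single incident, so analysts are not paged for every individual alert while thresholds are adjusted.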
