Service Continuity

Plan and execute business continuity and disaster recovery for IT services. Use when developing BCP/DR plans, running disaster recovery tests, managing failover processes, defining RTO/RPO targets, or coordinating recovery operations. Triggers on phrases li...

Business Continuity & Disaster Recovery

Ensure organizational resilience through comprehensive business continuity and disaster recovery planning and execution.

Workflow

1. Business Impact Analysis (BIA)

  1. Critical service identification:
  2. Impact assessment and quantification:
  3. Recovery objective definition:

2. Continuity & DR Plan Development

  1. Disaster scenario planning:
  2. Recovery strategy selection:
  3. Recovery runbook development:

3. DR Infrastructure & Technology

  1. Backup strategy and implementation:
  2. Replication and failover infrastructure:
  3. Data protection validation:

4. Testing & Exercises

  1. Testing program design:
  2. Test execution and documentation:
  3. Test types and scope:

5. Plan Maintenance & Continuous Improvement

  1. Plan review and update:
  2. Contact and resource management:
  3. Training and awareness:
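
The data protection validation step in stage 3 can be partly automated. The following is a minimal sketch, assuming each backup archive has a SHA-256 checksum recorded at backup time; the function names and the idea of a stored checksum manifest are illustrative assumptions, not part of any specific backup product.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large archives are never loaded fully into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_is_intact(path: Path, expected_sha256: str) -> bool:
    """Compare the archive against the checksum recorded at backup time.

    `expected_sha256` is assumed to come from a manifest written by the backup job.
    """
    return sha256_of(path) == expected_sha256.lower()
```

A matching checksum only shows that the archive is unchanged since it was written. It does not show that the data restores cleanly, so a periodic test restore is still needed.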

Templates & Frameworks

Business Continuity Plan Summary

BUSINESS CONTINUITY PLAN — 2025
================================

CRITICAL SERVICES RECOVERY PRIORITY:
  1. Core network and DNS — RTO: 30 min, RPO: 0 min
  2. Customer-facing web applications — RTO: 1 hour, RPO: 15 min
  3. CRM and sales systems — RTO: 2 hours, RPO: 30 min
  4. Email and collaboration — RTO: 4 hours, RPO: 1 hour
  5. Internal applications — RTO: 8 hours, RPO: 4 hours
  6. Reporting and analytics — RTO: 24 hours, RPO: 8 hours

DISASTER RECOVERY SITE:
  Primary data center: [Location, Provider]
  DR site: [Location, Provider] (hot standby)
  Cloud failover: AWS us-east-1 → us-west-2
  Estimated failover time: 45 minutes (automated)

EMERGENCY CONTACT LIST:
  Incident Commander: [Name, Phone, Email]
  IT Director: [Name, Phone, Email]
  Security Lead: [Name, Phone, Email]
  Communications Lead: [Name, Phone, Email]
  Executive Sponsor: [Name, Phone, Email]
  Key Vendors: [List with emergency contact numbers]

RECOVERY DECISION FRAMEWORK:
  If single system failure → restart/patch in place
  If data center failure → failover to DR site
  If cloud region failure → failover to secondary region
  If cyberattack detected → isolate, contain, investigate, restore from clean backup
  If extended outage (>4 hours) → activate BCP, shift to remote work
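
The decision framework above could be encoded as a lookup table so that runbook tooling returns the same actions every time. This is a sketch only: the names `Incident`, `ACTIONS`, and `recovery_actions` are hypothetical, and treating the extended-outage rule as an addition to the incident-specific action is an assumption about how the rules combine.

```python
from enum import Enum
from typing import List


class Incident(Enum):
    SINGLE_SYSTEM_FAILURE = "single system failure"
    DATA_CENTER_FAILURE = "data center failure"
    CLOUD_REGION_FAILURE = "cloud region failure"
    CYBERATTACK = "cyberattack detected"


# Action text is copied from the recovery decision framework template.
ACTIONS = {
    Incident.SINGLE_SYSTEM_FAILURE: "restart/patch in place",
    Incident.DATA_CENTER_FAILURE: "failover to DR site",
    Incident.CLOUD_REGION_FAILURE: "failover to secondary region",
    Incident.CYBERATTACK: "isolate, contain, investigate, restore from clean backup",
}

BCP_ACTIVATION_THRESHOLD_HOURS = 4  # "extended outage (>4 hours)"


def recovery_actions(incident: Incident, outage_hours: float = 0.0) -> List[str]:
    """Return the incident-specific action, plus BCP activation for extended outages."""
    actions = [ACTIONS[incident]]
    if outage_hours > BCP_ACTIVATION_THRESHOLD_HOURS:
        # Assumption: the extended-outage rule applies on top of the incident rule.
        actions.append("activate BCP, shift to remote work")
    return actions
```

For example, `recovery_actions(Incident.DATA_CENTER_FAILURE, outage_hours=5)` returns both the DR site failover and the BCP activation step.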

DR Test Checklist

DISASTER RECOVERY TEST CHECKLIST
================================

PRE-TEST PREPARATION:
  [ ] Test scope and objectives defined
  [ ] Stakeholder notification sent (7 days in advance)
  [ ] Test environment validated
  [ ] Current production state documented and backed up
  [ ] DR team briefed and assigned roles
  [ ] Communication channels tested
  [ ] Timing and measurement tools prepared

TEST EXECUTION:
  [ ] Test start time recorded: [HH:MM]
  [ ] Failover initiated — time recorded
  [ ] DNS update and propagation verified
  [ ] Critical services restored (check each service):
    [ ] Core network/DNS — restored at [HH:MM] — time: [X] min
    [ ] Web applications — restored at [HH:MM] — time: [X] min
    [ ] Database — restored at [HH:MM] — time: [X] min
    [ ] Email/collaboration — restored at [HH:MM] — time: [X] min
  [ ] Data integrity verified (RPO validation)
  [ ] Application functionality verified
  [ ] User access validated (sample test)
  [ ] Failback to primary executed (if applicable)

POST-TEST:
  [ ] Test end time recorded: [HH:MM]
  [ ] RTO achieved: [Yes/No] — Actual: [X] min vs Target: [Y] min
  [ ] RPO achieved: [Yes/No] — Actual data loss: [X] min vs Target: [Y] min
  [ ] Issues documented and categorized
  [ ] Lessons learned captured
  [ ] Improvement action items assigned
  [ ] DR plan updated based on findings
  [ ] Test report distributed to stakeholders
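
The RTO and RPO lines in the post-test section reduce to simple time arithmetic on the recorded [HH:MM] values. The sketch below assumes the test lasts less than 24 hours and that times are recorded in 24-hour HH:MM form; the function names and the sample readings are illustrative.

```python
from datetime import datetime, timedelta


def minutes_between(start: str, end: str) -> float:
    """Elapsed minutes between two HH:MM readings (assumes less than 24 hours apart)."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    if delta < timedelta(0):
        delta += timedelta(days=1)  # the test ran past midnight
    return delta.total_seconds() / 60


def objective_met(actual_min: float, target_min: float) -> str:
    """Format a result line in the style of the checklist's RTO/RPO entries."""
    verdict = "Yes" if actual_min <= target_min else "No"
    return f"{verdict} (Actual: {actual_min:g} min vs Target: {target_min:g} min)"


# RTO: failover initiated at 09:00, web applications restored at 09:48, target 60 min.
print(objective_met(minutes_between("09:00", "09:48"), 60))
# Yes (Actual: 48 min vs Target: 60 min)

# RPO: last replicated transaction at 08:50, failure declared at 09:00, target 15 min.
print(objective_met(minutes_between("08:50", "09:00"), 15))
# Yes (Actual: 10 min vs Target: 15 min)
```

The same helper covers both checks: RTO measures from failover start to service restoration, while RPO measures from the last recoverable data point to the failure.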

Integration Points

Edge Cases

Output

BCP/DR Status Dashboard

BUSINESS CONTINUITY STATUS — April 2025
=========================================

RECOVERY READINESS:
  Plan last reviewed: 2025-04-01 (current ✓)
  Last full DR test: 2025-03-15 (on schedule ✓)
  Next scheduled test: 2025-06-15 (quarterly)
  Plan version: 4.2 (distributed to 47 stakeholders)

RECOVERY OBJECTIVES STATUS:
  Service             | RTO Target | Last Test Result | Status
  --------------------|------------|------------------|--------
  Core Network        | 30 min     | 22 min           | ✓
  Web Applications    | 1 hour     | 48 min           | ✓
  CRM Systems         | 2 hours    | 1h 45min         | ✓
  Email/Collaboration | 4 hours    | 3h 20min         | ✓
  Internal Apps       | 8 hours    | 6h 45min         | ✓
  Analytics           | 24 hours   | 18 hours         | ✓

BACKUP STATUS:
  Backup success rate: 99.2%
  Last successful full backup: 2025-04-15
  Backup integrity test (last): Passed ✓
  Immutable backup coverage: 100% critical systems

RECOVERY INFRASTRUCTURE:
  DR site status: Hot standby — synchronized ✓
  Replication lag: 12 seconds (target: <60 seconds ✓)
  Cloud failover ready: ✓
  DNS failover configured: ✓
  Emergency credentials: Validated ✓

CONTACT VALIDATION:
  Emergency contacts validated: 94% (3 overdue)
  Vendor contacts current: 100%
  Communication channels tested: within the last 7 days ✓

IMPROVEMENT ACTIONS:
  [ ] Update DR contact list (3 overdue) — Due: April 18
  [ ] Test email failover (next quarterly test) — Due: June 15
  [ ] Renew cloud burst agreement — Due: May 30
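
The status column in the recovery objectives table can be derived rather than entered by hand. The sketch below assumes durations are written in the mixed styles used in the dashboard ("30 min", "2 hours", "1h 45min"); the parser and function names are illustrative and handle only hours and minutes.

```python
import re

# Matches a number followed by an hour or minute unit, e.g. "1h", "2 hours", "45min".
_TOKEN = re.compile(r"(\d+)\s*(hours?|h|minutes?|mins?)")


def to_minutes(text: str) -> int:
    """Convert a duration such as '1h 45min' or '24 hours' to whole minutes."""
    parts = _TOKEN.findall(text.lower())
    if not parts:
        raise ValueError(f"unrecognized duration: {text!r}")
    return sum(int(n) * (60 if unit.startswith("h") else 1) for n, unit in parts)


def rto_met(target: str, last_test: str) -> bool:
    """True when the last test result is within the RTO target."""
    return to_minutes(last_test) <= to_minutes(target)


# Rows taken from the dashboard: (service, RTO target, last test result).
rows = [
    ("Core Network", "30 min", "22 min"),
    ("CRM Systems", "2 hours", "1h 45min"),
    ("Analytics", "24 hours", "18 hours"),
]
for service, target, last_test in rows:
    print(service, "OK" if rto_met(target, last_test) else "MISSED")
```

Deriving the status from the raw values keeps the table honest when a target is tightened or a later test runs slower.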

Trigger Phrases

"business continuity", "disaster recovery", "DR plan", "BCP", "failover", "RTO", "RPO", "site recovery", "backup strategy", "failover testing", "business impact analysis", "contingency planning", "recovery plan", "drill exercise", "site failover"