Support AI Skill

SLA Management

Define, monitor, and optimize service level agreements across support operations including SLA design, performance tracking, escalation management, breach prevention, and continuous SLA improvement. Use when creating SLAs, monitoring SLA compliance, managin...

SLA Management & Service Quality

Design, track, and optimize service level agreements to ensure consistent service quality and customer satisfaction.

Workflow

1. SLA Design & Definition

  1. Service level target setting:
  2. SLA scope and applicability:
  3. SLA measurement framework:

2. SLA Monitoring & Tracking

  1. Real-time SLA monitoring:
  2. Automated SLA enforcement:
  3. SLA reporting and analytics:

3. Escalation Management

  1. Escalation framework:
  2. Escalation execution:
  3. Escalation prevention:

4. SLA Breach Management

  1. Breach response and recovery:
  2. Breach analysis and prevention:
  3. Service credit management:

5. SLA Continuous Improvement

  1. SLA target optimization:
  2. Performance improvement initiatives:
  3. SLA review and renegotiation:

Templates & Frameworks

SLA Matrix

SERVICE LEVEL AGREEMENT MATRIX
===============================

PRIORITY DEFINITIONS:
  P1 (Critical): System down, no workaround, major business impact, multiple users affected
  P2 (High): System degraded, limited workaround, significant business impact
  P3 (Medium): System functional with issues, workaround available, moderate impact
  P4 (Low): Minor issue, cosmetic, informational, minimal impact

RESPONSE & RESOLUTION TARGETS:

  Priority | Response | Update   | Resolution      | Coverage
  ---------|----------|----------|-----------------|---------------
  P1       | 15 min   | 30 min   | 4 hours         | 24/7
  P2       | 1 hour   | 2 hours  | 8 hours         | 24/7
  P3       | 4 hours  | 8 hours  | 3 business days | Business hours
  P4       | 8 hours  | 24 hours | 5 business days | Business hours

SERVICE AVAILABILITY TARGETS:
  Core services: 99.9% uptime (max 43 min downtime/month)
  Standard services: 99.5% uptime (max 3.6 hours downtime/month)
  Non-critical services: 99.0% uptime (max 7.2 hours downtime/month)

SLA CLOCKS:
  Start: Ticket created and assigned
  Pause: Awaiting customer response, scheduled maintenance, vendor dependency (with notification)
  Resume: Customer response received, maintenance complete, vendor resolution
  Stop: Ticket resolved and confirmed by customer

SERVICE CREDIT TERMS:
  Core services from 99.5% to below 99.9%: 10% monthly credit per 0.1% below target
  Core services from 99.0% to below 99.5%: 25% monthly credit
  Core services below 99.0%: 50% monthly credit
  Maximum credit: 100% of monthly fee
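
The SLA CLOCKS rules and the availability targets in the matrix above can be sketched in code. This is a minimal illustration, not a definitive implementation: the names `RESOLUTION_TARGETS`, `sla_elapsed`, and `downtime_budget_minutes` are hypothetical, pause intervals are assumed not to overlap, a month is assumed to have 30 days, and only the 24/7 priorities (P1, P2) are covered because business-hours targets would need a working calendar.

```python
from datetime import datetime, timedelta

# Resolution targets for the 24/7 priorities in the matrix.
# P3/P4 use business days and are deliberately left out of this sketch.
RESOLUTION_TARGETS = {"P1": timedelta(hours=4), "P2": timedelta(hours=8)}


def sla_elapsed(start, now, pauses):
    """Return the SLA clock time between start and now, excluding pauses.

    `pauses` is a list of (pause_start, pause_end) tuples. pause_end may be
    None for a pause that is still open. Intervals are assumed not to overlap.
    """
    paused = timedelta()
    for pause_start, pause_end in pauses:
        begin = max(pause_start, start)
        end = min(pause_end or now, now)
        if end > begin:
            paused += end - begin
    return (now - start) - paused


def downtime_budget_minutes(uptime_pct, days_in_month=30):
    """Maximum monthly downtime, in minutes, allowed by an uptime target."""
    return (1 - uptime_pct / 100) * days_in_month * 24 * 60


# Usage: a P1 ticket opened at 09:00, paused 10:00-11:30 awaiting the customer.
start = datetime(2025, 4, 1, 9, 0)
now = datetime(2025, 4, 1, 14, 0)
pauses = [(datetime(2025, 4, 1, 10, 0), datetime(2025, 4, 1, 11, 30))]
elapsed = sla_elapsed(start, now, pauses)
breached = elapsed > RESOLUTION_TARGETS["P1"]
```

In this example five wall-clock hours have passed, but only 3.5 hours count against the 4-hour P1 target, so the ticket is not yet in breach. For a 99.9% target, `downtime_budget_minutes(99.9)` comes to roughly 43.2 minutes under the 30-day assumption.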

SLA Breach Root Cause Template

SLA BREACH ANALYSIS — [Ticket ID, Date]
=========================================

INCIDENT OVERVIEW:
  Ticket ID: [ID]
  Priority: [P1/P2/P3/P4]
  SLA target: [X hours]
  Actual resolution time: [Y hours]
  Breach duration: [Z hours]
  Customer impact: [description]

ROOT CAUSE ANALYSIS:
  Primary cause: [staffing gap / process gap / system issue / knowledge gap / complexity]
  Contributing factors: [list]
  Timeline:
    [time] — Ticket created
    [time] — Initial response (SLA: X min, Actual: Y min)
    [time] — Escalation triggered
    [time] — Resolution attempted
    [time] — Resolution confirmed

CORRECTIVE ACTIONS:
  Immediate: [actions taken to prevent immediate recurrence]
  Short-term: [process/tool improvement — within 30 days]
  Long-term: [systemic improvement — within 90 days]

PREVENTION STRATEGY:
  Knowledge base article created/updated: [yes/no, link]
  Process change implemented: [yes/no, description]
  Training delivered: [yes/no, topic]
  Tool/system enhancement: [yes/no, description]

SERVICE CREDIT:
  Credit authorized: [yes/no]
  Credit amount: [percentage or value]
  Customer communication: [completed date]

LESSONS LEARNED:
  [Key takeaways and organizational learning]
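
The INCIDENT OVERVIEW fields in the template above (SLA target, actual resolution time, breach duration) are related by simple arithmetic, sketched below. The `BreachRecord` class and its field names are hypothetical, and the sketch assumes plain wall-clock time with no SLA clock pauses.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class BreachRecord:
    """Illustrative container for the incident overview fields."""
    ticket_id: str
    priority: str
    sla_target: timedelta
    created_at: datetime
    resolved_at: datetime

    @property
    def actual_resolution(self):
        # Y in the template: time from ticket creation to confirmed resolution.
        return self.resolved_at - self.created_at

    @property
    def breach_duration(self):
        # Z in the template: time beyond the target, zero if the SLA was met.
        return max(self.actual_resolution - self.sla_target, timedelta())


# Usage: a P1 ticket with a 4-hour target that took 6.5 hours to resolve.
record = BreachRecord(
    ticket_id="TICKET-1",  # placeholder ID for illustration only
    priority="P1",
    sla_target=timedelta(hours=4),
    created_at=datetime(2025, 4, 1, 9, 0),
    resolved_at=datetime(2025, 4, 1, 15, 30),
)
```

Here `record.breach_duration` is 2.5 hours. A ticket resolved within its target yields a zero duration rather than a negative one.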

Integration Points

Edge Cases

Output

SLA Performance Dashboard

SLA PERFORMANCE — April 2025
===============================

OVERALL SLA COMPLIANCE:
  Total tickets: 3,420
  Within SLA: 3,098 (90.6%) ⚠ (target: >92%)
  Breached: 322 (9.4%)
  Trend: ↓ 1.2 percentage points from last month

COMPLIANCE BY PRIORITY:
  P1 (Critical): 89.1% compliance ⚠ (target: >95%)
  P2 (High): 93.4% compliance ✓ (target: >92%)
  P3 (Medium): 94.7% compliance ✓ (target: >90%)
  P4 (Low): 96.2% compliance ✓ (target: >90%)

RESPONSE TIME PERFORMANCE:
  Avg P1 response: 12 min ✓ (target: <15 min)
  Avg P2 response: 48 min ✓ (target: <60 min)
  Avg P3 response: 3.2 hours ✓ (target: <4 hours)
  Avg P4 response: 5.8 hours ✓ (target: <8 hours)

RESOLUTION TIME PERFORMANCE:
  Avg P1 resolution: 3.1 hours ✓ (target: <4 hours)
  Avg P2 resolution: 6.8 hours ✓ (target: <8 hours)
  Avg P3 resolution: 2.4 business days ✓ (target: <3 business days)
  Avg P4 resolution: 3.8 business days ✓ (target: <5 business days)

ESCALATION METRICS:
  Escalations this month: 87
  Escalation-to-resolution: 4.2 hours avg
  Preventable escalations: 23 (26%)
  Customer-requested escalations: 12

SERVICE AVAILABILITY:
  Core services: 99.94% uptime ✓
  Standard services: 99.67% uptime ✓
  Non-critical: 99.31% uptime ✓
  Maintenance windows: 4 (all within schedule)

BREACH ANALYSIS:
  Breach root causes: Staffing (34%), System issue (28%), Complexity (22%), Process (16%)
  Recurring breach areas: Database issues (12%), API integration (8%)
  Service credits issued: $4,200 (0.8% of monthly revenue)

TEAM PERFORMANCE:
  Top performer: Team Alpha — 97.2% SLA compliance
  Needs improvement: Team Delta — 84.1% SLA compliance
  First contact resolution: 67% ⚠ (target: >70%)
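
The compliance percentages and the ✓/⚠ markers in the dashboard above could be derived as in the sketch below. The helper names `compliance_pct` and `status_flag` are hypothetical, and rounding to one decimal place is an assumption based on how the dashboard displays its figures.

```python
def compliance_pct(within_sla, total):
    """Share of tickets resolved within SLA, as a percentage with one decimal."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * within_sla / total, 1)


def status_flag(value, target, higher_is_better=True):
    """Return a check mark when the target is met, a warning sign otherwise.

    Compliance and uptime targets are lower bounds (higher is better);
    response and resolution time targets are upper bounds (lower is better).
    """
    met = value > target if higher_is_better else value < target
    return "✓" if met else "⚠"


# Usage with the overall figures from the dashboard.
overall = compliance_pct(3098, 3420)
overall_flag = status_flag(overall, 92)
p1_response_flag = status_flag(12, 15, higher_is_better=False)
```

With the dashboard's totals, `overall` is 90.6 and is flagged with a warning against the 92% target, while the 12-minute average P1 response passes its 15-minute target.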