---
name: quality-assurance
description: "Monitor, score, and improve support agent performance through conversation sampling, quality rubrics, calibration sessions, and targeted coaching plans. Use when building QA scoring frameworks, configuring conversation sampling, setting up calibration sessions, or creating agent coaching workflows. Triggers on phrases like 'quality assurance', 'QA scoring', 'conversation sampling', 'agent coaching', 'quality rubric', 'calibration session', 'performance monitoring', 'agent scorecard', 'QA audit', 'call monitoring', 'conversation scoring'."
---

# Quality Assurance & Agent Coaching

Systematically monitor and improve support quality through AI-assisted conversation scoring, structured quality rubrics, calibration sessions, and targeted coaching interventions.

## Workflow

### Phase 1: Quality Framework Design

1. **Define quality rubric dimensions**:
   - Accuracy (25%): Correct information, proper troubleshooting, right resolution
   - Empathy & tone (20%): Active listening, acknowledgment, appropriate warmth
   - Efficiency (15%): Handle time, minimal back-and-forth, direct path to resolution
   - Process adherence (15%): Followed SOP, used correct tools, documented properly
   - Communication (15%): Clarity, professionalism, grammar, structure
   - Compliance (10%): Data handling, policy adherence, disclosure requirements
2. **Set sampling strategy**:
   - AI pre-score all conversations (100% coverage)
   - Human QA review: minimum 5 conversations per agent per month
   - Weighted sampling: low-scoring conversations over-sampled, P0/P1 always sampled
   - Random sampling: 10% of all conversations for unbiased coverage
3. **Establish scoring calibration**:
   - Weekly calibration sessions among QA team
   - Score convergence target: raters scoring the same conversation land fewer than 5 points apart (on the 100-point scale)
   - Golden set: 50 benchmark conversations scored by leadership
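
The sampling rules in step 2 could be combined roughly as follows. This is a minimal sketch under stated assumptions, not a reference implementation: the `Conversation` fields, the `select_for_review` name, the priority labels, and the fixed seed are all hypothetical. It also assumes "over-sampling low scores" can be approximated by topping up each agent with their lowest AI-scored conversations first.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Conversation:
    conv_id: str
    agent_id: str
    ai_score: float   # AI pre-score on the 0-100 scale
    priority: str     # hypothetical labels such as "P0", "P1", "P3"


def select_for_review(conversations, min_per_agent=5, random_rate=0.10, seed=0):
    """Return the set of conversation IDs to route to human QA."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    selected = set()

    # P0/P1 conversations are always sampled.
    for c in conversations:
        if c.priority in ("P0", "P1"):
            selected.add(c.conv_id)

    # Unbiased random slice of all conversations.
    k = round(len(conversations) * random_rate)
    for c in rng.sample(conversations, k):
        selected.add(c.conv_id)

    # Top up each agent to the monthly minimum, lowest AI scores first,
    # so low-scoring conversations end up over-represented.
    by_agent = {}
    for c in conversations:
        by_agent.setdefault(c.agent_id, []).append(c)
    for convs in by_agent.values():
        already = sum(1 for c in convs if c.conv_id in selected)
        remaining = sorted(
            (c for c in convs if c.conv_id not in selected),
            key=lambda c: c.ai_score,
        )
        for c in remaining[: max(0, min_per_agent - already)]:
            selected.add(c.conv_id)

    return selected
```

An agent with fewer than `min_per_agent` conversations in the period simply has all of them selected; whether that is acceptable depends on team volume.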

### Phase 2: Scoring & Monitoring Pipeline

1. **AI pre-scoring** (all conversations):
   - Transcribe voice conversations (if phone channel)
   - Analyze text conversations and transcripts against rubric
   - Generate AI score per dimension + overall score
   - Flag conversations for human review (low score, policy risk, VIP customer)
2. **Human QA review** (sampled conversations):
   - QA analyst reviews full conversation + context
   - Scores against rubric with detailed comments per dimension
   - Identifies specific strengths and improvement areas
   - Flags coaching opportunities and best practice examples
3. **Score aggregation & reporting**:
   - Agent scorecard: rolling 30-day average by dimension
   - Team averages and benchmarks
   - Trend analysis: improving, stable, declining
   - Distribution analysis: identify outliers (top/bottom performers)
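
The scorecard aggregation in step 3 might look like the sketch below. The function names and the 1-point dead band used to separate "stable" from "improving" or "declining" are assumptions for illustration; the workflow above does not define a threshold.

```python
from datetime import date, timedelta
from statistics import mean


def rolling_average(scores, as_of, window_days=30):
    """Mean of scores dated within the window ending on `as_of` (inclusive).

    `scores` is a list of (date, score) pairs; returns None if the window is empty.
    """
    start = as_of - timedelta(days=window_days - 1)
    in_window = [s for d, s in scores if start <= d <= as_of]
    return mean(in_window) if in_window else None


def classify_trend(current, previous, tolerance=1.0):
    """Label the change between two rolling averages.

    ASSUMPTION: changes within +/- `tolerance` points count as "stable".
    """
    delta = current - previous
    if delta > tolerance:
        return "improving"
    if delta < -tolerance:
        return "declining"
    return "stable"
```

Comparing the current 30-day average with the one computed for the previous period gives the trend label shown on the scorecard.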

### Phase 3: Coaching & Improvement

1. **Coaching plan generation**:
   - Auto-generate personalized coaching recommendations based on weak dimensions
   - Link to specific training resources and example conversations
   - Set improvement targets with timeline
2. **1-on-1 coaching sessions**:
   - Monthly coaching sessions with each agent
   - Review scorecard trends and specific conversation examples
   - Role-play improvement scenarios
   - Set goals and track progress
3. **Recognition & development**:
   - Highlight top performers and best practices
   - Peer learning sessions featuring excellent conversations
   - Career development pathing based on consistent high scores
   - Corrective action process for persistent low scores
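
One way to derive coaching recommendations from weak dimensions (step 1) is to rank each dimension by its gap to a target score. The sketch below is illustrative only: `coaching_priorities`, the target of 80, the 20-point cutoff for HIGH priority, and the limit of two priorities per plan are assumed values, not part of the workflow.

```python
def coaching_priorities(dimension_scores, target=80.0, high_gap=20.0, limit=2):
    """Return (dimension, priority, gap) for the weakest dimensions, worst first.

    ASSUMPTION: a gap of `high_gap` points or more below target is HIGH priority;
    any smaller gap is MEDIUM.
    """
    gaps = [
        (name, target - score)
        for name, score in dimension_scores.items()
        if score < target
    ]
    gaps.sort(key=lambda item: item[1], reverse=True)
    return [
        (name, "HIGH" if gap >= high_gap else "MEDIUM", gap)
        for name, gap in gaps[:limit]
    ]
```

Each returned dimension can then be linked to the matching training resources and example conversations.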

## Templates

### Quality Scoring Rubric

```
QUALITY ASSURANCE SCORING RUBRIC
=================================
Version: [4.1] | Effective: [Date]

SCORING SCALE:
  5 — Exemplary (exceeds expectations consistently)
  4 — Strong (meets expectations with minor room for improvement)
  3 — Meets (acceptable, some areas need attention)
  2 — Below (frequent gaps, coaching required)
  1 — Critical (serious issue, immediate intervention needed)

DIMENSIONS & CRITERIA:
┌────────────────────┬────────┬──────────────────────────────────────────┐
│ Dimension          │ Weight │ Scoring Criteria                         │
├────────────────────┼────────┼──────────────────────────────────────────┤
│                    │        │ 5: Information 100% accurate, thorough    │
│ Accuracy           │ 25%    │ 4: Minor inaccuracy (no customer impact)  │
│                    │        │ 3: Some gaps but resolution correct       │
│                    │        │ 2: Misinformation provided, corrected     │
│                    │        │ 1: Wrong resolution, customer harmed      │
├────────────────────┼────────┼──────────────────────────────────────────┤
│                    │        │ 5: Exceptional empathy, personalized      │
│ Empathy & Tone     │ 20%    │ 4: Warm, professional, acknowledges       │
│                    │        │ 3: Polite but generic/transactional       │
│                    │        │ 2: Flat tone, minimal acknowledgment      │
│                    │        │ 1: Dismissive, condescending, or rude     │
├────────────────────┼────────┼──────────────────────────────────────────┤
│                    │        │ 5: Minimal back-and-forth, <5 min handle  │
│ Efficiency         │ 15%    │ 4: Reasonable handle time, few loops      │
│                    │        │ 3: Some inefficiency but acceptable       │
│                    │        │ 2: Excessive handle time, multiple loops  │
│                    │        │ 1: Wasteful, could have resolved faster   │
├────────────────────┼────────┼──────────────────────────────────────────┤
│                    │        │ 5: All SOPs followed, excellent docs      │
│ Process Adherence  │ 15%    │ 4: Minor deviation (no impact)            │
│                    │        │ 3: Some steps skipped but outcome OK      │
│                    │        │ 2: Significant SOP deviation              │
│                    │        │ 1: Critical process violation             │
├────────────────────┼────────┼──────────────────────────────────────────┤
│                    │        │ 5: Crystal clear, well-structured         │
│ Communication      │ 15%    │ 4: Clear with minor grammar issues        │
│                    │        │ 3: Understandable but could be sharper    │
│                    │        │ 2: Confusing, poor structure              │
│                    │        │ 1: Incomprehensible or unprofessional     │
├────────────────────┼────────┼──────────────────────────────────────────┤
│                    │        │ 5: Exceeds compliance requirements        │
│ Compliance         │ 10%    │ 4: Fully compliant                        │
│                    │        │ 3: Minor documentation gap                │
│                    │        │ 2: Compliance lapse (corrected)           │
│                    │        │ 1: Serious compliance violation           │
└────────────────────┴────────┴──────────────────────────────────────────┘

OVERALL SCORE CALCULATION:
  Weighted sum of dimension scores (each 1-5, weighted as above), rescaled to /100
  90-100: Exemplary | 80-89: Strong | 70-79: Meets | 60-69: Below | <60: Critical

MINIMUM PASSING SCORE: 70
CRITICAL FAIL (any dimension scored 1): Immediate coaching required
COMPLIANCE SCORE < 3: Automatic manager notification
```
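
The overall score calculation can be sketched in code. The rubric does not say how a 1-5 rating converts to the 100-point scale, so this example assumes a simple linear mapping (rating × 20); the dictionary keys and the `score_conversation` name are also hypothetical.

```python
# Dimension weights in percent, taken from the rubric above.
WEIGHTS = {
    "accuracy": 25,
    "empathy_tone": 20,
    "efficiency": 15,
    "process_adherence": 15,
    "communication": 15,
    "compliance": 10,
}

# Lower bound of each band; anything below 60 is "Critical".
BANDS = [(90, "Exemplary"), (80, "Strong"), (70, "Meets"), (60, "Below")]


def score_conversation(ratings):
    """Combine 1-5 ratings per dimension into an overall /100 result.

    ASSUMPTION: each rating maps linearly to 0-100 via rating * 20.
    """
    if set(ratings) != set(WEIGHTS):
        raise ValueError("ratings must cover exactly the rubric dimensions")
    if any(r not in (1, 2, 3, 4, 5) for r in ratings.values()):
        raise ValueError("each rating must be an integer from 1 to 5")

    # Integer weights keep the arithmetic exact until the final division.
    overall = sum(WEIGHTS[d] * ratings[d] for d in WEIGHTS) * 20 / 100
    band = next((label for floor, label in BANDS if overall >= floor), "Critical")
    return {
        "overall": overall,
        "band": band,
        "passed": overall >= 70,                                 # minimum passing score
        "critical_fail": any(r == 1 for r in ratings.values()),  # any dimension scored 1
        "notify_manager": ratings["compliance"] < 3,             # compliance score below 3
    }
```

Under this assumed mapping, a conversation rated 3 on every dimension scores 60, which falls below the passing score of 70. Teams that want a straight-3 conversation to pass would need a different conversion.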

### Agent Coaching Plan Template

```
AGENT COACHING PLAN — [Agent Name]
====================================
Employee ID: [ID] | Team: [Team] | Period: [Date Range]
Prepared by: [QA Manager] | Reviewed by: [Team Lead]

CURRENT PERFORMANCE:
  Overall QA Score: 68/100 (Team avg: 78)
  30-day trend: ↓ declining (was 74 three weeks ago)

DIMENSION BREAKDOWN:
  Accuracy:          72/100 [███████████░░░░░] — Needs attention
  Empathy & Tone:    85/100 [███████████████░] — Strong
  Efficiency:        78/100 [██████████████░░] — Acceptable
  Process Adherence: 55/100 [█████████░░░░░░░] — ⚠ BELOW TARGET
  Communication:     75/100 [█████████████░░░] — Acceptable
  Compliance:        90/100 [████████████████] — Strong

COACHING PRIORITIES:
  1. Process Adherence (priority: HIGH)
     Issue: Skipping verification step in 40% of billing tickets
     Impact: 3 compliance flags in past 30 days
     Action: Complete "Billing SOP Refresher" module by 2025-01-25
     Resource: LMS Course ID: BILL-SOP-2024-v3

  2. Accuracy (priority: MEDIUM)
     Issue: Incorrect API troubleshooting steps in 2 conversations
     Impact: Extended handle time, customer follow-up required
     Action: Review API troubleshooting playbook; shadow 2 senior agent calls
     Resource: Internal Wiki: /api-troubleshooting-guide

IMPROVEMENT TARGETS:
  30-day goal: Overall score ≥ 75 (from 68)
  60-day goal: Overall score ≥ 80 (team target)
  90-day goal: Process adherence ≥ 80 (from 55)

FOLLOW-UP SCHEDULE:
  Week 1: 1-on-1 coaching session (Jan 20) — review plan, set expectations
  Week 2: Check-in (Jan 27) — review LMS completion, shadowing feedback
  Week 4: Scorecard review (Feb 10) — assess improvement trajectory
  Week 8: Full evaluation (Mar 10) — determine next steps

ACCOUNTABILITY:
  Agent acknowledgment: [signature/date]
  Manager acknowledgment: [signature/date]
  Escalation path: If no improvement by 60 days → formal performance plan
```

## Integration Points

- **Ticketing systems**: Zendesk, Freshdesk, Intercom (conversation data)
- **QA platforms**: MaestroQA, Playvox, Quality Manager, Reflect
- **LMS**: Lessonly, Docebo, Cornerstone (training resources)
- **Analytics**: Tableau, Power BI (QA dashboards, trend analysis)
- **CRM**: Salesforce, HubSpot (agent performance records)
- **Communication**: Slack, Teams (coaching notifications, feedback)
- **Speech analytics**: NICE inContact, CallMiner (voice transcription, analysis)
- **HR systems**: Workday, BambooHR (performance records, development plans)
- **AI scoring**: Custom ML models, OpenAI GPT (conversation pre-scoring)

## Edge Cases

| Scenario | Handling |
|----------|----------|
| New agent (<30 days) with low scores | Use ramp-specific rubric; weight learning agility; extended coaching timeline |
| Agent disputes QA score | Formal appeal process: re-review by senior QA; calibration committee decision |
| QA scorer inconsistency across team | Weekly calibration sessions; golden set benchmarking; rater reliability tracking |
| Agent QA scores consistently above team average but CSAT is low | Investigate gap: speed vs. quality; customer expectations mismatch |
| VIP conversation scored poorly but outcome was positive | Contextual scoring: weigh outcome alongside process; note as exception |
| Agent improvement plateaus despite coaching | Escalate to manager; consider role fit assessment; alternative development path |
| High volume prevents adequate human sampling | Rely on AI pre-scoring for full coverage; focus human QA on flagged conversations |
| Cultural/language differences affect communication scores | Use culturally-aware rubric adjustments; native-language QA raters |
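
For the scorer-inconsistency case, rater reliability can be tracked per calibration session. The sketch below reads the convergence target as "largest gap between any two raters is under 5 points", which is one interpretation of the target rather than a definition from this document. The function name, rater names, and data shapes are hypothetical.

```python
def calibration_report(scores, golden=None, max_spread=5.0):
    """Summarize rater agreement for one calibration session.

    scores: {conversation_id: {rater_name: overall_score}}
    golden: optional {conversation_id: benchmark_score} from the golden set
    """
    report = {}
    for conv_id, by_rater in scores.items():
        values = list(by_rater.values())
        spread = max(values) - min(values)  # gap between highest and lowest rater
        entry = {"spread": spread, "converged": spread < max_spread}
        if golden and conv_id in golden:
            # Largest distance of any rater from the benchmark score.
            entry["max_golden_gap"] = max(abs(v - golden[conv_id]) for v in values)
        report[conv_id] = entry
    return report
```

Conversations where `converged` is false are natural candidates for discussion in the next weekly session.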

## Output

### QA Performance Dashboard

```
QUALITY ASSURANCE DASHBOARD — Monthly Summary
===============================================
Period: January 2025 | Conversations Reviewed: 1,247

TEAM QA PERFORMANCE:
  Average QA Score: 78.2/100 [███████████████░░░] (Target: 80.0)
  Median QA Score: 79.0/100
  Score distribution:
    90-100 (Exemplary):  8 agents  [21%] ██████
    80-89  (Strong):    18 agents  [46%] ██████████████████████████
    70-79  (Meets):      9 agents  [23%] ██████████
    60-69  (Below):      3 agents  [ 8%] ███
    <60    (Critical):   1 agent   [ 3%] █

DIMENSION AVERAGES:
  Accuracy:          80.1/100 [████████████████░░] ← Above target
  Empathy & Tone:    79.3/100 [████████████████░░]
  Efficiency:        77.8/100 [███████████████░░░]
  Process Adherence: 76.2/100 [███████████████░░░] ← Needs attention
  Communication:     77.5/100 [███████████████░░░]
  Compliance:        92.4/100 [██████████████████] ← Excellent

MONTHLY TRENDS:
  Overall score: ↑ +2.1 pts (from 76.1 in December)
  Process adherence: ↓ -1.3 pts (from 77.5 in December) ⚠
  Compliance: ↑ +0.8 pts (from 91.6 in December)

TOP PERFORMERS (This Month):
  1. Sarah Kim — 94/100 | 15 conversations reviewed
  2. Tom Chen — 91/100 | 12 conversations reviewed
  3. Lisa Park — 89/100 | 18 conversations reviewed

COACHING ALERTS:
  3 agents with active coaching plans (avg improvement: +4.2 pts)
  1 agent escalated to formal performance plan (no improvement after 60 days)
  5 conversations flagged for policy review (potential compliance gaps)
```
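
The score distribution block of the dashboard could be produced along these lines. This is a small sketch that assumes one average score per agent on the 0-100 scale; `band_distribution` and the input shape are hypothetical.

```python
# Lower bound of each band, matching the ranges shown in the dashboard.
BANDS = [(90, "Exemplary"), (80, "Strong"), (70, "Meets"), (60, "Below"), (0, "Critical")]


def band_distribution(agent_scores):
    """Map band label -> (agent count, percent of agents) for 0-100 scores."""
    counts = {label: 0 for _, label in BANDS}
    for score in agent_scores.values():
        label = next(lbl for floor, lbl in BANDS if score >= floor)
        counts[label] += 1
    total = len(agent_scores)
    return {
        label: (n, round(100 * n / total, 1) if total else 0.0)
        for label, n in counts.items()
    }
```

Because each percentage is rounded independently, the displayed values may not sum to exactly 100.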
