---
name: ticket-categorization
description: "Automatically categorize incoming support tickets by issue type, product area, and department for proper routing. Use when configuring NLP-based ticket classification, setting up multi-label categorization, defining custom categories, training classification models, or monitoring categorization accuracy. Triggers on phrases like 'ticket categorization', 'auto-classify tickets', 'issue classification', 'ticket tagging', 'NLP ticket routing', 'category model training', 'multi-label classification', 'issue type detection'."
---

# Ticket Categorization & Classification

Automatically classify and tag incoming support tickets using NLP and machine learning to ensure accurate routing and consistent data quality.

## Workflow

### Phase 1: Category Taxonomy Design

1. **Define primary categories** aligned with support teams:
   - Technical issues (bug, error, performance, integration)
   - Account & billing (payment, subscription, refund, upgrade)
   - Product questions (how-to, feature inquiry, documentation)
   - Feature requests & feedback
   - Security & privacy concerns
   - General inquiries
2. **Define sub-categories** for granular routing:
   - Map each primary category to specific product areas (API, UI, mobile app, admin panel)
   - Define severity indicators within categories (data loss > feature broken > cosmetic issue)
   - Create custom tags for cross-cutting concerns (compliance, VIP, escalation)
3. **Establish routing rules**:
   - Category → team queue mapping
   - Priority-based overrides (security issues always escalate)
   - Language-based routing for multilingual teams

### Phase 2: Model Training & Configuration

1. **Gather training data**:
   - Export 1,000+ historically categorized tickets from the ticketing system
   - Annotate manually if historical data is insufficient
   - Balance classes to avoid bias toward high-volume categories
2. **Configure NLP pipeline**:
   - Text preprocessing: normalization, stop-word removal, stemming or lemmatization
   - Feature extraction: TF-IDF or dense embeddings (BERT, sentence-transformers)
   - Multi-label classification model training
   - Confidence threshold calibration per category
3. **Integration setup**:
   - Webhook or API connection to ticketing system (Zendesk, Freshdesk, Intercom)
   - Real-time inference pipeline (target: <2 seconds per ticket)
   - Fallback to manual categorization when confidence < threshold
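
Per-category thresholds and the manual fallback can be expressed as one small decision function applied to the model's scores. This is a sketch under stated assumptions: `select_labels`, `MANUAL_REVIEW`, and the default threshold of 0.80 are hypothetical, and the scores are assumed to be calibrated per-category confidences in the range 0 to 1.

```python
MANUAL_REVIEW = "manual_review"


def select_labels(
    scores: dict[str, float],
    thresholds: dict[str, float],
    default_threshold: float = 0.80,  # assumed default for categories without an entry
) -> list[str]:
    """Keep every category whose score meets its own threshold (multi-label).

    Returns [MANUAL_REVIEW] when no category is confident enough.
    """
    accepted = [
        (category, score)
        for category, score in scores.items()
        if score >= thresholds.get(category, default_threshold)
    ]
    if not accepted:
        return [MANUAL_REVIEW]
    # Highest score first, so the first entry can serve as the primary category.
    accepted.sort(key=lambda item: item[1], reverse=True)
    return [category for category, _ in accepted]
```

With thresholds of 0.95 for "Refund Request" and 0.85 for "Bug/Error", scores of 0.90 and 0.91 respectively keep only "Bug/Error", because the refund score does not clear its stricter threshold.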

### Phase 3: Live Classification & Continuous Improvement

1. **Real-time categorization**:
   - Ingest ticket content and metadata: subject line, body, attachments, customer tier, channel
   - Run classification model → assign primary category + sub-category + custom tags
   - Store confidence score alongside categorization
   - Route to appropriate queue based on category mapping
2. **Human-in-the-loop refinement**:
   - Flag low-confidence predictions for agent review
   - Capture agent recategorizations as training signals
   - Weekly model retraining with new labeled data
3. **Quality monitoring**:
   - Track categorization accuracy by category (target: >90%)
   - Monitor drift: changes in category distribution over time
   - Alert on new issue types not matching existing taxonomy
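
One simple way to monitor drift in the category distribution is to compare a baseline window with the current window using total variation distance. The sketch below assumes that choice of metric; the function names and the 0.15 alert limit are hypothetical and would need tuning against real ticket volumes.

```python
from collections import Counter


def category_distribution(labels: list[str]) -> dict[str, float]:
    """Convert a list of assigned categories into relative frequencies."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


def total_variation_distance(p: dict[str, float], q: dict[str, float]) -> float:
    """Half the sum of absolute differences; 0 = identical, 1 = disjoint."""
    categories = set(p) | set(q)
    return 0.5 * sum(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in categories)


def drift_alert(baseline: list[str], current: list[str], limit: float = 0.15) -> bool:
    """Return True when the category mix has shifted by more than `limit`."""
    distance = total_variation_distance(
        category_distribution(baseline), category_distribution(current)
    )
    return distance > limit
```

A category that appears only in the current window contributes its full share to the distance, so a sudden new issue type also shows up in this check.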

## Templates

### Category Taxonomy Configuration

```
TICKET CATEGORY TAXONOMY
=========================
Version: [2.0] | Last Updated: [Date]

PRIMARY CATEGORY          | SUB-CATEGORIES              | ROUTE TO      | CONFIDENCE THRESHOLD
--------------------------|-----------------------------|---------------|-------------------
Technical Issue           | Bug/Error                   | Engineering   | 85%
                          | Performance                 | Engineering   | 85%
                          | Integration/API             | DevOps        | 80%
                          | Data Sync/Import            | Data Team     | 80%
Account & Billing         | Payment Failed              | Billing       | 90%
                          | Subscription Change         | Billing       | 90%
                          | Refund Request              | Billing       | 95%
                          | Invoice Question            | Finance       | 85%
Product Question          | How-To / Usage              | Support L1    | 80%
                          | Feature Inquiry             | Product       | 75%
                          | Documentation Request       | Content Team  | 80%
Feature Request           | Enhancement                 | Product       | 80%
                          | New Capability              | Product       | 80%
Security & Privacy        | Data Breach Concern         | Security      | 95%
                          | Access/Permission           | IAM Team      | 85%
                          | GDPR/Privacy Request        | Legal/Privacy | 90%
General Inquiry           | Company Info                | Support L1    | 70%
                          | Careers                     | HR            | 90%
                          | Press/Media                 | PR            | 95%

ROUTING RULES:
- Any Security category → immediate escalation to Security Lead
- VIP customer + any category → senior agent queue
- Confidence < threshold → manual review queue
- Attachment detected (screenshot, log) → auto-tag "requires investigation"
```
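
The routing rules in the template can be applied as an ordered chain of checks. The template does not define a precedence between the rules, so the order used here (security first, then low confidence, then VIP) is an assumption, as are the `ClassifiedTicket` fields and the queue names "Manual Review" and "Senior Agent".

```python
from dataclasses import dataclass, field


@dataclass
class ClassifiedTicket:
    primary: str                 # primary category from the taxonomy
    queue: str                   # default queue from the ROUTE TO column
    confidence: float            # model confidence in [0, 1]
    threshold: float             # threshold for the predicted sub-category
    is_vip: bool = False
    has_attachment: bool = False
    tags: list[str] = field(default_factory=list)


def apply_routing_rules(ticket: ClassifiedTicket) -> str:
    """Tag the ticket if needed and return the queue it should go to."""
    # Tagging is independent of where the ticket is routed.
    if ticket.has_attachment and "requires investigation" not in ticket.tags:
        ticket.tags.append("requires investigation")
    # Assumed precedence: security escalation wins over every other rule.
    if ticket.primary == "Security & Privacy":
        return "Security Lead"
    if ticket.confidence < ticket.threshold:
        return "Manual Review"
    if ticket.is_vip:
        return "Senior Agent"
    return ticket.queue
```

Keeping the checks in a single function makes the precedence explicit, which matters when a ticket matches more than one rule (for example a VIP customer with a low-confidence prediction).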

### Classification Performance Dashboard

```
CATEGORIZATION PERFORMANCE — Weekly Report
==========================================
Reporting Period: [Date Range]

OVERALL METRICS:
  Total tickets processed: 4,823
  Auto-categorized: 4,412 (91.5%)
  Manual review required: 411 (8.5%)
  Average inference time: 1.2 seconds
  Accuracy (validated sample): 92.3%

ACCURACY BY CATEGORY:
  Technical Issue:     94.3%  [████████████████░░] 312/331
  Account & Billing:   96.0%  [██████████████████] 289/301
  Product Question:    89.6%  [███████████████░░░] 267/298
  Feature Request:     91.2%  [████████████████░░] 198/217
  Security & Privacy:  97.8%  [██████████████████]  45/46
  General Inquiry:     87.2%  [███████████████░░░] 156/179

TOP MISCATEGORIZATIONS:
  1. "API rate limiting" → Product Question (should be Technical/Integration) — 23 cases
  2. "Cancel subscription" → Account Question (should be Billing/Refund) — 18 cases
  3. "Two-factor setup" → Technical Issue (should be Security/Access) — 14 cases

ACTION ITEMS:
  [ ] Add API-related keywords to Technical Issue training set
  [ ] Review confidence threshold for Product Question (currently 80%); raising it sends more borderline tickets to manual review
  [ ] Retrain model with 50 new labeled examples for miscategorized types
```
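
The per-category figures in a report like this can be derived directly from the correct/validated counts, which also makes it easy to flag categories that miss the >90% target. A minimal sketch follows; `accuracy_report` and `below_target` are hypothetical helper names.

```python
TARGET_ACCURACY = 0.90


def accuracy_report(counts: dict[str, tuple[int, int]]) -> dict[str, float]:
    """Map category -> accuracy, given (correct, validated) counts per category."""
    return {
        category: correct / total
        for category, (correct, total) in counts.items()
        if total > 0  # skip categories with no validated tickets
    }


def below_target(
    counts: dict[str, tuple[int, int]], target: float = TARGET_ACCURACY
) -> list[str]:
    """Categories whose accuracy does not exceed the target, worst first."""
    report = accuracy_report(counts)
    flagged = [category for category, acc in report.items() if acc <= target]
    return sorted(flagged, key=report.get)
```

Fed with the counts from the sample report above, `below_target` flags General Inquiry and Product Question, the two categories under the 90% line.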

## Integration Points

- **Ticketing systems**: Zendesk, Freshdesk, Intercom, Jira Service Management (formerly Jira Service Desk), Help Scout, Front
- **ML platforms**: AWS SageMaker, Google Vertex AI, Azure ML, Hugging Face
- **NLP frameworks**: spaCy, NLTK, transformers (BERT, RoBERTa)
- **Analytics**: Tableau, Power BI, Looker (categorization dashboards)
- **Monitoring**: Datadog, New Relic (model drift, inference latency)
- **CRM**: Salesforce, HubSpot (customer tier, history enrichment)
- **Logging**: ELK Stack, Splunk (classification audit trail)

## Edge Cases

| Scenario | Handling |
|----------|----------|
| Ticket spans multiple categories | Multi-label classification; route to primary, tag with secondary |
| New issue type not in taxonomy | Route to "unclassified" queue; flag taxonomy owner for review |
| Low confidence on all categories | Default to manual review queue; log for model improvement |
| Multilingual tickets | Language detection → translate → classify → route to appropriate language queue |
| Spam/phishing tickets | Run a separate spam filter before classification; auto-archive without routing |
| Bulk ticket submission (100+) | Batch processing mode with rate limiting; stagger inference requests |
| Attachment-only tickets (no text) | OCR on screenshots; file-type based routing; flag for manual review |
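
Several of these cases are decided before the classifier runs, so they can be handled in a small triage step. The sketch below covers the spam and attachment-only rows only; `triage`, its return values, and the treatment of a completely empty ticket as manual review are assumptions made for illustration, and the spam verdict is assumed to come from a separate upstream filter.

```python
def triage(text: str, attachments: list[str], is_spam: bool) -> str:
    """Decide what to do with a ticket before it reaches the classifier."""
    if is_spam:
        # Spam is archived and never reaches routing.
        return "archive"
    if not text.strip():
        # Nothing for the text classifier to read: try OCR on attachments,
        # otherwise hand the ticket to a human (assumed behavior).
        return "ocr_then_review" if attachments else "manual_review"
    return "classify"
```

For example, `triage("", ["screenshot.png"], is_spam=False)` returns `"ocr_then_review"`, while a ticket with body text and no spam flag proceeds to `"classify"`.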

## Output

### Live Categorization Result

```
TICKET #48291 — CATEGORIZATION RESULT
=====================================
Subject: "Can't export report to PDF, getting 500 error"
Customer: Enterprise tier | Channel: Web form | Language: English

CLASSIFICATION:
  Primary category: Technical Issue [confidence: 94.7%]
  Sub-category: Bug/Error (product area: export function)
  Custom tags: [requires-investigation, export-module, enterprise-customer]
  Severity indicator: Medium (functional impairment, data intact)

ROUTING:
  Assigned queue: Engineering — Backend Team
  SLA: Response within 4 hours (Enterprise tier)
  Escalation trigger: If unresolved > 24 hours → Engineering Manager

PROCESSING TIME: 0.8 seconds
```
