---
name: duplicate-ticket-detection
description: "Identify and merge duplicate support tickets to prevent redundant work and maintain a single source of truth. Use when configuring semantic similarity detection, setting up auto-merge rules, implementing duplicate prevention, or building conversation threading across tickets. Triggers on phrases like 'duplicate tickets', 'merge tickets', 'conversational threading', 'semantic matching', 'duplicate prevention', 'ticket consolidation', 'similar issue detection', 'conversation merge'."
---

# Duplicate Ticket Detection & Merging

Automatically identify duplicate or related tickets using semantic analysis, customer history, and issue pattern matching to reduce redundant work and ensure consistent resolution.

## Workflow

### Phase 1: Duplicate Detection Configuration

1. **Define similarity dimensions**:
   - Customer identity (same email, account ID, phone number)
   - Issue semantics (subject line similarity, description overlap using NLP embeddings)
   - Product/feature area (same module, same error code)
   - Time window (configurable: 24h, 7d, 30d)
   - Conversation context (thread continuation vs. new issue)
2. **Set confidence thresholds**:
   - Auto-merge: confidence ≥ 90% (same customer, >85% semantic overlap, within 7 days)
   - Agent review: confidence 70-89% (flag for agent confirmation)
   - Link only: confidence 50-69% (create parent-child relationship, no merge)
   - No action: confidence < 50%
3. **Configure merge behavior**:
   - Preserve all conversation history from both tickets
   - Retain highest priority of the two
   - Merge tags and categories (union of all tags)
   - Notify customer: "We've linked this to your existing ticket"
   - Update SLA timer (use earliest creation time)
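
The threshold bands can be expressed as a small lookup. The sketch below is illustrative only: `Action` and `decide_action` are hypothetical names, and it assumes each band's lower bound is inclusive (a score of exactly 90 auto-merges), following the "90%+" notation in the configuration template.

```python
from enum import Enum


class Action(Enum):
    AUTO_MERGE = "auto_merge"
    AGENT_REVIEW = "agent_review"
    LINK_ONLY = "link_only"
    NO_ACTION = "no_action"


def decide_action(confidence: float) -> Action:
    """Map a confidence score in [0, 100] to one of the four action bands."""
    if not 0.0 <= confidence <= 100.0:
        raise ValueError(f"confidence must be in [0, 100], got {confidence}")
    if confidence >= 90.0:   # assumption: lower bounds are inclusive
        return Action.AUTO_MERGE
    if confidence >= 70.0:
        return Action.AGENT_REVIEW
    if confidence >= 50.0:
        return Action.LINK_ONLY
    return Action.NO_ACTION
```

A real deployment would likely check the auto-merge exclusions before acting on `AUTO_MERGE` and downgrade excluded tickets to a link.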

### Phase 2: Real-Time Detection Pipeline

1. **Ingestion trigger**:
   - On every new ticket creation, run duplicate check before routing
   - On every reply to an existing ticket, check whether the reply matches another open ticket
2. **Multi-stage matching**:
   - Stage 1: Quick filter — same customer + overlapping time window (milliseconds)
   - Stage 2: Keyword matching — error codes, product names, feature references (seconds)
   - Stage 3: Semantic similarity — NLP embeddings for subject + body comparison
   - Stage 4: Context analysis — conversation thread detection, resolution status check
3. **Action execution**:
   - If duplicate detected: execute merge or link per threshold rules
   - Notify the assigned agents on both tickets
   - Update the customer on the originating ticket with merge details
   - Log merge event for audit trail and analytics
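
The staged matching above can be sketched with the standard library alone. This is a simplified illustration, not the engine's implementation: the `Ticket` fields, the error-code pattern, and the `min_similarity` cutoff are all assumptions, and a token-count cosine stands in for real NLP embeddings. Stage 4 (context analysis) is omitted.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical error-code format such as "EXP-4042".
ERROR_CODE = re.compile(r"\b[A-Z]{2,5}-\d{3,5}\b")


@dataclass
class Ticket:
    ticket_id: int
    customer_id: str
    created_at: datetime
    text: str  # subject and body combined


def same_customer_in_window(new: Ticket, old: Ticket, window: timedelta) -> bool:
    """Stage 1: cheap filter on customer identity and time window."""
    return (new.customer_id == old.customer_id
            and abs(new.created_at - old.created_at) <= window)


def shared_error_codes(new: Ticket, old: Ticket) -> set:
    """Stage 2: error codes that appear in both tickets."""
    return set(ERROR_CODE.findall(new.text)) & set(ERROR_CODE.findall(old.text))


def token_cosine(a: str, b: str) -> float:
    """Stage 3 stand-in: cosine similarity over token counts, not embeddings."""
    va = Counter(re.findall(r"[a-z0-9]+", a.lower()))
    vb = Counter(re.findall(r"[a-z0-9]+", b.lower()))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


def find_candidates(new, open_tickets, window=timedelta(days=7), min_similarity=0.3):
    """Return (ticket, similarity) pairs that pass the filters, best match first."""
    results = []
    for old in open_tickets:
        if not same_customer_in_window(new, old, window):
            continue
        similarity = token_cosine(new.text, old.text)
        # Keep the pair if it shares an error code or is textually similar enough.
        if shared_error_codes(new, old) or similarity >= min_similarity:
            results.append((old, similarity))
    return sorted(results, key=lambda pair: pair[1], reverse=True)
```

Running the cheap customer and time filter first keeps the more expensive text comparison off most ticket pairs, which is the point of ordering the stages by cost.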

### Phase 3: Post-Merge Management & Analytics

1. **Conversation consolidation**:
   - Merge conversation threads chronologically
   - Deduplicate system messages and agent notes
   - Preserve all attachments from both tickets
   - Update ticket status based on merged content
2. **Quality monitoring**:
   - Track false positive rate (incorrect merges)
   - Track false negative rate (missed duplicates)
   - Collect agent feedback on merge suggestions
3. **Pattern detection**:
   - Identify systemic duplicate sources (bug notifications, broadcast issues)
   - Auto-create master tickets for widespread issues
   - Communicate proactively with customers about known issues
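
Conversation consolidation might look like the following sketch. The `Message` shape is hypothetical, and it assumes automated notifications carry the author `"system"` and count as duplicates when their bodies match exactly. Agent-note deduplication would need its own rule and is not shown.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Message:
    sent_at: datetime
    author: str  # assumption: "system" marks automated notifications
    body: str


def consolidate(parent: list, duplicate: list) -> list:
    """Merge two threads chronologically, dropping repeated system messages."""
    merged = []
    seen_system_bodies = set()
    # sorted() is stable, so messages with equal timestamps keep parent-first order.
    for message in sorted(parent + duplicate, key=lambda m: m.sent_at):
        if message.author == "system":
            if message.body in seen_system_bodies:
                continue  # an identical notification is already in the thread
            seen_system_bodies.add(message.body)
        merged.append(message)
    return merged
```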

## Templates

### Duplicate Detection Configuration

```
DUPLICATE DETECTION ENGINE — Configuration
===========================================
Version: [2.3] | Last Updated: [Date]

SIMILARITY DIMENSIONS:
┌──────────────────────┬──────────┬──────────────────────────────────┐
│ Dimension            │ Weight   │ Matching Logic                   │
├──────────────────────┼──────────┼──────────────────────────────────┤
│ Customer Identity    │ 30%      │ Same email/account/phone         │
│ Subject Similarity   │ 25%      │ NLP embedding cosine similarity  │
│ Description Overlap  │ 20%      │ Semantic analysis of body text   │
│ Product/Feature      │ 15%      │ Same product area + error code   │
│ Time Proximity       │ 10%      │ Within 7 days (decays linearly)  │
└──────────────────────┴──────────┴──────────────────────────────────┘

CONFIDENCE THRESHOLDS:
  90%+ → Auto-merge (no human review)
  70-89% → Agent review required (suggestion + confirm/deny)
  50-69% → Link as related (parent-child, no merge)
  <50%  → No action

TIME WINDOWS:
  P0/P1 tickets: 24-hour window (urgent reports may be distinct incidents)
  P2 tickets: 7-day window
  P3/P4 tickets: 30-day window
  VIP customers: 60-day window (higher tolerance for follow-ups)

MERGE PRESERVATION RULES:
  ✓ All conversation messages (chronological order)
  ✓ All attachments and files
  ✓ Highest priority of the two tickets
  ✓ Union of all tags and categories
  ✓ Earliest creation time (SLA protection)
  ✓ All internal notes from both agents
  ✗ Duplicate system notifications (deduplicated)

AUTO-MERGE EXCLUSIONS:
  - Security tickets (never auto-merge, always link)
  - Billing tickets with financial transactions
  - Tickets with different resolution states
  - Tickets assigned to different teams
```
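
The template lists weights but does not say how the per-dimension scores are combined. One plausible reading is a plain weighted sum, sketched below. The function names and the `[0, 1]` score scale are assumptions, and the engine's actual formula may differ.

```python
# Weights copied from the SIMILARITY DIMENSIONS table; the keys are hypothetical names.
WEIGHTS = {
    "customer_identity": 0.30,
    "subject_similarity": 0.25,
    "description_overlap": 0.20,
    "product_feature": 0.15,
    "time_proximity": 0.10,
}


def time_proximity(hours_apart: float, window_hours: float = 7 * 24) -> float:
    """Linear decay from 1.0 at zero hours to 0.0 at the edge of the window."""
    if window_hours <= 0:
        raise ValueError("window_hours must be positive")
    return max(0.0, 1.0 - hours_apart / window_hours)


def confidence(signals: dict) -> float:
    """Weighted sum of per-dimension scores in [0, 1], returned as a percentage."""
    missing = WEIGHTS.keys() - signals.keys()
    if missing:
        raise KeyError(f"missing signals: {sorted(missing)}")
    return 100.0 * sum(WEIGHTS[name] * signals[name] for name in WEIGHTS)
```

Because the weights sum to 1.0, a pair that matches perfectly on every dimension scores 100, and the result can be compared directly against the confidence thresholds.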

### Duplicate Merge Notification

```
DUPLICATE DETECTED — Action Required
=====================================
Detection Time: 2025-01-15 10:15 UTC
Confidence Score: 92.3% → AUTO-MERGE

TICKET A (Parent):
  #48290 | Created: 2025-01-14 14:30 UTC
  Subject: "Cannot export report — error EXP-4042"
  Status: Open | Priority: P2 | Assigned: Mike Chen
  Conversation: 4 messages

TICKET B (Duplicate):
  #48325 | Created: 2025-01-15 09:45 UTC
  Subject: "Export still broken, getting EXP-4042 again"
  Status: New | Priority: P1 | Assigned: Unassigned
  Conversation: 1 message

SIMILARITY ANALYSIS:
  Same customer: ✓ (Acme Corp, account AC-10847)
  Subject similarity: 87% (both reference export + EXP-4042)
  Description overlap: 91% (same error, same feature)
  Time proximity: 19 hours (within 7-day window)
  Product/feature: ✓ (Analytics module, Report Export)

ACTION: Auto-merging #48325 into #48290
  → All messages consolidated into #48290
  → Priority escalated to P1 (higher of the two)
  → Customer notified: "We've linked your recent message to an existing ticket"
  → Agent Mike Chen notified of merged conversation
  → SLA timer continues from original creation (#48290, 2025-01-14 14:30)

AUDIT LOG:
  Merge ID: MRG-88421
  Performed by: Duplicate Detection Engine v2.3
  Undo available: Yes (within 24 hours)
```
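
The field-level effect of this merge can be sketched as follows. The `Ticket` shape and `merge_into` are hypothetical, and priorities are assumed to be strings of the form `P<number>` where a lower number is more urgent.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Ticket:
    ticket_id: int
    created_at: datetime
    priority: str  # "P0" (most urgent) through "P4"
    tags: set = field(default_factory=set)
    messages: list = field(default_factory=list)  # (sent_at, body) tuples


def merge_into(parent: Ticket, duplicate: Ticket) -> Ticket:
    """Return a new ticket that keeps the parent's ID and applies the preservation rules."""
    return Ticket(
        ticket_id=parent.ticket_id,
        # Earliest creation time, so the SLA timer is not reset by the merge.
        created_at=min(parent.created_at, duplicate.created_at),
        # Highest priority means the lowest P-number.
        priority=min(parent.priority, duplicate.priority, key=lambda p: int(p[1:])),
        tags=parent.tags | duplicate.tags,
        messages=sorted(parent.messages + duplicate.messages, key=lambda m: m[0]),
    )
```

Returning a new object instead of mutating the parent leaves both originals intact, which makes an undo within the 24-hour window easier to support.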

## Integration Points

- **Ticketing systems**: Zendesk, Freshdesk, Intercom, Jira Service Management (formerly Jira Service Desk), Front
- **NLP/ML platforms**: OpenAI embeddings, Sentence-BERT, spaCy, AWS SageMaker
- **Search**: Elasticsearch, Algolia (similar ticket search)
- **CRM**: Salesforce, HubSpot (customer identity resolution)
- **Analytics**: Tableau, Power BI (duplicate rate dashboards)
- **Notification**: Slack, email, in-app (agent and customer alerts)
- **Master ticket systems**: Jira, ServiceNow (bulk incident management)
- **Logging**: ELK Stack, Splunk (audit trail, merge history)

## Edge Cases

| Scenario | Handling |
|----------|----------|
| Two tickets from same customer about different issues | Semantic analysis distinguishes; low similarity → no merge |
| Legitimate follow-up to resolved ticket | Check resolution status; if resolved more than 7 days ago, treat as new issue |
| Customer reports issue, then reports again after partial fix | Link as related; preserve both tickets; tag "follow-up" |
| Bulk incident (100+ duplicates from system outage) | Auto-create master incident ticket; merge/link all related; broadcast update |
| Merge creates excessively long conversation (>50 messages) | Split into phase sections; preserve full history but paginate |
| Agent actively working both tickets separately | Notify both agents; suggest consolidation; allow opt-out within 1 hour |
| Cross-channel duplicate (email + chat about same issue) | Detect via customer identity + semantic match; merge regardless of channel |
| False positive merge detected by agent | Allow agent undo within 24 hours; log as feedback for model tuning |
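
Two of the time-based rules in the table (the 7-day follow-up cutoff and the 24-hour undo window) reduce to simple comparisons. The function names below are hypothetical, and treating the undo boundary as inclusive is an assumption.

```python
from datetime import datetime, timedelta

REOPEN_WINDOW = timedelta(days=7)   # follow-ups after this are treated as new issues
UNDO_WINDOW = timedelta(hours=24)   # agents may undo a merge within this period


def is_new_issue(resolved_at: datetime, reply_at: datetime) -> bool:
    """True when a reply arrives more than 7 days after the ticket was resolved."""
    return reply_at - resolved_at > REOPEN_WINDOW


def can_undo(merged_at: datetime, now: datetime) -> bool:
    """True while the merge is still inside the undo window (boundary inclusive)."""
    return timedelta(0) <= now - merged_at <= UNDO_WINDOW
```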

## Output

### Duplicate Detection Dashboard

```
DUPLICATE DETECTION METRICS — Daily Summary
============================================
Date: 2025-01-15 | Tickets Processed: 892

DETECTION RESULTS:
  Duplicates detected: 67 (7.5% of total)
  Auto-merged: 52 (77.6% of detected)
  Agent-reviewed: 12 (17.9% of detected)
    → Confirmed: 10 (83.3%)
    → Rejected: 2 (16.7%)
  Linked as related: 3 (4.5% of detected)
  No action: 5 candidates (<50% confidence, not counted as detected)

IMPACT:
  Tickets eliminated through auto-merging: 52
  Estimated agent time saved: 4.3 hours (avg 5 min/ticket)
  Reduction in duplicate responses: 38 customer replies avoided
  SLA protection preserved: 100% (earliest creation time used)

FALSE POSITIVE TRACKING:
  Incorrect auto-merges: 1 (1.9% of auto-merges)
  Agent-rejected suggestions: 2 (16.7% of agent reviews)
  Undo requests: 0

TOP DUPLICATE PATTERNS:
  1. EXP-4042 export error: 23 tickets merged into 3 master tickets
  2. Login timeout on mobile: 15 tickets merged into 2 master tickets
  3. Invoice discrepancy questions: 12 tickets linked (billing — no merge)
  4. API rate limiting: 9 tickets merged into 1 master ticket
  5. Password reset not received: 8 tickets merged into 1 master ticket

EFFICIENCY GAIN:
  Estimated monthly cost savings: $2,840 (agent time reduction)
  Customer wait time reduction: 2.1 minutes average
```
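
The percentages in the DETECTION RESULTS and FALSE POSITIVE TRACKING sections follow from the raw counts. A minimal sketch of that arithmetic, with hypothetical function and key names:

```python
def pct(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place; 0.0 when the denominator is zero."""
    return round(100.0 * part / whole, 1) if whole else 0.0


def summarize(processed, auto_merged, confirmed, rejected, linked, incorrect_auto_merges):
    """Derive the dashboard ratios from raw daily counts."""
    reviewed = confirmed + rejected
    detected = auto_merged + reviewed + linked
    return {
        "detected": detected,
        "detected_pct": pct(detected, processed),
        "auto_merged_pct": pct(auto_merged, detected),
        "reviewed_pct": pct(reviewed, detected),
        "confirmed_pct": pct(confirmed, reviewed),
        "linked_pct": pct(linked, detected),
        "auto_merge_false_positive_pct": pct(incorrect_auto_merges, auto_merged),
    }
```

Called with the counts from the summary (892 processed, 52 auto-merged, 10 confirmed, 2 rejected, 3 linked, 1 incorrect auto-merge), this yields 67 detected and the same percentages shown in those two sections.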
