---
name: data-loss-prevention
description: Prevent sensitive data exfiltration through DLP policies, data classification, monitoring of data flows, and automated enforcement. Use when discovering and classifying sensitive data, defining DLP policies, monitoring email/file/cloud data transfers, enforcing encryption requirements, investigating data leaks, or managing data lifecycle. Triggers on phrases like "DLP policy", "data classification", "prevent data leak", "sensitive data discovery", "data exfiltration", "data loss prevention", "data governance", "data lifecycle".
---

# Data Loss Prevention (DLP) & Data Governance

Protect sensitive data from unauthorized access, transfer, and exposure by combining data classification, policy enforcement, data flow monitoring, and encryption.

## Workflow

### 1. Data Discovery & Classification

1. **Comprehensive data inventory**:
   - Scan all data repositories: file servers, databases, cloud storage, endpoints, email
   - Identify sensitive data types: PII, PHI, PCI, intellectual property (IP), financial data, credentials
   - Use regex patterns, ML-based content analysis, and fingerprinting
   - Generate data inventory with location, volume, sensitivity level, and owner

2. **Classification scheme implementation**:
   - Define classification levels: Public, Internal, Confidential, Restricted
   - Map data types to classification levels
   - Auto-tag classified data with visible labels and metadata
   - Establish data owner for each classification category

3. **Continuous discovery monitoring**:
   - Scheduled re-scans (weekly for critical systems, monthly for all other repositories)
   - Alert on newly discovered sensitive data in unauthorized locations
   - Track data movement between classification zones
   - Maintain data flow maps showing sensitive data pathways
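
The regex-based part of discovery can be sketched as follows. This is a minimal illustration, not a production scanner: the patterns, the `LEVELS` mapping, and the `Finding` type are hypothetical, and real deployments would combine this with ML-based analysis and fingerprinting as listed above. A Luhn checksum is used to cut false positives on card-like digit runs.

```python
import re
from dataclasses import dataclass

# Hypothetical patterns; real rules need to be broader and locale-aware.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "card": re.compile(r"\b\d(?:[ -]?\d){12,18}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}

# Assumed mapping from data type to the classification levels defined above.
LEVELS = {"ssn": "Restricted", "card": "Restricted", "email": "Confidential"}


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    if not 13 <= len(digits) <= 19:
        return False
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


@dataclass(frozen=True)
class Finding:
    data_type: str
    level: str
    offset: int  # character offset of the match in the scanned text


def scan_text(text: str) -> list[Finding]:
    """Scan one text blob and return findings tagged with a classification level."""
    findings = []
    for data_type, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            # Discard card-like digit runs that fail the checksum.
            if data_type == "card" and not luhn_valid(match.group()):
                continue
            findings.append(Finding(data_type, LEVELS[data_type], match.start()))
    return findings
```

Each `Finding` could then feed the data inventory (location, sensitivity level) once it is joined with repository and owner metadata.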

### 2. DLP Policy Definition & Enforcement

1. **Policy framework design**:
   - Define policies by channel: email, web, endpoint, cloud, mobile, print
   - Define policies by data type: PII, PCI, PHI, IP, credentials
   - Actions per violation: Block, Quarantine, Encrypt, Allow with Warning, Log Only
   - Exception management workflow for authorized data transfers

2. **Email DLP policies**:
   - Block outbound emails containing unencrypted sensitive data
   - Scan attachments and body content for data patterns
   - Encrypt emails automatically when sensitive content is detected
   - Alert sender with policy education message
   - Quarantine flagged emails for security team review

3. **Endpoint DLP controls**:
   - Monitor and restrict USB/external device usage
   - Prevent copy/paste of classified data to unauthorized applications
   - Screen print jobs for sensitive content
   - Restrict or encrypt clipboard redirection in remote desktop sessions
   - Block unauthorized file sharing applications

4. **Cloud DLP enforcement**:
   - Monitor uploads to cloud storage (Dropbox, Google Drive, OneDrive)
   - Block uploads of sensitive data to personal cloud accounts
   - Scan cloud collaboration tools (Slack, Teams) for data leaks
   - Enforce encryption for cloud-stored sensitive data
   - Monitor SaaS application data exports
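
When a single transfer matches several policies, the enforcement point needs one outcome. A minimal sketch, assuming the strictest matching action wins: the severity ordering in `Action`, the `RULES` table, and the `LOG_ONLY` default are assumptions for illustration, with rule entries taken from the DLP Policy Matrix template in this document.

```python
from enum import IntEnum


class Action(IntEnum):
    # Ordered from least to most restrictive; this ordering is an assumption.
    LOG_ONLY = 0
    ALLOW_WITH_WARNING = 1
    ENCRYPT = 2
    QUARANTINE = 3
    BLOCK = 4


# Hypothetical rule table keyed by (channel, data type).
RULES = {
    ("email", "PII"): Action.ENCRYPT,
    ("email", "PCI"): Action.BLOCK,
    ("web", "Credentials"): Action.BLOCK,
    ("endpoint", "Trade Secrets"): Action.QUARANTINE,
    ("cloud", "PHI"): Action.BLOCK,
}


def resolve_action(channel: str, data_types: set[str]) -> Action:
    """Return the most restrictive action among all rules matching the transfer."""
    matched = [RULES[(channel, t)] for t in data_types if (channel, t) in RULES]
    # Unmatched transfers are only logged here; a stricter default is also defensible.
    return max(matched, default=Action.LOG_ONLY)
```

For example, `resolve_action("email", {"PII", "PCI"})` yields `Action.BLOCK`, because the PCI rule is stricter than the PII rule.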

### 3. Data Flow Monitoring & Anomaly Detection

1. **Real-time data flow analysis**:
   - Monitor outbound data transfers across all channels
   - Track data volume patterns per user and department
   - Alert on unusual transfer patterns (large volumes, off-hours, new destinations)
   - User behavior analytics for baseline deviation detection

2. **Automated alerting and response**:
   - Tiered alerting: Info (log only), Warning (notify user), Critical (block + alert security)
   - Auto-quarantine suspicious transfers for investigation
   - User notification with policy explanation and guidance
   - Escalation workflow for repeated violations

3. **Investigation and forensics**:
   - Full audit trail of all DLP events and actions taken
   - Forensic analysis tools for suspected data breaches
   - Data lineage tracking: origin, transformation, destination
   - Incident response integration for confirmed data breaches
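
Baseline deviation detection on transfer volume can be illustrated with a simple z-score against a user's recent history. This is a sketch under stated assumptions: the thresholds `warn_z` and `crit_z`, the `min_sigma_mb` floor, and the function name are hypothetical, and commercial UBA tooling typically uses richer features such as time of day and destination.

```python
from statistics import mean, stdev


def classify_transfer(
    history_mb: list[float],
    today_mb: float,
    warn_z: float = 2.0,
    crit_z: float = 3.0,
    min_sigma_mb: float = 1.0,
) -> str:
    """Map today's outbound volume to the Info/Warning/Critical alert tiers."""
    if len(history_mb) < 2:
        return "info"  # not enough history to form a baseline
    baseline = mean(history_mb)
    # Floor the spread so a perfectly flat history does not divide by zero.
    sigma = max(stdev(history_mb), min_sigma_mb)
    z = (today_mb - baseline) / sigma
    if z >= crit_z:
        return "critical"  # block + alert security
    if z >= warn_z:
        return "warning"  # notify user
    return "info"  # log only
```

A user who normally sends about 20 MB per day and suddenly sends 500 MB lands in the critical tier, while ordinary day-to-day variation stays at info.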

### 4. Data Encryption & Protection

1. **Encryption enforcement**:
   - Automatic encryption of sensitive data at rest
   - TLS enforcement for data in transit
   - Client-side encryption for endpoint data
   - Encrypted file sharing with recipient authentication
   - Tokenization of highly sensitive data (credit cards, SSNs)

2. **Data masking and redaction**:
   - Dynamic data masking in databases (role-based visibility)
   - Static masking for development and testing environments
   - Auto-redaction of sensitive data in logs and exports
   - Watermarking for document protection

3. **Data lifecycle management**:
   - Automated data retention policies by classification
   - Secure data deletion (cryptographic erasure; overwrite-based sanitization where the media supports it, since overwriting is unreliable on SSDs)
   - Archive sensitive data to secure cold storage
   - Data expiration alerts and automated purging
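
Two of the controls above lend themselves to a short sketch: auto-redaction of sensitive values in log lines, and a retention check by classification. The patterns, function names, and especially the `RETENTION_DAYS` values are placeholders for illustration, not recommended retention periods; actual periods depend on legal and regulatory requirements.

```python
import re
from datetime import date, timedelta

# Hypothetical patterns; each captures the last four digits so they can be kept.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-(\d{4})\b")
CARD_RE = re.compile(r"\b(?:\d[ -]?){9,15}(\d{4})\b")

# Placeholder retention periods in days (None = keep indefinitely).
RETENTION_DAYS = {
    "Public": None,
    "Internal": 3 * 365,
    "Confidential": 7 * 365,
    "Restricted": 7 * 365,
}


def redact(text: str) -> str:
    """Mask SSNs and card numbers, keeping only the last four digits."""
    text = SSN_RE.sub(r"***-**-\1", text)
    text = CARD_RE.sub(lambda m: "**** **** **** " + m.group(1), text)
    return text


def is_expired(level: str, created: date, today: date) -> bool:
    """Return True when a record has outlived the retention period for its level."""
    days = RETENTION_DAYS[level]
    return days is not None and today - created > timedelta(days=days)
```

Records for which `is_expired` returns `True` would be candidates for the expiration alerts and automated purging described above, subject to any legal hold.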

### 5. User Training & Culture

1. **Data handling training**:
   - Role-based data handling procedures
   - Interactive training on data classification
   - Phishing simulation tied to data protection
   - Just-in-time micro-training triggered by DLP violations

2. **Violation management**:
   - First offense: coaching and training
   - Repeated violations: escalating disciplinary action
   - Track violation patterns by user and department
   - Anonymous reporting mechanism for policy concerns

## Templates & Frameworks

### Data Classification Policy

```
DATA CLASSIFICATION FRAMEWORK
==============================

LEVEL 1 — PUBLIC
  Definition: Information approved for public release
  Examples: Marketing materials, press releases, public website content
  Handling: No special controls required
  Distribution: Unlimited

LEVEL 2 — INTERNAL
  Definition: Business information for internal use only
  Examples: Internal policies, org charts, meeting notes
  Handling: Standard access controls, no external sharing without approval
  Distribution: All employees, contractors with NDA

LEVEL 3 — CONFIDENTIAL
  Definition: Sensitive business or personal data
  Examples: Customer PII, employee records, financial data, trade secrets
  Handling: Encryption at rest and in transit, RBAC, audit logging, need-to-know access
  Distribution: Authorized personnel only

LEVEL 4 — RESTRICTED
  Definition: Highly sensitive data with legal/regulatory requirements
  Examples: PHI (health data), PCI (payment cards), cryptographic keys
  Handling: Strong encryption with managed keys, dedicated systems, strict access controls, continuous monitoring
  Distribution: Minimal authorized personnel, documented business need required
```

### DLP Policy Matrix

```
DLP POLICY MATRIX
=================

Channel  | Data Type     | Action        | Exception Process | Notify
---------|---------------|---------------|-------------------|-------------------------------
Email    | PII           | Encrypt       | Manager approval  | Sender + Security
Email    | PCI           | Block         | VP approval       | Sender + Security + Compliance
Web      | Credentials   | Block         | None              | User + Security
Endpoint | Trade Secrets | Quarantine    | CTO approval      | User + Legal
Cloud    | PHI           | Block + Alert | HIPAA officer     | User + Compliance
USB      | Level 3+      | Block         | CISO approval     | User + Security
Print    | Level 3+      | Watermark     | Manager approval  | User
```

## Integration Points

- DLP platforms (Microsoft Purview, Symantec DLP, Forcepoint, Digital Guardian): Policy management, scanning
- SIEM (Splunk, Sentinel, QRadar): Alert correlation, incident response
- CASB (Cloud Access Security Broker): Cloud data monitoring and protection
- Email security gateways (Proofpoint, Mimecast): Email DLP enforcement
- Endpoint protection platforms (CrowdStrike, SentinelOne): Endpoint DLP agents
- Data classification tools (BigID, Varonis): Data discovery and classification
- Encryption platforms (AWS KMS, Azure Key Vault, HashiCorp Vault): Key management
- HRIS: Employee role data for policy enforcement and termination triggers

## Edge Cases

- **Business requires data transfer that violates DLP**: Exception request workflow with documented business justification, security risk assessment, and time-bound approval
- **False positive overload**: Tune regex patterns and ML models; implement confidence thresholds; create safe lists for known-good transfers
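- **Encrypted data within encrypted data** (for example, a password-protected archive sent over TLS): TLS inspection for email/web exposes only the outer layer, so scan at the endpoint before encryption and consider quarantining content that cannot be inspected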
- **Cross-border data transfer restrictions**: Geo-fencing policies aligned with applicable privacy regulations such as GDPR; automatic routing through approved data transfer mechanisms
- **BYOD environment**: MDM integration for device-level DLP; containerization of corporate data on personal devices

## Output

### DLP Dashboard

```
DLP MONITORING — Real-Time
===========================

DATA INVENTORY:
  Total repositories scanned: 47
  Sensitive data repositories: 23
  Unclassified data remaining: 12 (26% — target: <10%)
  Last full scan: 2025-04-15

VIOLATION STATISTICS (Last 30 Days):
  Total events: 1,247
  Blocked: 89 (7.1%)
  Quarantined: 43 (3.4%)
  Encrypted (auto): 312 (25%)
  Warnings issued: 803 (64.4%)

TOP VIOLATION CATEGORIES:
  1. Email with unencrypted PII: 312 events
  2. USB transfer attempts: 187 events
  3. Cloud upload of confidential docs: 134 events
  4. Print jobs with sensitive data: 98 events

RISK ALERTS:
  🔴 3 users with >10 violations in 30 days — escalated to managers
  ⚠ 1 large outbound data transfer flagged for investigation (500MB to unknown destination)
  ✓ Auto-encryption rate: 94% (target: >90%)
```
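
The percentages in the violation statistics are each action's share of total events. A minimal sketch of that arithmetic, using the sample counts from the dashboard above; the function name `action_shares` is hypothetical.

```python
def action_shares(counts: dict[str, int]) -> dict[str, float]:
    """Return each action's share of total events as a percentage (one decimal)."""
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in counts}
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


# Sample counts copied from the dashboard; they sum to 1,247 events.
sample = {"blocked": 89, "quarantined": 43, "encrypted": 312, "warnings": 803}
shares = action_shares(sample)
```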

## Trigger Phrases

"DLP policy", "data loss prevention", "data classification", "sensitive data discovery", "prevent data leak", "data exfiltration", "data governance", "data lifecycle", "PII protection", "encryption enforcement", "data masking", "compliance scanning", "data inventory", "data retention policy", "DLP monitoring"
