---
name: infrastructure-as-code
description: Design, implement, and maintain Infrastructure as Code (IaC) using Terraform, CloudFormation, Pulumi, or similar tools. Manage state, modules, workspaces, drift detection, and IaC best practices. Use when provisioning infrastructure via code, managing Terraform state, creating reusable modules, or implementing infrastructure CI/CD pipelines. Triggers on phrases like "Infrastructure as Code", "IaC", "Terraform", "CloudFormation", "Pulumi", "state management", "terraform plan", "terraform apply", "drift detection", "module", "workspace", "remote state", "plan", "apply", "destroy", "import", "state lock".
---

# Infrastructure as Code (IaC)

Design, implement, and maintain Infrastructure as Code using Terraform, CloudFormation, Pulumi, and similar tools.

## Workflow

### 1. IaC Architecture & Organization

```
TERRAFORM PROJECT STRUCTURE
═══════════════════════════════════════

infrastructure/
├── modules/
│   ├── network/
│   │   ├── main.tf          (VPC, subnets, route tables)
│   │   ├── variables.tf     (input variables)
│   │   ├── outputs.tf       (output values)
│   │   └── versions.tf      (provider constraints)
│   ├── compute/
│   ├── database/
│   ├── storage/
│   └── security/
│
├── environments/
│   ├── development/
│   │   ├── main.tf          (module calls)
│   │   ├── variables.tf
│   │   ├── terraform.tfvars (env-specific values)
│   │   └── backend.tf       (state config)
│   ├── staging/
│   └── production/
│
├── scripts/
│   ├── validate.sh
│   └── policy-check.sh
│
├── policies/
│   ├── sentinel/            (HashiCorp Sentinel policies)
│   └── opa/                 (Open Policy Agent)
│
└── README.md

STATE MANAGEMENT:
═══════════════════════════════════════

Backend: S3 + DynamoDB lock table (AWS) / GCS with built-in locking (GCP)
═══════════════════════════════════════

  terraform {
    backend "s3" {
      bucket         = "company-terraform-state"
      key            = "production/network/terraform.tfstate"
      region         = "us-east-1"
      dynamodb_table = "terraform-lock"
      encrypt        = true
      acl            = "bucket-owner-full-control"
    }
  }

State isolation:
  → Per-environment state files
  → Per-module state (optional, for large projects)
  → Workspace separation (dev/staging/prod)
```
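
When state is split per module, one stack usually needs values from another. The sketch below shows one way to do that with the `terraform_remote_state` data source, reusing the bucket and key from the backend example above. The file location, the `compute` module inputs (`vpc_id`, `subnet_ids`), and the assumption that the network root configuration re-exports `vpc_id` and `public_subnet_ids` as root-level outputs are all illustrative and do not come from the original layout.

```hcl
# Hypothetical file: environments/production/compute/main.tf

# Read the outputs of the network stack from its own state file.
# Only root-level outputs of that configuration are visible here.
data "terraform_remote_state" "network" {
  backend = "s3"

  config = {
    bucket = "company-terraform-state"
    key    = "production/network/terraform.tfstate"
    region = "us-east-1"
  }
}

module "compute" {
  source = "../../../modules/compute"

  # Assumed input names; adjust to the real module interface.
  vpc_id     = data.terraform_remote_state.network.outputs.vpc_id
  subnet_ids = data.terraform_remote_state.network.outputs.public_subnet_ids
}
```

The reading stack only needs read access to the network state object, which keeps the blast radius of each `terraform apply` limited to its own state file.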

### 2. Module Design

```
MODULE DESIGN BEST PRACTICES
═══════════════════════════════════════

Module: VPC Network
═══════════════════════════════════════

variables.tf:
═══════════════════════════════════════

variable "vpc_cidr" {
  type        = string
  description = "CIDR block for VPC"
  validation {
    condition     = can(cidrhost(var.vpc_cidr, 0))
    error_message = "Must be valid CIDR"
  }
}

variable "environment" {
  type = string
  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "Must be dev, staging, or prod"
  }
}

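variable "az_count" {
  type        = number
  description = "Number of availability zones to spread subnets across"
  default     = 3
}

# Referenced by main.tf (merge(var.common_tags, ...)), so it must be declared
variable "common_tags" {
  type        = map(string)
  description = "Tags applied to every resource in the module"
  default     = {}
}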

main.tf:
═══════════════════════════════════════

resource "aws_vpc" "main" {
  cidr_block           = var.vpc_cidr
  enable_dns_support   = true
  enable_dns_hostnames = true

  tags = merge(var.common_tags, {
    Name        = "${var.environment}-vpc"
    Environment = var.environment
  })
}

# Data source backing the availability_zone lookup below
data "aws_availability_zones" "available" {
  state = "available"
}

resource "aws_subnet" "public" {
  count                   = var.az_count
  vpc_id                  = aws_vpc.main.id
  cidr_block              = cidrsubnet(var.vpc_cidr, 8, count.index)
  availability_zone       = data.aws_availability_zones.available.names[count.index]
  map_public_ip_on_launch = true

  tags = { Name = "${var.environment}-public-${count.index + 1}" }
}

outputs.tf:
═══════════════════════════════════════

output "vpc_id" {
  description = "VPC ID"
  value       = aws_vpc.main.id
}

output "public_subnet_ids" {
  description = "Public subnet IDs"
  value       = aws_subnet.public[*].id
}

MODULE DOCUMENTATION:
═══════════════════════════════════════

  → terradoc / terraform-docs (auto-generate)
  → README.md with:
     · Usage example
     · Inputs (description, type, default, required)
     · Outputs (description, sensitive)
     · Dependencies
     · Assumptions
```
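
A usage example of the kind the module README should contain might look like the sketch below. It calls the VPC module from an environment directory and re-exports its outputs. The CIDR value, the tag values, and the file path are placeholders, and it assumes the module declares the `common_tags` variable that its `main.tf` references.

```hcl
# Hypothetical caller: environments/development/main.tf

module "network" {
  source = "../../modules/network"

  vpc_cidr    = "10.10.0.0/16" # example value; a /16 yields /24 subnets via cidrsubnet(..., 8, n)
  environment = "dev"          # must pass the validation rule: dev, staging, or prod
  az_count    = 2              # overrides the default of 3

  common_tags = {
    ManagedBy = "terraform"
    Project   = "example" # placeholder
  }
}

# Re-export module outputs so other stacks or CI jobs can read them
output "vpc_id" {
  description = "VPC ID"
  value       = module.network.vpc_id
}

output "public_subnet_ids" {
  description = "Public subnet IDs"
  value       = module.network.public_subnet_ids
}
```

Passing an invalid `environment` such as `"qa"` should fail at plan time with the module's own error message, which is the point of putting validation in `variables.tf` rather than in the caller.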

### 3. IaC CI/CD Pipeline

```
IaC CI/CD PIPELINE
═══════════════════════════════════════

GitHub Actions Workflow:
═══════════════════════════════════════

name: Terraform CI/CD
on:
  pull_request:
    paths: ["infrastructure/**"]
  push:
    branches: [main]
    paths: ["infrastructure/**"]

jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - run: terraform fmt -check -recursive  # Format check
      - run: terraform init -backend=false    # Initialize without touching remote state
      - run: terraform validate -no-color     # Syntax and consistency check

  security:
    needs: validate
    runs-on: ubuntu-latest
    steps:
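      - uses: actions/checkout@v4             # Each job starts on a fresh runner
      # Assumes tfsec, checkov, and terrascan are installed on the runner
      - run: tfsec .                          # Security scan
      - run: checkov -d .                     # Compliance check
      - run: terrascan scan -i terraform      # Policy check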

  plan:
    needs: security
    runs-on: ubuntu-latest
    if: github.event_name == 'pull_request'
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - run: terraform init
      - run: terraform plan -out=tfplan
      - run: terraform show -no-color tfplan > plan.txt
      # Post plan.txt to the PR as a comment

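  apply:
    needs: security  # plan is skipped on push, and needing a skipped job would skip apply too
    runs-on: ubuntu-latest
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    environment: production  # Requires approval
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - run: terraform init
      - run: terraform plan -out=tfplan
      - run: terraform apply tfplan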

GUARDRAILS:
═══════════════════════════════════════

  Pre-apply checks:
    → terraform fmt: Code formatting
    → terraform validate: Syntax validity
    → tfsec: Security vulnerabilities
    → checkov: Compliance policies
    → infracost: Cost estimation (budget alert)
    → Sentinel/OPA: Custom policies

  Policies (Sentinel):
    → No public S3 buckets
    → No default security groups
    → No r5.4xlarge instances without approval
    → Tags required on all resources
    → Encryption enabled on all storage
```
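
To make the policy list concrete, here is a rough sketch of Terraform configuration that would be expected to pass three of the guardrails above: required tags, no public S3 buckets, and encryption on storage. The bucket name, resource labels, and tag values are placeholders, and whether a given Sentinel or OPA policy accepts this exact shape depends on how that policy is written.

```hcl
# Sketch only: names and values are placeholders.

provider "aws" {
  region = "us-east-1"

  # "Tags required on all resources": default_tags are added to the
  # taggable resources this provider manages.
  default_tags {
    tags = {
      Environment = "prod"
      ManagedBy   = "terraform"
    }
  }
}

resource "aws_s3_bucket" "logs" {
  bucket = "example-company-logs" # placeholder bucket name
}

# "No public S3 buckets": block every form of public access.
resource "aws_s3_bucket_public_access_block" "logs" {
  bucket = aws_s3_bucket.logs.id

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

# "Encryption enabled on all storage": default server-side encryption.
resource "aws_s3_bucket_server_side_encryption_configuration" "logs" {
  bucket = aws_s3_bucket.logs.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "aws:kms"
    }
  }
}
```

Policies still belong in the pipeline: configuration like this shows intent, while tfsec, checkov, and Sentinel/OPA verify that every resource follows it.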

### 4. State Management & Drift Detection

```
STATE MANAGEMENT
═══════════════════════════════════════

State Operations:
═══════════════════════════════════════

  Import existing resource:
    terraform import aws_instance.web i-0abc123def456

  Move resource (rename):
    terraform state mv aws_instance.old aws_instance.new

  Remove from state (resource keeps running, no longer managed):
    terraform state rm aws_instance.deprecated

  List resources:
    terraform state list

  Show resource details:
    terraform state show aws_instance.web

DRIFT DETECTION:
═══════════════════════════════════════

  Scheduled drift detection (nightly):
    terraform plan -detailed-exitcode

  Exit codes:
    0 = No changes
    1 = Error
    2 = Changes detected (drift)

  Notification on drift:
    → Slack alert to #infra-drift
    → Include plan output (what changed)
    → Auto-ticket creation (Jira)

  Drift remediation:
    → Option 1: terraform apply (overwrite manual changes)
    → Option 2: Update code to match (adopt manual changes)
    → Option 3: Import resources created outside Terraform into state
```
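
The import, move, and remove operations above also have declarative counterparts that go through code review and `terraform plan` instead of editing state by hand. The following is a sketch using the same resource addresses as the CLI examples; it assumes a recent Terraform release (`moved` needs 1.1+, `import` blocks 1.5+, `removed` blocks 1.7+).

```hcl
# Counterpart of: terraform import aws_instance.web i-0abc123def456
# A matching resource "aws_instance" "web" block must exist in the configuration.
import {
  to = aws_instance.web
  id = "i-0abc123def456"
}

# Counterpart of: terraform state mv aws_instance.old aws_instance.new
# The resource block is renamed to "new" in code; this records the rename.
moved {
  from = aws_instance.old
  to   = aws_instance.new
}

# Counterpart of: terraform state rm aws_instance.deprecated
# The resource block is deleted from code; destroy = false leaves the real
# instance running and only drops it from state.
removed {
  from = aws_instance.deprecated

  lifecycle {
    destroy = false
  }
}
```

Each block shows up in the plan output, so reviewers can see the state change before it is applied. Once applied everywhere, the blocks can be deleted.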

### 5. IaC Best Practices

```
IaC BEST PRACTICES
═══════════════════════════════════════

DO:
═══════════════════════════════════════

  → Use modules for reusability
  → Version all providers and modules
  → Store state remotely with locking
  → Encrypt state files
  → Use variables (no hardcoded values)
  → Use workspaces or separate directories per environment
  → Document modules (inputs, outputs, usage)
  → Use terraform_data (or null_resource on older versions) sparingly for triggers
  → Implement guardrails (policies)
  → Review terraform plan before apply
  → Use refresh-only plan to detect drift
  → Commit .terraform.lock.hcl (reproducible provider versions)

DON'T:
═══════════════════════════════════════

  → Hardcode secrets (use environment variables, a secrets manager, or Vault)
  → Use count with list indexes for resources that may be reordered (use for_each)
  → Commit state files to version control
  → Use root module for everything
  → Skip terraform plan (always review first)
  → Share state files between environments
  → Leave provider versions unconstrained (use ~> or exact pins)
  → Ignore dependency warnings
  → Apply without testing in staging first
```
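
Two of these practices are easier to see in code: version pinning and `for_each` instead of `count`. The sketch below is illustrative only. The version numbers are assumed, the `public_subnets` variable is hypothetical, and the subnet resource is meant as a rewrite of the earlier `count`-based one, so it relies on `aws_vpc.main` and `var.environment` from the VPC module.

```hcl
# versions.tf: pin the CLI and providers (version numbers are assumed examples)
terraform {
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # accepts any 5.x release, rejects 6.0 and later
    }
  }
}

# Hypothetical input: subnets keyed by a stable name instead of a list position
variable "public_subnets" {
  description = "Public subnets keyed by name"
  type = map(object({
    cidr_block        = string
    availability_zone = string
  }))
}

# for_each addresses each subnet by key (aws_subnet.public["a"]), so removing
# one entry does not shift the others the way a count index would.
resource "aws_subnet" "public" {
  for_each = var.public_subnets

  vpc_id                  = aws_vpc.main.id
  cidr_block              = each.value.cidr_block
  availability_zone       = each.value.availability_zone
  map_public_ip_on_launch = true

  tags = { Name = "${var.environment}-public-${each.key}" }
}
```

With `count`, deleting the first element of a list forces Terraform to plan changes for every later index. With `for_each`, only the removed key is affected.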

## Edge Cases

- **Large state files**: Split into multiple state files
- **State corruption**: Backup and restore procedures
- **Provider migration**: State migration between providers
- **Multi-cloud**: Cross-cloud resource management
- **Import existing**: Migrating manual infrastructure to IaC
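
For the state corruption case, one common safeguard is object versioning on the state bucket, so an earlier state file can be restored. A minimal sketch follows, assuming the S3 bucket from the backend example is itself managed by a separate bootstrap configuration (an assumption, not something the layout above specifies).

```hcl
# Hypothetical bootstrap configuration for the state bucket itself

resource "aws_s3_bucket" "terraform_state" {
  bucket = "company-terraform-state"

  # Guard against accidental deletion of the bucket through Terraform
  lifecycle {
    prevent_destroy = true
  }
}

# Keep every previous version of each state object for rollback
resource "aws_s3_bucket_versioning" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  versioning_configuration {
    status = "Enabled"
  }
}
```

Restoring then means promoting a previous object version of the affected `terraform.tfstate` key and running `terraform plan` to confirm the state matches reality before any apply.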

## Integration Points

- **IaC tools**: Terraform, CloudFormation, Pulumi
- **CI/CD**: GitHub Actions, GitLab CI, Jenkins
- **Policy**: Sentinel, OPA, Conftest
- **Security**: tfsec, checkov, terrascan
- **Cost**: infracost, cloud-native tools
- **State**: S3, GCS, Terraform Cloud (HCP Terraform)

## Output

### IaC Status

```
INFRASTRUCTURE AS CODE STATUS
═══════════════════════════════════════

Modules: 15 (network, compute, database, etc.)
Environments: 3 (dev, staging, prod)
Drift: 0 resources (all in sync)
Security: 0 critical, 2 warnings (remediating)
State: Remote (S3 + DynamoDB lock)
CI/CD: GitHub Actions (automated plan/apply)
Coverage: 95% of infrastructure managed by IaC
```
