Growth Experiment Design

Design, execute, and analyze growth experiments including A/B tests, multivariate tests, and hypothesis-driven optimization. Use when planning growth experiments, designing A/B tests, creating experiment roadmaps, analyzing test results, or building growth...

Growth Experiment Designer

Design and execute data-driven experiments that systematically improve conversion rates, engagement, and revenue.

Workflow

  1. Formulate hypothesis: define a clear, testable hypothesis using the "If...then...because..." framework.
  2. Prioritize experiments: score each idea by potential impact, confidence level, and ease of implementation (ICE scoring).
  3. Design experiment: define control and variant groups, determine the required sample size, and set success metrics.
  4. Build and QA: implement the test variations; verify tracking and statistical setup before launch.
  5. Run experiment: run for the planned duration and sample size; monitor for tracking errors and bias; do not stop early just because an interim result looks significant.
  6. Analyze results: test the primary metric for statistical significance; segment the data; record learnings.
  7. Implement winner: roll out the winning variation; document learnings; plan follow-up tests.
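
Step 6 can be sketched in code. The source does not prescribe a specific statistical test; the example below assumes a two-sided two-proportion z-test on conversion counts, which is one common choice for a 50/50 A/B test. The function name two_proportion_z_test and the visitor and conversion counts are hypothetical, chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates.

    conv_a / n_a: conversions and visitors in the control group.
    conv_b / n_b: conversions and visitors in the variant group.
    Returns (relative_lift, z, p_value).
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both groups convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    relative_lift = (p_b - p_a) / p_a
    return relative_lift, z, p_value


# Illustrative (made-up) numbers: 4.0% control vs. 5.0% variant conversion.
lift, z, p_value = two_proportion_z_test(conv_a=200, n_a=5000, conv_b=250, n_b=5000)
significant = p_value < 0.05  # 95% confidence threshold
```

With these made-up counts the variant shows a 25% relative lift and clears the 95% threshold. The test is only valid if the sample size was fixed in advance; checking it repeatedly during the run inflates the false-positive rate.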

Experiment Design Framework

EXPERIMENT DESIGN TEMPLATE
============================

HYPOTHESIS FORMULATION:
  "If we [change], then [metric] will [increase/decrease] by [X%],
  because [rationale/research/observation]."

  Example: "If we reduce form fields from 6 to 2, then form completion rate
  will increase by 20%, because each additional field reduces conversion by ~10-15%
  (industry research) and our heatmap shows users abandon at field 4."

EXPERIMENT TYPES:
  1. A/B TEST (most common):
     → Two versions: control (current) vs. variant (proposed change)
     → Traffic split: 50/50
     → Test one variable at a time
     → Example: headline A vs. headline B

  2. MULTIVARIATE TEST:
     → Multiple variables tested simultaneously
     → Traffic split across all combinations
     → Requires larger sample size
     → Example: headline + image + CTA all tested together

  3. SPLIT URL TEST:
     → Two completely different page designs
     → Traffic split between two URLs
     → Example: current landing page vs. completely redesigned page

  4. MULTIPLE VARIANT TEST:
     → One variable, 3+ variations
     → Traffic split evenly across all variants
     → Example: CTA button text (5 variations)

SAMPLE SIZE CALCULATION:
  → Minimum sample size per group:
     Baseline (current) conversion rate: ___%
     Minimum detectable effect (MDE): ___% (state whether relative or absolute)
     Significance level: 5%, i.e. 95% confidence (standard)
     Statistical power: 80% (standard)
     Required sample per group: ___ visitors
     Use: evanmiller.org/sample-size-calculator

  → Rule of thumb (the calculated sample size takes precedence):
     High-traffic pages (1,000+ daily visitors): test for 7-14 days
     Medium-traffic pages (100-999 daily visitors): test for 2-4 weeks
     Low-traffic pages (< 100 daily visitors): test for 4-8 weeks or aggregate over time

SUCCESS CRITERIA:
  → Primary metric: [what you're testing for — conversion rate, CTR, revenue]
  → Secondary metrics: [related metrics that should also improve]
  → Guardrail metrics: [metrics that should NOT degrade — bounce rate, time on page]
  → Minimum detectable effect (MDE): [X% improvement to be considered meaningful]
  → Statistical significance threshold: 95% confidence level
  → Test duration: minimum 7 days (covers a full weekly cycle), and long enough to reach the required sample size
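
The sample size and duration fields in the template above can be estimated with a short script. This is a minimal sketch, assuming the standard normal-approximation formula for comparing two proportions and an MDE expressed as a relative lift over the baseline. The function names sample_size_per_group and estimated_days are hypothetical, and online calculators may give slightly different numbers because they use slightly different formulas.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_group(baseline_rate, relative_mde, alpha=0.05, power=0.80):
    """Visitors needed per group to detect a relative lift of `relative_mde`.

    baseline_rate: current conversion rate as a fraction (0.05 = 5%).
    relative_mde:  minimum detectable effect as a relative lift (0.20 = +20%).
    alpha:         significance level (0.05 = 95% confidence, two-sided).
    power:         probability of detecting a true effect of that size.
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_mde)
    if not (0 < p1 < 1 and 0 < p2 < 1) or p1 == p2:
        raise ValueError("rates must lie strictly between 0 and 1 and differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


def estimated_days(n_per_group, daily_visitors, groups=2, min_days=7):
    """Days needed to fill all groups, never less than one weekly cycle."""
    days = ceil(n_per_group * groups / daily_visitors)
    return max(days, min_days)


# Illustrative inputs: 5% baseline, +20% relative MDE, 1,000 visitors per day.
n = sample_size_per_group(baseline_rate=0.05, relative_mde=0.20)
days = estimated_days(n, daily_visitors=1000)
```

For these illustrative inputs the result is on the order of 8,000 visitors per group, which at 1,000 visitors per day means a run of a bit over two weeks. Halving the MDE roughly quadruples the required sample, which is why low-traffic pages need long tests or larger changes.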

ICE Prioritization Framework

ICE SCORING MODEL
==================

Score each experiment 1-10 on three dimensions:

IMPACT (1-10): If this works, how big is the effect?
  10 = Would dramatically change the metric (20%+ improvement expected)
  7-9 = Significant improvement (10-20% expected)
  4-6 = Moderate improvement (5-10% expected)
  1-3 = Small improvement (< 5% expected)

CONFIDENCE (1-10): How sure are we this will work?
  10 = Strong evidence (research-backed, proven pattern, data-driven insight)
  7-9 = Good evidence (best practice, similar tests worked, analytics insight)
  4-6 = Some evidence (hypothesis based on observation, limited data)
  1-3 = Low confidence (intuition only, untested approach)

EASE (1-10): How easy is this to implement?
  10 = Very easy (copy change, CSS edit, < 1 hour)
  7-9 = Easy (template change, < 1 day)
  4-6 = Moderate (requires development, 1-3 days)
  1-3 = Difficult (complex development, 1+ week)

PRIORITY SCORE = (Impact + Confidence + Ease) ÷ 3
  → 8-10: Run immediately (quick wins with high expected impact)
  → 6-7.9: Schedule for next sprint (solid experiments)
  → 4-5.9: Add to backlog (longer-term experiments)
  → < 4: Deprioritize (low expected return)

EXPERIMENT ROADMAP TEMPLATE:
  Sprint 1 (Weeks 1-2):
    → Experiment 1: [Hypothesis] — ICE Score: ___
    → Experiment 2: [Hypothesis] — ICE Score: ___
  Sprint 2 (Weeks 3-4):
    → Experiment 3: [Hypothesis] — ICE Score: ___
    → Experiment 4: [Hypothesis] — ICE Score: ___

MONTHLY TARGETS:
  Small team (1-2 people): 2-4 experiments per month
  Medium team (3-5 people): 4-8 experiments per month
  Large team (6+ people): 8-16 experiments per month
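
The ICE scoring model above is simple enough to automate for a backlog. The sketch below averages the three scores and maps the result to the priority bands listed above. The Experiment class, the priority_band helper, and the three backlog entries are hypothetical examples, not part of the original template.

```python
from dataclasses import dataclass


@dataclass
class Experiment:
    name: str
    impact: int      # 1-10: how big is the effect if it works?
    confidence: int  # 1-10: how sure are we it will work?
    ease: int        # 1-10: how easy is it to implement?

    def __post_init__(self):
        for field_name in ("impact", "confidence", "ease"):
            value = getattr(self, field_name)
            if not 1 <= value <= 10:
                raise ValueError(f"{field_name} must be between 1 and 10")

    @property
    def ice_score(self):
        # PRIORITY SCORE = (Impact + Confidence + Ease) / 3
        return (self.impact + self.confidence + self.ease) / 3


def priority_band(score):
    """Map an ICE score to the action bands from the scoring model."""
    if score >= 8:
        return "Run immediately"
    if score >= 6:
        return "Schedule for next sprint"
    if score >= 4:
        return "Add to backlog"
    return "Deprioritize"


# Made-up backlog entries for illustration.
backlog = [
    Experiment("Shorten signup form", impact=8, confidence=7, ease=9),
    Experiment("Redesign pricing page", impact=9, confidence=5, ease=2),
    Experiment("New CTA copy", impact=4, confidence=5, ease=10),
]
ranked = sorted(backlog, key=lambda e: e.ice_score, reverse=True)
for exp in ranked:
    print(f"{exp.name}: {exp.ice_score:.1f} ({priority_band(exp.ice_score)})")
```

Using lower-bound thresholds (>= 8, >= 6, >= 4) avoids leaving averages such as 7.95 without a band. The ranked list can be copied directly into the sprint slots of the roadmap template.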

Edge Cases

Integration Points