Marketing AI Skill

Growth Experiment Design

Design, execute, and analyze growth experiments including A/B tests, multivariate tests, and hypothesis-driven optimization. Use when planning growth experiments, designing A/B tests, creating experiment roadmaps, analyzing test results, or building growth loops. Triggers on phrases like "growth experiment", "A/B test", "split test", "experiment design", "hypothesis testing", "growth loop", "experiment roadmap", "test analysis".

Growth Experiment Designer

Design and execute data-driven experiments that systematically improve conversion rates, engagement, and revenue.

Workflow

Formulate hypothesis: define clear, testable hypothesis using "If...then...because..." framework.
Prioritize experiments: score by potential impact, confidence level, and implementation effort (ICE scoring).
Design experiment: define control and variant groups, determine sample size, set success metrics.
Build and QA: implement test variations; verify tracking and statistical setup.
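Run experiment: run until the planned sample size and duration are reached; monitor data quality and guardrail metrics rather than stopping at the first significant reading; maintain test integrity.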
Analyze results: statistical significance analysis; segment data; identify learnings.
Implement winner: roll out winning variation; document learnings; plan follow-up tests.
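
The "Analyze results" step can be sketched as a two-proportion z-test on conversion counts. This is a minimal illustration using only the Python standard library, not the method of any particular testing platform; the function name and the visitor and conversion numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int) -> tuple[float, float]:
    """Return (z, two-sided p-value) comparing control A with variant B."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both groups convert equally.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("Standard error is zero; no conversions or all conversions.")
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Illustrative numbers only (not real data): 5,000 visitors per group.
z, p = two_proportion_z_test(conv_a=500, n_a=5000, conv_b=575, n_b=5000)
significant = p < 0.05  # 95% confidence level, matching the template's threshold
print(f"z = {z:.2f}, p = {p:.4f}, significant = {significant}")
```

The test assumes the sample size was fixed in advance; checking it repeatedly during the run and stopping at the first p < 0.05 inflates the false-positive rate.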

Experiment Design Framework

EXPERIMENT DESIGN TEMPLATE
============================

HYPOTHESIS FORMULATION:
  "If we [change], then [metric] will [increase/decrease] by [X%],
  because [rationale/research/observation]."

  Example: "If we reduce form fields from 6 to 2, then form completion rate
  will increase by 20%, because each additional field reduces conversion by ~10-15%
  (industry research) and our heatmap shows users abandon at field 4."

EXPERIMENT TYPES:
  1. A/B TEST (most common):
     → Two versions: control (current) vs. variant (proposed change)
     → Traffic split: 50/50
     → Test one variable at a time
     → Example: headline A vs. headline B

  2. MULTIVARIATE TEST:
     → Multiple variables tested simultaneously
     → Traffic split across all combinations
     → Requires larger sample size
     → Example: headline + image + CTA all tested together

  3. SPLIT URL TEST:
     → Two completely different page designs
     → Traffic split between two URLs
     → Example: current landing page vs. completely redesigned page

  4. MULTIPLE VARIANT TEST:
     → One variable, 3+ variations
     → Traffic split evenly across all variants
     → Example: CTA button text (5 variations)

SAMPLE SIZE CALCULATION:
  → Minimum sample size per group:
     Current conversion rate: ___%
     Expected improvement (MDE): ___% (state whether relative or absolute)
     Confidence level: 95% (significance level α = 0.05; standard)
     Statistical power: 80% (standard)
     Required sample per group: ___ visitors
     Use: evanmiller.org/ab-testing/sample-size.html

  → Rule of thumb:
     High-traffic pages (1,000+ daily): test for 7-14 days
     Medium-traffic pages (100-1,000 daily): test for 2-4 weeks
     Low-traffic pages (< 100 daily): test for 4-8 weeks or aggregate over time

SUCCESS CRITERIA:
  → Primary metric: [what you're testing for — conversion rate, CTR, revenue]
  → Secondary metrics: [related metrics that should also improve]
  → Guardrail metrics: [metrics that should NOT degrade — bounce rate, time on page]
  → Minimum detectable effect (MDE): [X% improvement to be considered meaningful]
  → Statistical significance threshold: 95% confidence level
  → Test duration: minimum 7 days (covers full weekly cycle)

ICE Prioritization Framework

ICE SCORING MODEL
==================

Score each experiment 1-10 on three dimensions:

IMPACT (1-10): If this works, how big is the effect?
  10 = Would dramatically change the metric (20%+ improvement expected)
  7-9 = Significant improvement (10-20% expected)
  4-6 = Moderate improvement (5-10% expected)
  1-3 = Small improvement (< 5% expected)

CONFIDENCE (1-10): How sure are we this will work?
  10 = Strong evidence (research-backed, proven pattern, data-driven insight)
  7-9 = Good evidence (best practice, similar tests worked, analytics insight)
  4-6 = Some evidence (hypothesis based on observation, limited data)
  1-3 = Low confidence (intuition only, untested approach)

EASE (1-10): How easy is this to implement?
  10 = Very easy (copy change, CSS edit, < 1 hour)
  7-9 = Easy (template change, < 1 day)
  4-6 = Moderate (requires development, 1-3 days)
  1-3 = Difficult (complex development, 1+ week)

PRIORITY SCORE = (Impact + Confidence + Ease) ÷ 3
  → 8-10: Run immediately (quick wins with high expected impact)
  → 6-7.9: Schedule for next sprint (solid experiments)
  → 4-5.9: Add to backlog (longer-term experiments)
  → < 4: Deprioritize (low expected return)

EXPERIMENT ROADMAP TEMPLATE:
  Sprint 1 (Weeks 1-2):
    → Experiment 1: [Hypothesis] — ICE Score: ___
    → Experiment 2: [Hypothesis] — ICE Score: ___
  Sprint 2 (Weeks 3-4):
    → Experiment 3: [Hypothesis] — ICE Score: ___
    → Experiment 4: [Hypothesis] — ICE Score: ___

MONTHLY TARGETS:
  Small team (1-2 people): 2-4 experiments per month
  Medium team (3-5 people): 4-8 experiments per month
  Large team (6+ people): 8-16 experiments per month
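
The ICE scoring model above is simple arithmetic, so a backlog can be ranked with a few lines of code. The sketch below applies the PRIORITY SCORE formula and the four priority bands as stated; the class name, function name, and backlog entries are hypothetical examples.

```python
from dataclasses import dataclass


@dataclass
class Experiment:
    name: str
    impact: int      # 1-10
    confidence: int  # 1-10
    ease: int        # 1-10

    @property
    def ice_score(self) -> float:
        # PRIORITY SCORE = (Impact + Confidence + Ease) / 3
        return (self.impact + self.confidence + self.ease) / 3


def priority_bucket(score: float) -> str:
    """Map an ICE score to the action bands from the scoring model."""
    if score >= 8:
        return "Run immediately"
    if score >= 6:
        return "Schedule for next sprint"
    if score >= 4:
        return "Add to backlog"
    return "Deprioritize"


# Hypothetical backlog for illustration.
backlog = [
    Experiment("Redesign pricing page", impact=9, confidence=5, ease=2),
    Experiment("Shorten signup form", impact=8, confidence=7, ease=9),
    Experiment("New CTA copy", impact=4, confidence=6, ease=10),
    Experiment("Footer link color", impact=2, confidence=3, ease=6),
]

ranked = sorted(backlog, key=lambda e: e.ice_score, reverse=True)
for exp in ranked:
    print(f"{exp.ice_score:4.1f}  {priority_bucket(exp.ice_score):<26} {exp.name}")
```

Because the score is an unweighted average, a very easy but low-impact test can outrank a hard, high-impact one; teams that care more about impact may prefer a weighted variant.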

Edge Cases

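Low-traffic sites (< 100 daily visitors): Run tests longer (4-8 weeks); focus on high-impact changes; consider aggregate testing across multiple pages; consider Bayesian analysis, which can support a decision with less data provided the prior and decision threshold are agreed before the test starts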
B2B with complex funnels: Test at each funnel stage separately; use CRM data for conversion tracking; account for longer decision cycles in test duration
E-commerce with seasonality: Use the same period last year as the baseline reference; schedule tests so they do not straddle the start or end of a peak season; avoid running tests during major sales events
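
For the low-traffic case, one way to apply Bayesian testing is to estimate the probability that the variant's true conversion rate exceeds the control's. The sketch below assumes a Beta-Binomial model with a uniform Beta(1, 1) prior and uses Monte Carlo sampling from the standard library; the function name, the counts, and any decision threshold (95% is a common choice) are assumptions, not something the framework above prescribes.

```python
import random


def prob_variant_beats_control(conv_a: int, n_a: int, conv_b: int, n_b: int,
                               draws: int = 20000, seed: int = 0) -> float:
    """Estimate P(rate_B > rate_A) from Beta posteriors via sampling."""
    rng = random.Random(seed)  # seeded so repeated runs give the same estimate
    wins = 0
    for _ in range(draws):
        # Posterior under a uniform prior: Beta(1 + conversions, 1 + non-conversions).
        rate_a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
        rate_b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
        if rate_b > rate_a:
            wins += 1
    return wins / draws


# Illustrative counts only: 600 visitors per group on a low-traffic page.
prob = prob_variant_beats_control(conv_a=30, n_a=600, conv_b=45, n_b=600)
print(f"P(variant beats control) is about {prob:.2f}")
```

This probability says nothing about the size of the lift, so it is worth pairing with an estimate of the expected difference before rolling out a winner.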

Integration Points

VWO / Optimizely / Convert: A/B testing platforms; visual editor; statistical analysis; segmentation; $115-$5,000+/month
Google Analytics 4: Experiment tracking; conversion measurement; segment comparison; free
Hotjar: Heatmaps and session recordings to generate experiment ideas; $39-$499/month
Statsig / LaunchDarkly: Feature flagging and experimentation for product teams; custom pricing

Disclaimer: All rights reserved by Circulos AI. These skills are specifically designed for Claude Code, Claude Cowork, Codex, and OpenClaw. When using or referencing any skill, please provide proper attribution to Circulos AI.