Marketing AI Skill
Growth Experiment Design
Design, execute, and analyze growth experiments including A/B tests, multivariate tests, and hypothesis-driven optimization. Use when planning growth experiments, designing A/B tests, creating experiment roadmaps, analyzing test results, or building growth...
Growth Experiment Designer
Design and execute data-driven experiments that systematically improve conversion rates, engagement, and revenue.
Workflow
- Formulate hypothesis: define clear, testable hypothesis using "If...then...because..." framework.
- Prioritize experiments: score by potential impact, confidence level, and implementation effort (ICE scoring).
- Design experiment: define control and variant groups, determine sample size, set success metrics.
- Build and QA: implement test variations; verify tracking and statistical setup.
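- Run experiment: run until the planned sample size and duration are reached; monitor tracking and guardrail metrics rather than stopping at the first significant reading; ensure no bias; maintain test integrity.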
- Analyze results: statistical significance analysis; segment data; identify learnings.
- Implement winner: roll out winning variation; document learnings; plan follow-up tests.
Experiment Design Framework
EXPERIMENT DESIGN TEMPLATE
============================
HYPOTHESIS FORMULATION:
"If we [change], then [metric] will [increase/decrease] by [X%],
because [rationale/research/observation]."
Example: "If we reduce form fields from 6 to 2, then form completion rate
will increase by 20%, because each additional field reduces conversion by ~10-15%
(industry research) and our heatmap shows users abandon at field 4."
EXPERIMENT TYPES:
1. A/B TEST (most common):
→ Two versions: control (current) vs. variant (proposed change)
→ Traffic split: 50/50
→ Test one variable at a time
→ Example: headline A vs. headline B
2. MULTIVARIATE TEST:
→ Multiple variables tested simultaneously
→ Traffic split across all combinations
→ Requires larger sample size
→ Example: headline + image + CTA all tested together
3. SPLIT URL TEST:
→ Two completely different page designs
→ Traffic split between two URLs
→ Example: current landing page vs. completely redesigned page
4. MULTIPLE VARIANT TEST:
→ One variable, 3+ variations
→ Traffic split evenly across all variants
→ Example: CTA button text (5 variations)
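The traffic splits described above can be sketched in code. This is a minimal illustration, assuming users are identified by a string ID and assigned by hashing; the names assign_variant and multivariate_cells are hypothetical helpers, not part of any testing platform listed in this document.

```python
import hashlib
from itertools import product


def assign_variant(user_id: str, experiment: str, variants: list[str]) -> str:
    """Map a user to one variant, evenly and deterministically.

    Hashing the experiment name together with the user ID keeps a user's
    assignment stable across visits and independent across experiments.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % len(variants)
    return variants[bucket]


def multivariate_cells(factors: dict[str, list[str]]) -> list[dict[str, str]]:
    """List every combination of factor levels for a multivariate test."""
    names = list(factors)
    combos = product(*(factors[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


if __name__ == "__main__":
    # A/B test: two variants, 50/50 split.
    print(assign_variant("user-123", "headline_test", ["control", "variant"]))
    # Multivariate test: 2 headlines x 2 images x 2 CTAs = 8 cells.
    cells = multivariate_cells(
        {"headline": ["A", "B"], "image": ["X", "Y"], "cta": ["Buy", "Try"]}
    )
    print(len(cells))
```

The cell count grows multiplicatively with each added factor, which is why multivariate tests need a larger sample than a single A/B test.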
SAMPLE SIZE CALCULATION:
→ Minimum sample size per group:
Current conversion rate: ___%
Expected improvement (MDE): ___% (relative or absolute; state which)
Confidence level: 95% (significance level α = 0.05; standard)
Statistical power: 80% (standard)
Required sample per group: ___ visitors
Use: evanmiller.org/sample-size-calculator
→ Rule of thumb:
High-traffic pages (1,000+ daily visitors): test for 7-14 days
Medium-traffic pages (100-999 daily visitors): test for 2-4 weeks
Low-traffic pages (< 100 daily visitors): test for 4-8 weeks or aggregate over time
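For a rough cross-check of a calculator's output, the per-group sample size can be approximated with the standard normal-approximation formula for comparing two proportions. The sketch below assumes a two-sided test and an MDE expressed as a relative lift; sample_size_per_group and estimated_days are hypothetical helper names, and results may differ slightly from online calculators that use a different variance formula.

```python
import math
from statistics import NormalDist


def sample_size_per_group(
    baseline_rate: float,
    relative_mde: float,
    alpha: float = 0.05,   # 95% confidence level
    power: float = 0.80,   # 80% statistical power
) -> int:
    """Approximate visitors needed per group for a two-sided two-proportion test."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_mde)  # assumption: MDE is a relative lift
    if not (0 < p1 < 1 and 0 < p2 < 1) or p1 == p2:
        raise ValueError("rates must lie strictly between 0 and 1 and differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)


def estimated_days(n_per_group: int, daily_visitors: int, groups: int = 2) -> int:
    """Days needed to fill all groups if every visitor enters the test."""
    return math.ceil(n_per_group * groups / daily_visitors)


if __name__ == "__main__":
    n = sample_size_per_group(baseline_rate=0.05, relative_mde=0.20)
    print(n, estimated_days(n, daily_visitors=1000))
```

With a 5% baseline and a 20% relative MDE, this approximation gives a little over 8,000 visitors per group. Halving the MDE roughly quadruples the requirement, which is why low-traffic pages need the longer durations listed above.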
SUCCESS CRITERIA:
→ Primary metric: [what you're testing for — conversion rate, CTR, revenue]
→ Secondary metrics: [related metrics that should also improve]
→ Guardrail metrics: [metrics that should NOT degrade — bounce rate, time on page]
→ Minimum detectable effect (MDE): [X% improvement to be considered meaningful]
→ Statistical significance threshold: 95% confidence level
→ Test duration: minimum 7 days (covers full weekly cycle)
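Once the test has reached its planned sample size, the primary metric can be checked against the 95% threshold with a two-proportion z-test. The following is a minimal sketch, assuming a pooled-variance two-sided test on raw conversion counts; analyze_ab_test is a hypothetical helper name, and a testing platform's built-in statistics may use a different method.

```python
import math
from statistics import NormalDist


def analyze_ab_test(
    conv_a: int, n_a: int, conv_b: int, n_b: int, alpha: float = 0.05
) -> dict:
    """Two-sided pooled z-test comparing control (a) with variant (b)."""
    if n_a <= 0 or n_b <= 0:
        raise ValueError("both groups need at least one visitor")
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("no variation in outcomes; the test is undefined")
    z = (rate_b - rate_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return {
        "rate_control": rate_a,
        "rate_variant": rate_b,
        # Relative lift is undefined when the control has no conversions.
        "relative_lift": (rate_b - rate_a) / rate_a if rate_a > 0 else None,
        "z": z,
        "p_value": p_value,
        "significant": p_value < alpha,
    }


if __name__ == "__main__":
    result = analyze_ab_test(conv_a=500, n_a=10_000, conv_b=600, n_b=10_000)
    print(result)
```

A significant result on the primary metric is only part of the decision: the relative lift should still be compared with the MDE, and the guardrail metrics checked for degradation, before rolling out a winner.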
ICE Prioritization Framework
ICE SCORING MODEL
==================
Score each experiment 1-10 on three dimensions:
IMPACT (1-10): If this works, how big is the effect?
10 = Would dramatically change the metric (20%+ improvement expected)
7-9 = Significant improvement (10-20% expected)
4-6 = Moderate improvement (5-10% expected)
1-3 = Small improvement (< 5% expected)
CONFIDENCE (1-10): How sure are we this will work?
10 = Strong evidence (research-backed, proven pattern, data-driven insight)
7-9 = Good evidence (best practice, similar tests worked, analytics insight)
4-6 = Some evidence (hypothesis based on observation, limited data)
1-3 = Low confidence (intuition only, untested approach)
EASE (1-10): How easy is this to implement?
10 = Very easy (copy change, CSS edit, < 1 hour)
7-9 = Easy (template change, < 1 day)
4-6 = Moderate (requires development, 1-3 days)
1-3 = Difficult (complex development, 1+ week)
PRIORITY SCORE = (Impact + Confidence + Ease) ÷ 3
→ 8-10: Run immediately (quick wins with high expected impact)
→ 6-7.9: Schedule for next sprint (solid experiments)
→ 4-5.9: Add to backlog (longer-term experiments)
→ < 4: Deprioritize (low expected return)
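The scoring model above is simple enough to automate. A minimal sketch, assuming integer scores from 1 to 10 and the priority bands listed above; the Experiment class and priority_tier function are hypothetical names used for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Experiment:
    name: str
    impact: int
    confidence: int
    ease: int

    def __post_init__(self) -> None:
        # Each dimension is scored on the 1-10 scale defined above.
        for dimension in ("impact", "confidence", "ease"):
            value = getattr(self, dimension)
            if not 1 <= value <= 10:
                raise ValueError(f"{dimension} must be between 1 and 10, got {value}")

    @property
    def ice_score(self) -> float:
        """Priority score: the mean of impact, confidence, and ease."""
        return (self.impact + self.confidence + self.ease) / 3


def priority_tier(score: float) -> str:
    """Translate an ICE score into the action bands from the model."""
    if score >= 8:
        return "Run immediately"
    if score >= 6:
        return "Schedule for next sprint"
    if score >= 4:
        return "Add to backlog"
    return "Deprioritize"


if __name__ == "__main__":
    exp = Experiment("Shorten signup form", impact=7, confidence=8, ease=9)
    print(round(exp.ice_score, 1), priority_tier(exp.ice_score))
```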
EXPERIMENT ROADMAP TEMPLATE:
Sprint 1 (Weeks 1-2):
→ Experiment 1: [Hypothesis] — ICE Score: ___
→ Experiment 2: [Hypothesis] — ICE Score: ___
Sprint 2 (Weeks 3-4):
→ Experiment 3: [Hypothesis] — ICE Score: ___
→ Experiment 4: [Hypothesis] — ICE Score: ___
MONTHLY TARGETS:
Small team (1-2 people): 2-4 experiments per month
Medium team (3-5 people): 4-8 experiments per month
Large team (6+ people): 8-16 experiments per month
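A roadmap like the template above can be drafted by ranking scored experiments and filling sprints in order. This sketch assumes two experiments per two-week sprint, matching the template; build_roadmap is a hypothetical helper, and the capacity should be adjusted to the team's monthly target.

```python
def build_roadmap(
    scored: list[tuple[str, float]],
    per_sprint: int = 2,    # assumption: two experiments per sprint
    sprint_weeks: int = 2,  # assumption: two-week sprints
) -> list[dict]:
    """Rank experiments by ICE score (highest first) and group them into sprints."""
    ranked = sorted(scored, key=lambda item: item[1], reverse=True)
    roadmap = []
    for start in range(0, len(ranked), per_sprint):
        sprint_no = start // per_sprint + 1
        first_week = (sprint_no - 1) * sprint_weeks + 1
        last_week = sprint_no * sprint_weeks
        roadmap.append(
            {
                "sprint": sprint_no,
                "weeks": (first_week, last_week),
                "experiments": ranked[start:start + per_sprint],
            }
        )
    return roadmap


if __name__ == "__main__":
    backlog = [
        ("Shorter form", 8.0),
        ("New headline", 6.3),
        ("Pricing page redesign", 5.0),
        ("CTA copy", 9.0),
    ]
    for sprint in build_roadmap(backlog):
        print(sprint)
```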
Edge Cases
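- Low-traffic sites (< 100 daily visitors): Run tests longer (4-8 weeks); focus on high-impact changes; consider aggregate testing across multiple pages; consider Bayesian analysis, which reports the probability that a variant is better and can support a decision before a fixed-sample test reaches significance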
- B2B with complex funnels: Test at each funnel stage separately; use CRM data for conversion tracking; account for longer decision cycles in test duration
- E-commerce with seasonality: Compare to same period last year; schedule tests around peak seasons so that a test does not straddle a shift in demand; avoid running tests during major sales events
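For the low-traffic case, a Bayesian comparison can be sketched with a Beta-Binomial model. This is an illustration under stated assumptions: a uniform Beta(1, 1) prior on each conversion rate and Monte Carlo sampling from the posteriors. prob_b_beats_a is a hypothetical helper, and the decision threshold (for example 95%) is a choice the team has to make, not something the model provides.

```python
import random


def prob_b_beats_a(
    conv_a: int,
    n_a: int,
    conv_b: int,
    n_b: int,
    draws: int = 20_000,
    seed: int = 0,
) -> float:
    """Estimate P(variant rate > control rate) under Beta(1, 1) priors."""
    rng = random.Random(seed)  # seeded so repeated runs give the same estimate
    wins = 0
    for _ in range(draws):
        # Posterior for each group: Beta(1 + conversions, 1 + non-conversions).
        sample_a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
        sample_b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
        if sample_b > sample_a:
            wins += 1
    return wins / draws


if __name__ == "__main__":
    # 400 visitors per group: 20 conversions (control) vs. 35 (variant).
    print(prob_b_beats_a(conv_a=20, n_a=400, conv_b=35, n_b=400))
```

The output is a direct probability statement about the variant, which is often easier to act on with small samples than a p-value, but it does not remove the need for enough data to cover a full weekly cycle.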
Integration Points
- VWO / Optimizely / Convert: A/B testing platforms; visual editor; statistical analysis; segmentation; $115-$5,000+/month
- Google Analytics 4: Experiment tracking; conversion measurement; segment comparison; free
- Hotjar: Heatmaps and session recordings to generate experiment ideas; $39-$499/month
- Statsig / LaunchDarkly: Feature flagging and experimentation for product teams; custom pricing