A/B test sample size calculator
Per-variant sample size needed to detect a given lift at 95% confidence and 80% power.
Inputs
Results
n = (Z_α/2 + Z_β)² × (p₁(1-p₁) + p₂(1-p₂)) / (p₂ - p₁)²
The number of sessions each variant needs before the test can be called. With fewer sessions the test is underpowered, and an observed difference is hard to distinguish from noise.
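The formula above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual source: the function name `sample_size_per_variant` and the 5% baseline / 20% lift inputs are hypothetical, and the constants are the 1.96 and 0.84 values quoted on this page.

```python
import math

Z_ALPHA_2 = 1.96  # two-sided alpha = 0.05 (95% confidence)
Z_BETA = 0.84     # 80% power


def sample_size_per_variant(baseline_rate: float, relative_lift: float) -> int:
    """Sessions needed in each variant, rounded up to a whole session.

    baseline_rate is p1 as a fraction (0.05 means 5%);
    relative_lift is the lift to detect (0.20 means +20%, so p2 = 1.2 * p1).
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (Z_ALPHA_2 + Z_BETA) ** 2 * variance / (p2 - p1) ** 2
    return math.ceil(n)


# Hypothetical inputs: 5% baseline, 20% relative lift (p2 = 6%).
print(sample_size_per_variant(0.05, 0.20))  # prints 8146
```

Because the denominator is the squared difference between the two rates, halving the lift you want to detect roughly quadruples the sample required.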
Total = 2 × Sample per variant
Weeks = Total sample / (2 × monthly visitors per variant / 4.33)
How long the test runs at your current traffic (4.33 is the average number of weeks in a month). Aim for 2-6 weeks; beyond 8 weeks, seasonality starts to contaminate the result.
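The duration step can be sketched the same way. Since Total = 2 × sample per variant, the factor of 2 cancels and the calculation reduces to per-variant sample divided by weekly per-variant traffic. The function names and the 40,000-session input below are illustrative assumptions, and the verdict thresholds simply restate the 2-6 and 8 week guidance above.

```python
WEEKS_PER_MONTH = 4.33  # 52 weeks / 12 months


def weeks_to_complete(sample_per_variant: int, monthly_visitors_per_variant: float) -> float:
    """Weeks needed to collect the per-variant sample at steady traffic."""
    weekly_visitors_per_variant = monthly_visitors_per_variant / WEEKS_PER_MONTH
    return sample_per_variant / weekly_visitors_per_variant


def duration_verdict(weeks: float) -> str:
    """Classify a duration against the 2-6 week target and the 8-week limit."""
    if weeks > 8:
        return "too long: seasonality risk"
    if 2 <= weeks <= 6:
        return "within the 2-6 week target"
    return "outside the 2-6 week target"


# Hypothetical inputs: 40,000 sessions needed, 50,000 visitors per variant per month.
weeks = weeks_to_complete(40_000, 50_000)
print(f"{weeks:.1f} weeks, {duration_verdict(weeks)}")
```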
Why this calculator is verified
This is the standard sample-size formula for a two-proportion z-test, the approach behind widely used A/B testing tools and calculators (Optimizely, VWO, Google Optimize before its sunset, Evan Miller's calculator). Z_α/2 = 1.96 sets the test at 95% confidence (two-sided α = 0.05); Z_β = 0.84 sets statistical power at 80%. These are the conventional industry defaults. The formula relies on a normal approximation to the binomial distribution, which is reasonable when n × p ≥ 5 and n × (1-p) ≥ 5 (true for any conversion rate above 0.5% at the sample sizes this calculator typically returns). For very small samples or extreme conversion rates, Fisher's exact test is more rigorous. The references are listed below.
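To see where the two constants come from, the sketch below recovers them from the standard normal distribution using Python's built-in `statistics.NormalDist`, and adds a small check for the n × p ≥ 5 rule of thumb. The helper names are hypothetical and this is an illustration, not part of the calculator itself.

```python
from statistics import NormalDist


def z_constants(alpha: float = 0.05, power: float = 0.80) -> tuple:
    """Return (Z_alpha/2, Z_beta) for a two-sided test at the given alpha and power."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    z_alpha_2 = std_normal.inv_cdf(1 - alpha / 2)  # 97.5th percentile for alpha = 0.05
    z_beta = std_normal.inv_cdf(power)             # 80th percentile for 80% power
    return z_alpha_2, z_beta


def normal_approximation_ok(n: int, p: float) -> bool:
    """Rule of thumb: expect at least 5 conversions and 5 non-conversions."""
    return n * p >= 5 and n * (1 - p) >= 5


z_a, z_b = z_constants()
print(f"Z_alpha/2 = {z_a:.2f}, Z_beta = {z_b:.2f}")
print(normal_approximation_ok(n=10_000, p=0.005))
```

The unrounded values are slightly different from 1.96 and 0.84, so a calculator that uses more decimal places will return a marginally different sample size.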
Worked example
Baseline 3.5%, want to detect a 10% relative lift
p₁ = 3.5%, p₂ = 3.85% (10% relative lift). δ = 0.35 percentage points. Variance term = 0.033775 + 0.037018 = 0.070793. n = (2.80)² × 0.070793 / (0.0035)² ≈ 45,308 per variant (rounded up). At 50,000 visitors per variant per month, the test completes in ~3.9 weeks. If your traffic were 25,000 per variant, it would take ~7.8 weeks, close to the 8-week point where seasonality starts to contaminate the result; in that case, raise the minimum detectable effect (MDE) to 15-20% to bring the sample size down.
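The MDE trade-off at the end of this example can be explored with a short script. This is a sketch under the page's own constants (1.96, 0.84, 4.33 weeks per month); the `plan` function is a hypothetical helper, and it rounds the sample up to a whole session.

```python
import math

Z_SUM = 1.96 + 0.84      # Z_alpha/2 + Z_beta at 95% confidence, 80% power
WEEKS_PER_MONTH = 4.33


def plan(baseline: float, relative_lift: float, monthly_per_variant: float):
    """Return (sample per variant, weeks to complete) for one scenario."""
    p1 = baseline
    p2 = baseline * (1 + relative_lift)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = math.ceil(Z_SUM ** 2 * variance / (p2 - p1) ** 2)
    weeks = n / (monthly_per_variant / WEEKS_PER_MONTH)
    return n, weeks


# 3.5% baseline at 25,000 visitors per variant per month, for three MDE choices.
for lift in (0.10, 0.15, 0.20):
    n, weeks = plan(baseline=0.035, relative_lift=lift, monthly_per_variant=25_000)
    print(f"lift {lift:.0%}: {n:,} per variant, {weeks:.1f} weeks")
```

At this traffic level, moving from a 10% to a 20% MDE cuts the required sample to roughly a quarter and brings the duration down to about 2 weeks, at the cost of no longer being able to detect smaller lifts.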
Sources for the formula
- Evan Miller, sample size calculator (canonical reference)
The canonical operator reference. Same formula, same constants, same defaults.
- Wikipedia, statistical hypothesis testing
Textbook derivation of the sample-size formula for two proportions.
- Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences
The foundational text on statistical power. Defines the conventional 80% power threshold this calculator uses.
- Optimizely, sample size methodology
Vendor-published reference. Same formula and conventional defaults.