m.02 · I · Quantifying Uncertainty · Business Experimentation

A/B Testing & Hypothesis Logic

The only way to prove causation is to randomise.

The core idea

Correlation alone cannot establish causation. A randomised experiment can. By assigning users to treatment and control by coin-flip, you make the two groups alike on average in every respect except the treatment, so any difference in outcome beyond chance can be attributed to it. Snow's 1854 cholera study anticipated this logic with a natural experiment, Fisher formalised randomisation in the 1920s, and the same idea is the backbone of modern A/B testing. (After Snow, Fisher, and every growth team since.)

The hero diagram

Decision matrix.

Two axes: is the null actually true, and did you reject it?

The tools on the bench

Ideas that pay rent.

Hypothesis Test · Statistical inference

H₀ (default / null) · t-statistic = (observed − expected) / SE · p-value · decision threshold

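If |t| > 2 (roughly; 1.96 for a two-sided test at the 5% level with a large sample), reject H₀. Otherwise, fail to reject it, which is not the same as showing H₀ is true.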
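
As a rough illustration of the formula above, here is a minimal Python sketch of a two-sided test on a difference in click-through rates. The counts and rates are hypothetical, the helper names `t_statistic` and `two_sided_p_value` are invented for this example, and the p-value uses a normal approximation that assumes large samples.

```python
import math

def t_statistic(observed, expected, standard_error):
    """(observed - expected) / SE, as in the formula above."""
    return (observed - expected) / standard_error

def two_sided_p_value(t):
    """Two-sided p-value under a standard normal approximation (large samples)."""
    # P(|Z| > |t|) = erfc(|t| / sqrt(2))
    return math.erfc(abs(t) / math.sqrt(2))

# Hypothetical data: control CTR 10.0%, treatment CTR 11.5%, 5,000 users each.
n_control, n_treatment = 5000, 5000
p_control, p_treatment = 0.100, 0.115

# Standard error of the difference between two independent proportions.
se = math.sqrt(p_control * (1 - p_control) / n_control
               + p_treatment * (1 - p_treatment) / n_treatment)

observed_diff = p_treatment - p_control
# Under H0 the expected difference is zero.
t = t_statistic(observed_diff, expected=0.0, standard_error=se)
p = two_sided_p_value(t)
reject_h0 = abs(t) > 1.96  # 5% two-sided threshold, the "roughly 2" rule

print(f"t = {t:.2f}, p = {p:.3f}, reject H0: {reject_h0}")
```

With smaller samples, a t-distribution with the appropriate degrees of freedom would be the safer reference than the normal approximation used here.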

A/B Testing (RCT) · Causal inference standard

random assignment · treatment vs control · difference in means estimates the average causal effect

Randomisation is the only antidote to confounding.
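
To see why random assignment matters, the following simulation sketch compares a coin-flip assignment with a self-selected one. Everything here is made up for illustration: the hidden `engagement` confounder, the true effect of 2.0, and the function names are assumptions, not part of any real experiment.

```python
import random
import statistics

def simulate(assign, n_users=20000, true_effect=2.0, seed=0):
    """Return the difference in mean outcome (treatment minus control)."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n_users):
        engagement = rng.gauss(0, 1)  # hidden confounder that drives the outcome
        in_treatment = assign(rng, engagement)
        outcome = 10 + 3 * engagement + rng.gauss(0, 1)
        if in_treatment:
            treated.append(outcome + true_effect)
        else:
            control.append(outcome)
    return statistics.mean(treated) - statistics.mean(control)

def coin_flip(rng, engagement):
    # Assignment ignores every user characteristic.
    return rng.random() < 0.5

def self_selected(rng, engagement):
    # More engaged users opt in, so assignment is tied to the confounder.
    return engagement > 0

randomised_estimate = simulate(coin_flip)
confounded_estimate = simulate(self_selected)

print(f"true effect:          2.00")
print(f"randomised estimate:  {randomised_estimate:.2f}")
print(f"self-selected estimate: {confounded_estimate:.2f}")
```

Under the coin-flip, the difference in means should land close to the true effect. Under self-selection it also absorbs the engagement gap between the groups, so it overstates the effect.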

Type I vs Type II Error · Decision theory

Type I: false positive (α) · Type II: false negative (β)

At a fixed sample size, every threshold trades one off against the other. Pick deliberately.
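
The trade-off can be made concrete with a small sketch. It assumes a two-sided z-test and a true effect of 2.5 standard errors (a hypothetical value), and the function name `error_rates` is invented for this example.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()

def error_rates(alpha, effect_in_se_units):
    """Return (alpha, beta) for a two-sided z-test.

    effect_in_se_units is the true effect divided by its standard error,
    an assumed quantity that you never observe directly in practice.
    """
    z_crit = STANDARD_NORMAL.inv_cdf(1 - alpha / 2)
    # Power: probability the test statistic falls beyond either critical value.
    power = (1 - STANDARD_NORMAL.cdf(z_crit - effect_in_se_units)
             + STANDARD_NORMAL.cdf(-z_crit - effect_in_se_units))
    return alpha, 1 - power

for alpha in (0.10, 0.05, 0.01):
    a, b = error_rates(alpha, effect_in_se_units=2.5)
    print(f"alpha = {a:.2f}  ->  beta = {b:.2f}")
```

Tightening α from 0.10 to 0.01 raises β for the same effect and sample. Only a larger sample (a smaller SE) lowers both at once.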

How to apply

Running an experiment you will actually trust.

State H₀ in one sentence. "The new button does not change click-through rate."
Commit to sample size before starting. Peeking at results and stopping as soon as they look significant inflates the false-positive rate.
Randomise the assignment. Not by date. Not by region. By coin-flip.
Report effect size alongside p-value. Statistical significance is not business significance.
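
The steps above can be sketched in Python. This is an illustration under stated assumptions, not a production recipe: the sample size uses the standard normal-approximation formula for comparing two proportions, and the function names, baseline rate, lift, and conversion counts are all hypothetical.

```python
import math
import random
from statistics import NormalDist

def sample_size_per_group(baseline_rate, minimum_detectable_lift,
                          alpha=0.05, power=0.80):
    """Users needed per arm to detect an absolute lift (two-sided test)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    p1 = baseline_rate
    p2 = baseline_rate + minimum_detectable_lift
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_power) ** 2 * variance
                     / minimum_detectable_lift ** 2)

def assign_arm(rng):
    """Coin-flip assignment: not by date, not by region."""
    return "treatment" if rng.random() < 0.5 else "control"

def effect_with_interval(conv_c, n_c, conv_t, n_t, confidence=0.95):
    """Absolute lift in conversion rate with a normal-approximation interval."""
    p_c, p_t = conv_c / n_c, conv_t / n_t
    diff = p_t - p_c
    se = math.sqrt(p_c * (1 - p_c) / n_c + p_t * (1 - p_t) / n_t)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return diff, diff - z * se, diff + z * se

# Step 2: fix the sample size before the experiment starts.
n_per_group = sample_size_per_group(baseline_rate=0.10,
                                    minimum_detectable_lift=0.01)
print(f"users needed per arm: {n_per_group}")

# Step 3: assign each arriving user by coin-flip.
rng = random.Random(42)
print("first five assignments:", [assign_arm(rng) for _ in range(5)])

# Step 4: report the effect size and its interval, not just a p-value.
diff, low, high = effect_with_interval(conv_c=1500, n_c=15000,
                                       conv_t=1680, n_t=15000)
print(f"lift = {diff:.3%} (95% CI {low:.3%} to {high:.3%})")
```

Whether a lift of that size is worth shipping is a business judgement. The interval only says how precisely it was measured.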

Key reading · Uber Engineering + Snow (1854)

The power of randomised experiments.

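Snow's comparison of cholera deaths among households served by two London water companies during the 1854 epidemic, carried out alongside his Broad Street pump investigation, is an early and much-cited use of quasi-random assignment to argue causation in public health. Every modern A/B testing stack is a digital re-run of the same logic.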

Randomise or you will never know.
