A/B test gotchas
Correlation does not imply causation
- Even a low p-value from a well-designed experiment does not imply causation!
- It could still be random chance (see the simulation sketch after this list)
- Other factors could be at play
- It's your duty to ensure business owners understand this
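To see how easily "random chance" can produce a convincing-looking result, here is a minimal simulation sketch (assuming numpy and scipy are available): both groups are drawn from the same distribution, yet roughly 5% of runs still come out "significant" at the usual 0.05 threshold.

```python
# Minimal sketch: even with NO real difference between groups, about 5% of
# experiments will show p < 0.05 purely by chance at a 0.05 threshold.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

n_experiments = 1000
false_positives = 0
for _ in range(n_experiments):
    # Both "A" and "B" are drawn from the SAME distribution -- an A/A comparison.
    a = rng.normal(loc=25.0, scale=5.0, size=10_000)
    b = rng.normal(loc=25.0, scale=5.0, size=10_000)
    _, p_value = stats.ttest_ind(a, b)
    if p_value < 0.05:
        false_positives += 1

print(f"'Significant' results with no real effect: {false_positives} / {n_experiments}")
# Expect something close to 50 (~5%) -- a low p-value can still be chance.
```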
Novelty effects
- Changes to a website will catch the attention of returning users who are accustomed to the way it used to be
- They might click on something simply because it's new
- But this attention won't last forever
- Good idea to re-run experiments much later and validate their impact
- Often the "old" design ends up performing as well as or better than the new one after a while, because the new one's initial boost came simply from being a change (see the day-by-day lift sketch below)
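One way to check for a fading novelty effect (a sketch of a diagnostic, not something from the original notes) is to compute the lift of B over A day by day instead of as a single aggregate. The DataFrame and column names (`date`, `variant`, `converted`) are hypothetical.

```python
# Hedged sketch: daily lift of B over A. A lift that decays toward zero over
# the life of the experiment suggests a novelty effect rather than a real win.
# Assumes a pandas DataFrame `df` with columns 'date', 'variant' ('A'/'B'),
# and 'converted' (0/1) -- illustrative names only.
import pandas as pd

def daily_lift(df: pd.DataFrame) -> pd.Series:
    # Conversion rate per day per variant.
    rates = df.groupby(["date", "variant"])["converted"].mean().unstack("variant")
    # Relative lift of B over A for each day.
    return (rates["B"] - rates["A"]) / rates["A"]

# lift = daily_lift(df)
# lift.plot(title="Daily lift of B over A")  # decay toward zero = likely novelty effect
```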
Seasonal effects
- An experiment run over a short period of time may only be valid for that period of time
- Example: Consumer behaviour near Christmas is very different than at other times of year
- An experiment run near Christmas may not be representative of behaviour during the rest of the year
Selection Bias
- Sometimes your random selection of customers for A or B isn't really random
- For example: assignment is somehow based on customer ID
- But customers with low IDs may be longer-standing, better customers than ones with high IDs
- Run an A/A test periodically to check (sketched below)
- Audit your segment assignment algorithms
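A minimal sketch of that A/A check, assuming a hash-based assignment function and a simple chi-square test; `assign_variant` and `aa_test` are illustrative names, not an existing API.

```python
# A/A check sketch: run your real assignment logic, give BOTH groups the same
# experience, then verify the conversion rates don't differ significantly.
import hashlib
from scipy.stats import chi2_contingency

def assign_variant(customer_id: int, experiment: str = "aa_check") -> str:
    # Hash-based assignment avoids ordering artifacts in raw customer IDs
    # (e.g., low IDs belonging to long-tenured customers).
    digest = hashlib.sha256(f"{experiment}:{customer_id}".encode()).hexdigest()
    return "A" if int(digest, 16) % 2 == 0 else "B"

def aa_test(conversions_a: int, total_a: int,
            conversions_b: int, total_b: int) -> float:
    # Chi-square test on the 2x2 table of converted / not-converted per group.
    table = [
        [conversions_a, total_a - conversions_a],
        [conversions_b, total_b - conversions_b],
    ]
    _, p_value, _, _ = chi2_contingency(table)
    return p_value

# If the A/A p-value is consistently tiny, suspect the assignment or logging,
# not the customers.
```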
Data Pollution
- Are robots (both self-identified and malicious) affecting your experiment?
- Good reason to measure conversion based on something that requires spending real money
- More generally, are outliers skewing the result? (see the filtering sketch below)
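A hedged sketch of two simple filters along those lines; the thresholds, column names, and pandas DataFrame layout are assumptions for illustration.

```python
# Sketch: keep robots and outliers from skewing a revenue-per-user comparison.
import numpy as np
import pandas as pd

def remove_extreme_spenders(df: pd.DataFrame, pct: float = 99.5) -> pd.DataFrame:
    # Drop users whose spend is above the given percentile -- one huge order
    # can flip the "winner" of an experiment on its own.
    cutoff = np.percentile(df["revenue"], pct)
    return df[df["revenue"] <= cutoff]

def remove_likely_bots(df: pd.DataFrame, max_requests_per_minute: int = 60) -> pd.DataFrame:
    # Sessions with inhuman request rates are probably crawlers, not customers.
    return df[df["requests_per_minute"] <= max_requests_per_minute]
```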
Attribution Errors
- Often there are errors in how conversion is attributed to an experiment
- Using a widely used A/B test platform can help mitigate that risk
- If yours is home-grown, it deserves auditing
- Watch for "gray areas"
- Are you counting purchases toward an experiment within some given time window after exposure to it? Is that window too long? (a sketch of such an attribution window follows this list)
- Could other changes downstream from the change you're measuring affect your results?
- Are you running multiple experiments at once?
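A sketch of one conservative attribution rule for the time-window question above: credit a purchase to the experiment only if it happens within a fixed window of the user's first exposure. The window length and function names are illustrative assumptions.

```python
# Sketch: attribute a purchase to the experiment only if it occurs within
# `window` of the user's FIRST exposure to the experiment.
from datetime import datetime, timedelta

ATTRIBUTION_WINDOW = timedelta(days=7)

def attribute_purchase(first_exposure: datetime,
                       purchase_time: datetime,
                       window: timedelta = ATTRIBUTION_WINDOW) -> bool:
    # Too long a window starts crediting the experiment for purchases the
    # customer would have made anyway.
    return first_exposure <= purchase_time <= first_exposure + window

# attribute_purchase(datetime(2024, 12, 1), datetime(2024, 12, 5))   -> True
# attribute_purchase(datetime(2024, 12, 1), datetime(2024, 12, 20))  -> False
```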