Github repository here

Introduction

The goal is to explain how a Bayesian approach to A/B testing lets us make a stronger statement by calculating the probability distribution of the lift percentage after conducting an A/B test.

Experiment Definition

In an AB Test, we split our users evenly into:

    Control: They get the old webpage
    Treatment: They get the new webpage

The metric we want to track is conversion. We have 3 weeks of logged exposure/conversion data. Let's define these terms:

  • Exposure: A user is bucketed as control or treatment and sees their corresponding page for the first time in the experiment duration
  • Conversion: An exposed user makes a purchase within 7 days of being first exposed
Questions you should ask when setting up a test:

  • How do you think the experiment will fare?
  • Do we have actionable next steps laid out?

Data Collection

Let's use some A/B testing data: Kaggle Dataset

Each row is logged when a user is first exposed to a webpage:
  • timestamp: time the user is first exposed
  • group: bucket
  • landing_page: Which page they are seeing
  • converted: Initialized to 0. Changes to 1 if the user makes a purchase within 7 days of first exposure
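A common first step with this dataset is to drop rows where the bucket and the page served disagree, since those exposures were mislogged. Below is a minimal sketch using a hypothetical inline sample that mirrors the Kaggle schema (the timestamps and counts are made up for illustration):

```python
import pandas as pd

# Hypothetical sample rows matching the columns listed above.
rows = [
    ("2017-01-02 13:42:05", "control",   "old_page", 0),
    ("2017-01-02 14:01:11", "treatment", "new_page", 1),
    ("2017-01-03 09:15:30", "control",   "new_page", 0),  # mismatched bucket
    ("2017-01-03 10:20:45", "treatment", "old_page", 0),  # mismatched bucket
]
df = pd.DataFrame(rows, columns=["timestamp", "group", "landing_page", "converted"])

# Keep only rows where the bucket matches the page actually served.
clean = df[
    ((df["group"] == "control") & (df["landing_page"] == "old_page"))
    | ((df["group"] == "treatment") & (df["landing_page"] == "new_page"))
]
print(len(clean))  # 2
```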

Frequentist Approach vs Bayesian Comparison

Beta Distribution Explained

The Beta distribution is best explained in this article: Beta Distribution Blog. The same logic is used in the code when updating the prior parameters with the experiment's conversion rates.

Advantages of Bayesian over Frequentist

  • Results are more interpretable than the ones we got from the frequentist approach
  • We can interpret results at any point during the experiment; we don't need to wait until an arbitrary statistical-significance ("statsig") threshold is reached
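The probability-of-lift statement from the introduction can be sketched by sampling from each arm's posterior and comparing draws. The conversion counts and Beta(1, 1) priors below are hypothetical, chosen only to illustrate the mechanics:

```python
import numpy as np

rng = np.random.default_rng(42)

# Posteriors from hypothetical counts: control 110/1000, treatment 130/1000,
# each with a uniform Beta(1, 1) prior.
a_c, b_c = 1 + 110, 1 + 1000 - 110
a_t, b_t = 1 + 130, 1 + 1000 - 130

n = 100_000
control = rng.beta(a_c, b_c, size=n)
treatment = rng.beta(a_t, b_t, size=n)

lift = (treatment - control) / control   # relative lift per draw
prob_better = (lift > 0).mean()          # P(treatment beats control)
print(round(prob_better, 3))
```

This is the kind of directly interpretable statement the frequentist p-value does not give: the fraction of posterior draws where the treatment's conversion rate exceeds the control's.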