Github repository here

Introduction

The goal is to explain how a Bayesian approach to A/B testing lets us make a stronger statement by calculating the probability distribution of the lift percentage after conducting an A/B test.

Experiment Definition

In an AB Test, we split our users evenly into:

    Control: They get the old webpage
    Treatment: They get the new webpage

The metric we want to track is conversion. We have 3 weeks of logged exposure/conversion data. Let's define these terms:

  • Exposure: A user is bucketed as control or treatment and sees their corresponding page for the first time in the experiment duration
  • Conversion: An exposed user makes a purchase within 7 days of being first exposed
Questions you should ask when setting up a test:

  • How do you think the experiment will fare?
  • Do we have actionable next steps laid out?

Data Collection

Let's use some A/B testing data: Kaggle Dataset

Each row is logged when a user is first exposed to a webpage:
  • timestamp: time the user is first exposed
  • group: bucket
  • landing_page: Which page they are seeing
  • converted: Initialized to 0. Changes to 1 if the user makes a purchase within 7 days of first exposure
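A common first step with this dataset is to drop rows where the bucket and the page served disagree, since those exposures were mislogged. Below is a minimal sketch using a hypothetical inline sample that mirrors the Kaggle schema (the timestamps and counts are made up for illustration):

```python
import pandas as pd

# Hypothetical sample rows matching the columns listed above.
rows = [
    ("2017-01-02 13:42:05", "control",   "old_page", 0),
    ("2017-01-02 14:01:11", "treatment", "new_page", 1),
    ("2017-01-03 09:15:30", "control",   "new_page", 0),  # mismatched bucket
    ("2017-01-03 10:20:45", "treatment", "old_page", 0),  # mismatched bucket
]
df = pd.DataFrame(rows, columns=["timestamp", "group", "landing_page", "converted"])

# Keep only rows where the bucket matches the page actually served.
clean = df[
    ((df["group"] == "control") & (df["landing_page"] == "old_page"))
    | ((df["group"] == "treatment") & (df["landing_page"] == "new_page"))
]
print(len(clean))  # 2
```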

Frequentist Approach vs Bayesian Comparison

Beta Distribution Explained

The Beta distribution is best explained in this article: Beta Distribution Blog. The same logic is used in the code when updating the prior parameters with the experiment's conversion rates.

Advantages of Bayesian over Frequentist

  • Results are more interpretable than the ones we got from the frequentist approach
  • We can interpret results at any point during the experiment; we don't need to wait until an arbitrary statistical-significance ("statsig") threshold is reached
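The probability-of-lift statement from the introduction can be sketched by sampling from each arm's posterior and comparing draws. The conversion counts and Beta(1, 1) priors below are hypothetical, chosen only to illustrate the mechanics:

```python
import numpy as np

rng = np.random.default_rng(42)

# Posteriors from hypothetical counts: control 110/1000, treatment 130/1000,
# each with a uniform Beta(1, 1) prior.
a_c, b_c = 1 + 110, 1 + 1000 - 110
a_t, b_t = 1 + 130, 1 + 1000 - 130

n = 100_000
control = rng.beta(a_c, b_c, size=n)
treatment = rng.beta(a_t, b_t, size=n)

lift = (treatment - control) / control   # relative lift per draw
prob_better = (lift > 0).mean()          # P(treatment beats control)
print(round(prob_better, 3))
```

This is the kind of directly interpretable statement the frequentist p-value does not give: the fraction of posterior draws where the treatment's conversion rate exceeds the control's.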