Home > Analytics, Best Practices, Excel > How to create a random A/B/n split in Excel

How to create a random A/B/n split in Excel

I took over a test earlier this summer and one of the tasks I had to do was to validate the A/B/C split. I had to evaluate if my counterpart that proposed the original split had developed a random split with little to no bias between the data sets. In this post I’ll provide you with some quick and easy formulas you can use to automate the a/b/n split process as well as how to create a table that will help you know if there is bias in the split.

RANDBETWEEN Function

The RandBetween function in excel allows you to generate a random number between two variables. If you are doing a simple A/B split, you can set the fuction to =randbetween(0,1) and Excel will automatically divide half of of the rows of data to 0 and the other half to 1. If you need a three way split you can use the function =randbetween(0,2) down all rows of data you want split and you’ll see a 33% between 0, 1, and 2.

A few things to note with this function:

  1. The random split will automatically reset everytime you press enter within the workbook or make any updates. If you find a split you like, copy and paste special the variables into a new column so you don’t loose your split.
  2. The random element of the split does not guarantee that your split is unbiased. You still need to validate the split by creating a report that compares A/B/n to make sure that the variance is within your tolerance range. I’ll share a quick and simple example of how I did this so that I could hit enter until I found a split that appeared to be non-biased so I could do further statistical analysis.
  3. The randbetween function doesn’t allow you to develop a stratified split. If you need to develop a stratified split then you’ll have to embed the randbetween within an IF function and you’ll want to validate that both the split is random and valid and that the stratified variables are also evenly split between the test and control.
    What do I mean by stratified split? Check out this great article about Stratified Random Sampling by BetterEvaluation.org
    1. Evaluating the A/B/n Split for bias

      Everytime that you create a split you want to evaluate the split to see if there is any bias in the data sets. I discovered that I could use SUMIFS and COUNTIFs functions in XLS to create a table that would let me look at the output of the split to determine if the random sample appeared to be unbiased so I could then do my statistical analysis on the data set.

      Every time you enter a value into any cell on the tab that is calculating the SumIFs/CountIFs excel will randomly reassign 0,1 to the rows, giving a new potential split. Continue entering a value into a cell until the split is within your tolerance range. I try to keep my primary KPI variables of the test group within 0-5% variance to the control. A quick and easy way to visually tell if you’re split is less than 5% is to set conditional formatting for the cell background to “Green”,”Yellow”, “Red” based on the variance of test to control. Here is an example of what the A/B split table might look like and the formulas you would use to create it.


      Example of the table with the CountIF/SumIF formulas shown and conditional formatting set:
      Best-AB-Split-Formulas


      Make sure you know what your test KPIs are because it can be difficult to find a random split where all metrics are within 0-5% variance between the groups. In this example I wanted to run a test to improve my Revenue and TXN volume, so the metrics I care the most about are Revenue, Transaction, and Return on Ad Spend. My goal is to find a split where these metrics are as close as possible, and then control paste values the split before I accidentally reset it by hitting the wrong button/cell. The screenshots below are a few examples of me hitting enter multiple times to go from an okay split, to a bad split, to a better split, to the best split.

      Split 1: Revenue variance is to large. I might copy and paste values for this split into a separate column to the far right just in case after 20-30 splits I can’t get a better combination, but since Revenue is one of my evaluation KPIs for the success of my test I should keep on looking.
      Good-AB-Split

      Split 3:Bad. Notice the large variances between Test & Control metrics
      Bad-AB-Split


      Split 8: Still too much variance in Transactions
      GettingCloser-AB-Split

      Split 16: Best Split! Copy and Paste Values into a new column NOW. Notice in this split the variance in all of the core KPIs I’m going to use for evaluating my test performance appear to be within 5% of each other. Use this split for the statistical analysis.
      Best-AB-Split


      I’ve mentioned multiple times to do a separate statistical analysis post split. I’m not a stats guru so I work with a much smarter individual and team than myself, but they run multiple statistical significance calculations to determine if the split is valid. There are times when that amazing split that we saw at the end when analyzed to the mean and STD DEV are actually not valid the means are too far apart and so we have to resplit again until the means are very close together.


      I know that this won’t answer the question, “How do I determine if my split is valid?” or “How do I do the statistical calculations?” That’s not my area of expertise so I’ll try to find some links to post to help us all learn!

      Advertisements
Categories: Analytics, Best Practices, Excel Tags: ,
  1. No comments yet.
  1. No trackbacks yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: