Archive for the ‘Excel’ Category

How to create a random A/B/n split in Excel

October 9, 2013 Leave a comment

I took over a test earlier this summer and one of the tasks I had to do was to validate the A/B/C split. I had to evaluate if my counterpart that proposed the original split had developed a random split with little to no bias between the data sets. In this post I’ll provide you with some quick and easy formulas you can use to automate the a/b/n split process as well as how to create a table that will help you know if there is bias in the split.


The RandBetween function in excel allows you to generate a random number between two variables. If you are doing a simple A/B split, you can set the fuction to =randbetween(0,1) and Excel will automatically divide half of of the rows of data to 0 and the other half to 1. If you need a three way split you can use the function =randbetween(0,2) down all rows of data you want split and you’ll see a 33% between 0, 1, and 2.

A few things to note with this function:

  1. The random split will automatically reset everytime you press enter within the workbook or make any updates. If you find a split you like, copy and paste special the variables into a new column so you don’t loose your split.
  2. The random element of the split does not guarantee that your split is unbiased. You still need to validate the split by creating a report that compares A/B/n to make sure that the variance is within your tolerance range. I’ll share a quick and simple example of how I did this so that I could hit enter until I found a split that appeared to be non-biased so I could do further statistical analysis.
  3. The randbetween function doesn’t allow you to develop a stratified split. If you need to develop a stratified split then you’ll have to embed the randbetween within an IF function and you’ll want to validate that both the split is random and valid and that the stratified variables are also evenly split between the test and control.
    What do I mean by stratified split? Check out this great article about Stratified Random Sampling by
    1. Evaluating the A/B/n Split for bias

      Everytime that you create a split you want to evaluate the split to see if there is any bias in the data sets. I discovered that I could use SUMIFS and COUNTIFs functions in XLS to create a table that would let me look at the output of the split to determine if the random sample appeared to be unbiased so I could then do my statistical analysis on the data set.

      Every time you enter a value into any cell on the tab that is calculating the SumIFs/CountIFs excel will randomly reassign 0,1 to the rows, giving a new potential split. Continue entering a value into a cell until the split is within your tolerance range. I try to keep my primary KPI variables of the test group within 0-5% variance to the control. A quick and easy way to visually tell if you’re split is less than 5% is to set conditional formatting for the cell background to “Green”,”Yellow”, “Red” based on the variance of test to control. Here is an example of what the A/B split table might look like and the formulas you would use to create it.

      Example of the table with the CountIF/SumIF formulas shown and conditional formatting set:

      Make sure you know what your test KPIs are because it can be difficult to find a random split where all metrics are within 0-5% variance between the groups. In this example I wanted to run a test to improve my Revenue and TXN volume, so the metrics I care the most about are Revenue, Transaction, and Return on Ad Spend. My goal is to find a split where these metrics are as close as possible, and then control paste values the split before I accidentally reset it by hitting the wrong button/cell. The screenshots below are a few examples of me hitting enter multiple times to go from an okay split, to a bad split, to a better split, to the best split.

      Split 1: Revenue variance is to large. I might copy and paste values for this split into a separate column to the far right just in case after 20-30 splits I can’t get a better combination, but since Revenue is one of my evaluation KPIs for the success of my test I should keep on looking.

      Split 3:Bad. Notice the large variances between Test & Control metrics

      Split 8: Still too much variance in Transactions

      Split 16: Best Split! Copy and Paste Values into a new column NOW. Notice in this split the variance in all of the core KPIs I’m going to use for evaluating my test performance appear to be within 5% of each other. Use this split for the statistical analysis.

      I’ve mentioned multiple times to do a separate statistical analysis post split. I’m not a stats guru so I work with a much smarter individual and team than myself, but they run multiple statistical significance calculations to determine if the split is valid. There are times when that amazing split that we saw at the end when analyzed to the mean and STD DEV are actually not valid the means are too far apart and so we have to resplit again until the means are very close together.

      I know that this won’t answer the question, “How do I determine if my split is valid?” or “How do I do the statistical calculations?” That’s not my area of expertise so I’ll try to find some links to post to help us all learn!

Categories: Analytics, Best Practices, Excel Tags: ,

Using SumIfs and CountIfs in Excel

September 5, 2013 1 comment

In the last two quarters I’ve been increasing the amount of functions I use on an on-going basis. The functions that I’ve fallen in love with recently that I’ve been using as much as if not more than Pivot Tables with Vlookups are “SumIF” and “CountIF” and their respectable “s” versions. Why do I love them? Because they can be just as effective for creating a high level report that would have previously required me to use a pivot table followed by a vlookup.

What is SumIf and CountIf

First off what are they? They are exactly what the sound like… It’s a either a count or sum of an array based on the array meeting some set of criteria. For example, I might want to know how many keywords I have in my Google Adwords account? (CountIF) or How many clicks all of my Mobile Targeted campaigns have received (SumIF.)

What is SumIFs and CountIFs

The plural form means that the sum or count only happens within the rows of the array when it meets multiple set of criteria. For example as a marketing manager in paid search I might want to know the count keywords on Google that are active (CountIFs – Counting Keywords based on two variables: SearchEngine=Google & Status=Active).
I might also want to know the total cost of all keywords on Google with Mobile Targeting (SumIFs)

How to write the SumIF OR CountIF Function

SumIFs and CountIFs are very easy formulas to write, if you have issues remembering the order of the criteria vs sum array look at the hint in the formula bar. Below is a sample set of data along with a CountIF function in action.
=CountIF(Range, Criteria) =CountIF(What do you want counted, What is the variable that should be counted)

=SumIF(Range, Critiera, [Sum Range]) =SumIf(Where to look for the variable, What is the variable, [Set of data if criteria are met that you want Summed]

Example of writing a CountIf Function in XLS

Taking it one step further as I mentioned above you when explaining the SumIFs and CountIFs variables you can create a table that looks up a preset data format and have it return variables based on multiple sets of criteria. For me this is very helpful with weekly reporting because I get my performance data in the same format each week. I Can essentially create a preformatted table where my data will be delivered and have the COUNTIFS/SUMIFS provide my high level summary of information on a “Performance Dashboard” that then would get distributed to my manager or leadership team.

Here is an example of using CountIFs and SUMIFs in an excel workbook:

I hope that you find these functions as useful as I find them and that you can save a little bit of time in your work day or work week by adding these simple formulas into your arsenal of XLS awesomeness!

Categories: Excel

Understanding Encoded URLs

The last post I wrote was explaining how to encode or unencode a URL using a simple replace function in Excel, and in this post I want to explain “why” URLs need to be encoded in the first place. I also want to say that I’m not the most technical marketer out there — so this is my understanding behind when and why we encode our URLs.

Why do some characters or symbols need to be encoded?

URLs need their characters to be in a specific format, ASCII characters, so that browsers can correctly interpret and parse the data. There is a set of characters that have special meaning, ? = & # % =, these characters need to encoded if they are being used for any purpose other than what they are supposed to be used for.

I’ve been dealing with double and triple encoding of URLs because my marketing campaigns use one for attribution tracking across marketing channels, one for the historical bid optimization platform, and a third redirect from the new bid optimization platform that we are migrating toward. In most cases you shouldn’t use three back to back redirects because it leaves a lot of room for breaking between the initial redirect and the destination page loading. I’m having to do multiple levels of encoding so I don’t loose the tracking codes that are being passed and appended to the destination URL.

The examples I’m using below in this post are combinations of the Adlucent, Kenshoo and Mediaplex redirect URLs.

What does a redirect URL look like?

An unencoded URL with two redirects, one for the paid search marketing agency (in this example: ThirdPartyCompany) and one for an attribution vendor/third party software to cookie all marketing traffic (in this example: Other.TrackingPixel) might look something like this:{creative}&retailer=companyname&url=[ProdCategory]-[MatchType]-{creative}

In the URL above you would want to encode all characters and symbols starting with the second ? within the URL. The encoding would look like this:{creative}&retailer=companyname&url=

If I were to triple encode a URL, i.e. add a third redirect, it would still need to happen starting at the second ? within the URL set, but it also means that the % symbol for the first round of encoded URLs would be encoded to %25. So a double encoded ampersand (&) would be a 4 digit numerical string represented as %2524. %24 is the ampersand, but then you’d have to encode the % which is %25, add it together and you get %2425.

So have I completely lost you yet? If yes, my apologies. Please ask questions!

Below are a few good references for the purposes of characters in a URL string and the common ASCII encoded values for core characters/symbols.

So what is the Purpose of a specific character in the URL?

Character Purpose in URL Encoding

: Separate protocol (http) from address %3B
/ Separate domain and directories %2F
# Separate anchors %23
? Separate query string %3F
& Separate query elements %24
@ Separate username and password from domain %40
% Indicates an encoded character %25
+ Indicates a space %2B
~ Indicates a user directory %&E

ASCII Character URL-encoding

space %20
! %21
” %22
# %23
$ %24
% %25
& %26
‘ %27
( %28
) %29
* %2A
+ %2B
, %2C
– %2D
. %2E
/ %2F
: %3A
; %3B
? %3F
@ %40
[ %5B
\ %5C
] %5D
^ %5E
_ %5F
` %60
{ %7B
} %7D

Creating an Excel Formula to Encode or Unencode URLS

March 27, 2013 8 comments

Today I’ve been playing with uncoding paid search URLs that contain multiple redirects from vendor partners for tracking purposes. Since my xls document contains more than 500,000+ rows of data using the find and replace functions is cumbersome and caused excel to freeze. It was time for me to use functions within excel to code what I needed to do with find and replace.

Using the Substitute Function to encode or unencode a URL in Excel

To get the project accomplished on time I embedded multiple substitute functions within each other to essentially replicate find a replace.

The substitute function looks for specified text/value within a cell and will replace with another specified text/value. You can have it do a specific number of times or for all instances within a cell.

=substitute(text, old_text, new_text, [number_of_instances])

=substitute(CELL_with_URL, “what you want removed”, “what you want to replace it with”)

Notice that I didn’t include the # of instances…that means do it for every instance of “what you want removed” within the cell. If you only wanted to remove the first instance of something in the URL do it like this:

=substitute(CELL_with_URL, “what you want removed”, “what you want to replace it with”,1)

In case you ever find yourself in the same boat and need to either encode a URL or decode a URL you can use the following snippets of code within excel.
Please note that you most likely will need to modify this so that it fits whatever characters you are attempting to encode or unencode. 

Excel Formula to Unencode a URL

Take this formula and place it into the cell to the right of whatever needs to be unencoded: 



This will only unencode the following characters: space {}[]&%=

 If there are additional characters that are encoded then they’ll need to be added to the formula. To add additional characters you’ll want to include an additional substitute( to the left of the equals sign and at the end of the string add the additional data that needs to be unencoded. Example to include a $ symbol you would add ,”%24″,”$”  before the final closing parenthesis.

Excel Formula to Encode a URL

 Likewise to encode a URL: Go to the cell to the right of the column that needs to be encoded and use the following:

=SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(CELL_WITH_URL, “%”,”%25″),” “,”%20″),”?”, “%3F”),”&”,”%26″),”=”,”%3D”),”{“,”%7B”),”}”,”%7D”),”[“,”%5B”),”]”,”%5D”)