The Art of A/B Testing Review

Moving into the third week of CXL’s Growth Marketing minidegree program, we got a deep dive into the Art of A/B testing.

This was clearly not a week for the faint-hearted. It was quite challenging for those with no prior experience in statistics, or who aren't huge fans of numbers and calculations. This is Growth Marketing though, and A/B testing is an integral part of its very essence, so it can't be overlooked.

From a personal standpoint, thanks to my experience with A/B testing I was able to grasp the how-to without much trouble. However, the real challenge came as more complex theories and advanced practices entered the picture. Remember that the name of the course was A/B Testing Mastery, so the instructor made sure that we either turned into A/B masters or we'd better stay home.

Speaking of the instructor, Ton Wesseling is definitely a true A/B genius. By splitting the course into modules that follow a logical and practical order, he makes it easy for students to follow his lead.

Starting with what A/B testing is and the history of it all, he moves students from 1995 to today, through the important dates that marked the development of this field. There was a time when marketing professionals used logfile analysis, comparing weeks to find differences in sales and customer behaviour.

Then in 2000 they were introduced to redirect scripts, moving on to 2003 when professionals started to test using cookies. 2006 marked the genesis of Google Website Optimizer, and the rest is history. Today, as you may know, it's all about AI, personalization and segmentation when A/B testing.

Stepping away from historical facts, the modules that followed were about A/B planning, execution and results, with additional bonus information at the very end of the course. Thanks to his great presentation skills, hands-on examples and the tools he demonstrated, Mr. Wesseling used different techniques to support his online audience.

When do you use A/B Testing?

Kicking off, Mr. Wesseling highlighted the importance of putting your ideas to the test, making clear that there are three main reasons you will need to A/B test:

  • To Deploy - Real deployment - Looking for no negative signals
  • To Research - Looking for signals on impact to understand what needs to be optimised
  • To Optimise - Lean deployment - Looking for wins


Do you have enough data to conduct A/B tests?

When can you run experiments, and how many?

Data is King/Queen.

If you don't have enough data, it doesn't mean that you are unable to test, but your results are not going to be optimal.

This is why the ROAR model is essential at this point.

Will the results of your experiment produce the desired impact on your business?

Is it worth the risk of optimising and putting new optimisation practices in place?

ROAR model

Risk - Optimisation - Automation - Rethink

Therefore, companies need to identify the right data, specify the KPIs they are going to test, monitor the testing process and optimise accordingly. Goals that lead to increased conversion rate or revenue are among the most used for A/B testing, while clicks and sessions are mostly considered of secondary importance, as they are not directly linked to lifetime value.

  • Define using short term metrics that predict long term value
  • Think about customer lifetime value, not immediate revenue 
  • Look for success indicators, not vanity metrics

Statistical Power

Statistical power is the likelihood that an experiment will detect an effect when there is an effect to be detected. In other words, if your challenger genuinely makes a difference, power is the probability that your test will actually pick it up.

It depends on: sample size, effect size, and significance level.

When conducting experiments, statistical power and significance are rules that can't be skipped.

Always Remember!

Aim for a statistical significance above 90% to be safe, with statistical power over 80%.
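To see what these thresholds mean in practice, here is a minimal sketch (Python standard library only) of the usual normal-approximation sample-size formula for comparing two conversion rates. The 5% and 6% rates are made-up illustration numbers, not figures from the course:

```python
from statistics import NormalDist

def sample_size_per_variant(p_control, p_challenger, alpha=0.10, power=0.80):
    """Approximate visitors needed per variant for a two-sided
    two-proportion z-test (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # alpha=0.10 -> 90% significance
    z_beta = z.inv_cdf(power)            # power=0.80 -> 80% power
    p_bar = (p_control + p_challenger) / 2
    effect = abs(p_challenger - p_control)
    numerator = (z_alpha * (2 * p_bar * (1 - p_bar)) ** 0.5
                 + z_beta * (p_control * (1 - p_control)
                             + p_challenger * (1 - p_challenger)) ** 0.5) ** 2
    return int(numerator / effect ** 2) + 1

# Detecting a lift from 5% to 6% conversion at 90% significance / 80% power
print(sample_size_per_variant(0.05, 0.06))
```

Notice how quickly the required sample grows as the effect you want to detect shrinks; this is exactly why low-traffic sites struggle to run trustworthy tests.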

By making the wrong calculations, you run the risk of infected results. By infected, I mean false positives and false negatives.

Think of these terms as the two main players of the A/B game. When testing, there are instances where you might believe that one test version is the winner, with the other the loser.

However, if you haven't set the right parameters you might get tricked. Therefore, at times it is wise to stick with the control (original) and not the challenger (test version) until you make sure that the second can actually overthrow the first.
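To check whether the challenger can actually "overthrow" the control, a two-proportion z-test is the classic tool. A minimal sketch using only the standard library; the conversion counts below are made up for illustration:

```python
from statistics import NormalDist

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test: does the challenger's conversion rate
    differ from the control's? Returns (z, p_value)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)          # pooled rate under H0
    se = (p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b)) ** 0.5
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Control: 500/10,000 (5.0%) vs challenger: 570/10,000 (5.7%)
z, p = two_proportion_z_test(500, 10000, 570, 10000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

At the 90% significance threshold mentioned above this made-up challenger would qualify as a winner; at a stricter 99% threshold it would not, which shows how the parameters you set decide who "wins".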

But how can you get the right hypothesis before A/B testing?

The 6V Research Model

Values, Versus, View, Voice, Verified, Validated

A/B professionals think and use these principles as triggers.

These triggers, during the research and observation process, help them form questions that progressively turn into effective hypotheses. These hypotheses should be testable so that they yield a prediction about the outcome. Then, of course, comes the development stage. After the test ends, you can work on the outcomes to draw conclusions, attempt generalizations and build theories.

VALUES

  • WHATS YOUR MISSION?
  • WHATS YOUR STRATEGY?
  • SHORT-LONG TERM GOALS?

VERSUS

COMPETITOR ANALYSIS - WHO ARE MY COMPETITORS?

VIEW

  • Website Analytics - Where do users start? Where do they come from?
  • Landing pages - Where do they enter the website?
  • Traffic sources?
  • Customer journeys?
  • Measure scroll - Heatmaps 

VOICE

  • Talk to customer service
  • Look at social media feedback
  • Surveys - ask for feedback online 
  • Use Groups - Interviews - User Research
  • Check chat logs

VERIFIED

  • What do we know from scientific literature?
  • About the decision-making process?
  • About the product sold?

VALIDATED

  • Insights of previous tests - What have you tested already?
  • Any validated hypotheses?

Once the above questions are answered and the additional research activities are performed, it is time to proceed with the execution. Set the hypothesis first, use prioritisation methods to select the best hypothesis and move on with testing. An A/B testing hypothesis formula can be something like the following:

If I make change X on my website, it will affect users in way Y, because of Z.

It is a cause-and-effect relationship. There is no point in A/B testing and optimisation if what you are testing doesn't create a positive impact. This is why every hypothesis should be based on data and a chosen psychological approach that justifies your choice.

How to prioritise your hypothesis?

PIE (Potential, Importance, Ease) and ICE (Impact, Confidence, Effort) models are the norm, but their evolution, dubbed PIPE, is an even more up-to-date form that includes statistical power. Then you just need to find the right location to test the hypothesis that won the PIPE competition.
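As an illustration of how such scoring works, here is a minimal ICE-style sketch in Python. The hypothesis names and 1-10 scores are invented, and the third factor is scored as ease (higher = less effort) so that a bigger score is always better:

```python
def ice_score(hypothesis):
    """ICE-style score: average of Impact, Confidence and Ease (1-10 each)."""
    return (hypothesis["impact"] + hypothesis["confidence"] + hypothesis["ease"]) / 3

# Invented backlog of test ideas
hypotheses = [
    {"name": "Shorter checkout form", "impact": 8, "confidence": 6, "ease": 7},
    {"name": "New hero headline",     "impact": 5, "confidence": 7, "ease": 8},
    {"name": "Exit-intent popup",     "impact": 6, "confidence": 4, "ease": 8},
]

# Rank the backlog: highest score gets tested first
for h in sorted(hypotheses, key=ice_score, reverse=True):
    print(f"{h['name']}: {ice_score(h):.1f}")
```

PIPE would extend this by also scoring whether the target page has enough traffic to reach statistical power, so a high-impact idea on a low-traffic page drops down the list.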

Once you have prioritised and selected your best hypotheses using the frameworks above, and before starting testing, remember the statistical power. Guarantee that there won't be any type-M errors.* Proceed by setting your experiment roadmap and scheduling your experiments. Also be sure to have one strong challenger with enough changes to justify the data, keep your budget in check, and always mirror the design with the hypothesis.

*A Type M error is an error of magnitude. I make a Type M error by claiming with confidence that theta is small in magnitude when it is in fact large, or by claiming with confidence that theta is large in magnitude when it is in fact small.

Configuration and Going Live

Once you have designed your experiment, move on to configure it in the testing tool, avoiding Simpson's paradox by not making any changes to the traffic distribution mid-test. In your analytics tool, you will also monitor the experiment, since there may be a need for extra measurements.

Your test should also have a specific length. Most of the time, full business cycles are the norm, but if that's not possible, be sure to measure full weeks of experimentation, not just days. If you see immediate results and find a winner, don't decide to stop the test early. You may have a false positive, which is not a true winner.
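The danger of stopping at the first "winner" can be demonstrated with a small simulation: run A/A tests (where no real difference exists) and peek repeatedly, stopping as soon as a peek looks significant. The false positive rate climbs well above the nominal 5%. All parameters below are invented for illustration:

```python
import random
from statistics import NormalDist

def peeking_false_positive_rate(n_sims=500, n_users=5000, peeks=10,
                                base_rate=0.05, alpha=0.05, seed=42):
    """Simulate A/A tests (both arms share the same true conversion rate)
    and stop at the first 'significant' peek; return the fraction of
    simulations wrongly declared a winner."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    false_positives = 0
    for _ in range(n_sims):
        conv_a = conv_b = 0
        step = n_users // peeks
        for peek in range(1, peeks + 1):
            for _ in range(step):                     # new traffic since last peek
                conv_a += rng.random() < base_rate
                conv_b += rng.random() < base_rate
            n = peek * step
            p_pool = (conv_a + conv_b) / (2 * n)
            se = (p_pool * (1 - p_pool) * 2 / n) ** 0.5
            if se and abs(conv_b / n - conv_a / n) / se > z_crit:
                false_positives += 1                  # declared a (false) winner
                break
    return false_positives / n_sims

print(peeking_false_positive_rate(n_sims=300))
```

Even though each individual peek uses a 5% significance level, taking ten looks typically pushes the overall false positive rate to roughly three or four times that, which is exactly why a test's length should be fixed in advance.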

After the test is done, you should work on solidifying your results and presenting your data. Do the usual SRM checks to see whether there was a mismatch of the population between the basic segments (devices, browsers, user types). Then, you are free to analyze the users that converted (not the sessions) and avoid sampling. Naturally, you can build custom reports around your points of focus.
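The SRM check itself is a simple chi-square goodness-of-fit test against the planned traffic split. A minimal sketch using only the standard library; the visitor counts are invented:

```python
from math import erfc, sqrt

def srm_check(observed_a, observed_b, expected_ratio=0.5):
    """Chi-square goodness-of-fit test (1 degree of freedom) for sample
    ratio mismatch. A tiny p-value means the delivered traffic split
    deviates suspiciously from the planned ratio."""
    total = observed_a + observed_b
    exp_a = total * expected_ratio
    exp_b = total * (1 - expected_ratio)
    chi2 = (observed_a - exp_a) ** 2 / exp_a + (observed_b - exp_b) ** 2 / exp_b
    p_value = erfc(sqrt(chi2 / 2))  # survival function of chi-square, 1 df
    return chi2, p_value

# A planned 50/50 split that delivered 10,000 vs 10,400 visitors
chi2, p = srm_check(10000, 10400)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

A common convention is to flag SRM when p falls below 0.01: the imbalance above would be flagged, while a 10,000 vs 10,050 split would pass, so small wobbles in traffic don't trigger false alarms.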

Always keep in mind that A/B Testing is a long journey filled with wins and failures along the way. That is why you need to practice researching and prioritizing, organize well and implement changes based on the outcomes in the best possible way.

Whether failing or winning, you have learned something new!

