A/B Testing: The Worst Questions & Recurring Comments
To get home, I can choose between the city-center route and the one that goes around it. Whichever I pick, when I arrive I wonder whether the other would have been faster. In real life, I will never know the answer to that question.
In digital life, an A/B test lets you answer this question and, even better, know the time difference. That is the superpower of A/B testing.
Some people have difficulty understanding this superpower, and depending on the context, the same questions come up. They are annoying because they reveal a lack of understanding of how A/B tests work, and it's difficult to answer them without being unpleasant or rude.
Here are the worst recurring questions:
When several tests give positive results in a row:
First of all, there is no such thing as a trend in testing, just as there is no trend in roulette at the casino. If we have several successful tests in a row, either we are lucky, or we only test obvious solutions to obvious problems. We don't take enough risks!
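To put a rough number on the "luck" part, here is a back-of-the-envelope sketch; the 30% win rate is an assumption for illustration, not a figure from this article.

```python
# Sketch: streaks of winning tests happen by chance alone, with no "trend" behind them.
win_rate = 0.30      # assumed share of tests that genuinely win (illustrative)
streak = 3           # three positive results in a row
print(f"Chance of {streak} wins in a row by luck alone: {win_rate ** streak:.1%}")
# -> 2.7%: unlikely on any single run, but bound to happen over a year of testing.
```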
Regarding the duration of the tests: the data must be statistically significant before we can draw conclusions. Shortening a test only makes it inconclusive and useless.
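How long a test must run follows directly from the sample size needed to detect the expected lift. Below is a minimal Python sketch using the usual two-proportion sample-size approximation; the baseline rate, target lift and daily traffic are assumed figures, not numbers from this article.

```python
# Minimal sketch: estimating how long an A/B test needs to run.
from statistics import NormalDist

def required_sample_size(p_base, p_variant, alpha=0.05, power=0.80):
    """Approximate sample size per variant for a two-sided test on two proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_base * (1 - p_base) + p_variant * (1 - p_variant)
    return (z_alpha + z_beta) ** 2 * variance / (p_base - p_variant) ** 2

baseline = 0.04           # 4% conversion today (assumed)
target = 0.044            # detect a +10% relative lift (assumed)
visitors_per_day = 2_000  # traffic per variant per day (assumed)

n = required_sample_size(baseline, target)
print(f"~{n:,.0f} visitors per variant, i.e. ~{n / visitors_per_day:.0f} days")
```

With these assumed numbers, roughly 40,000 visitors per variant are needed, about three weeks of traffic: cutting that short simply means the result cannot be trusted.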
The performance gain is determined by how well a solution fits a problem. If we want gains of 25%, we must find problems for which a 25% gain is possible. It's the size of the problem that makes the gain, not the test.
Finally, solutions that don't work in January won't work in July. Re-testing is a waste of time and money.
When a sensitive test gave a positive result in a first market:
“Why test it in other countries? It’s a waste of time.”
When implementing a solution with significant stakes, we first test it on a small customer cohort. This limits the risk. If the results are positive, we extend the test to other channels and other countries, where market conditions are different, in order to validate the hypotheses.
It's precisely because the market conditions in another country are different (product mix, payment methods, regulations, prices) that it's important to test again.
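One common way to implement such a staged rollout (a generic sketch, not necessarily the author's setup) is deterministic, hash-based bucketing with a per-market ramp percentage. The market codes and percentages below are illustrative assumptions.

```python
# Sketch of a staged rollout via deterministic bucketing.
import hashlib

RAMP_PCT = {      # share of traffic sent to the new experience, per market (assumed)
    "FR": 10,     # first market: small cohort to limit risk
    "DE": 0,      # not yet extended
    "US": 0,
}

def in_test(user_id: str, market: str) -> bool:
    """Deterministically assign a user to the test based on a stable hash."""
    bucket = int(hashlib.sha256(f"{market}:{user_id}".encode()).hexdigest(), 16) % 100
    return bucket < RAMP_PCT.get(market, 0)

# The same user always lands in the same bucket across visits.
print(in_test("user-42", "FR"), in_test("user-42", "DE"))
```

Extending the test to a new market is then just a matter of raising that market's percentage and measuring again under its own conditions.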
When an important test gave a negative result:
“The set-up is probably incorrect, the data aren't captured properly.”
A natural reaction to bad news is denial. When the results aren't what we expected, we reject the data and the conclusion by blaming the set-up or the reporting system.
Note that when a test shows large gains, no one questions whether the data are correct.
Of course, making sure the data are correctly collected before launching a test is a given. It's also good practice to check right after launch that the data arrive as expected in the reporting tool.
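One concrete post-launch check, as a sketch (the article doesn't prescribe a specific method), is a sample ratio mismatch test: if the observed traffic split drifts far from the intended one, assignment or tracking is probably broken and the results shouldn't be trusted. The counts below are hypothetical.

```python
# Sketch: quick sample-ratio check for an intended 50/50 split.
def srm_chi_square(control: int, variant: int, expected_ratio: float = 0.5) -> float:
    """Chi-square statistic comparing observed counts to the intended split."""
    total = control + variant
    exp_control = total * expected_ratio
    exp_variant = total * (1 - expected_ratio)
    return ((control - exp_control) ** 2 / exp_control
            + (variant - exp_variant) ** 2 / exp_variant)

# Hypothetical first-day counts from the reporting tool.
stat = srm_chi_square(control=10_210, variant=9_640)
print(f"chi-square = {stat:.1f}; above ~3.84 (5% level, 1 df) => investigate")
```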
When the new experience in production delivers less gain than in the test phase:
"You poorly implemented the new experience in production! It only delivers +5% while the test shown +10% gain."
This common phenomenon is usually due to the testing conditions. When we run a test, we send some of the customers to an experience designed just for testing. As it's a temporary solution, it isn't as robust as the current one. To avoid potential bad user experiences, we don't offer the test to customers considered at risk; for example, we will exclude users of Macs, old web browsers and mobile phones. We end up testing on a subset of users, so the gains in production will vary upwards or downwards. As no one ever complains about gains higher than expected, what remains is the impression of ever-lower gains.
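As a rough illustration with made-up numbers: when the test covers only part of the traffic, the blended production lift is roughly the traffic-weighted average of the lift in each segment (assuming comparable baseline rates), and the excluded segment may respond better or worse than the tested one.

```python
# Sketch with assumed numbers: why +10% on the tested subset rarely means +10% overall.
coverage = 0.60        # share of traffic eligible for the test (assumed)
lift_tested = 0.10     # lift measured on eligible users (assumed)
lift_excluded = 0.00   # assumed lift on excluded users (old browsers, etc.)

# Traffic-weighted blend, assuming similar baseline rates in both segments.
overall_lift = coverage * lift_tested + (1 - coverage) * lift_excluded
print(f"Expected production lift: {overall_lift:.1%}")  # 6.0%
```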
Another reason, with a smaller impact, is the novelty effect (in the sense of "change"): tests sometimes show a very positive or a very negative start. Customers who visited the website before the test launched come back and receive a completely different experience. Their reaction will differ from that of customers who only ever saw one experience. When the new solution is rolled out, there is no longer any novelty effect ("change"), even though the experience is new.
When several tests have been negative in a row:
“We need to stop testing what doesn’t work. Why don’t you just test what works?”
Testing only what will give positive results is difficult to achieve unless you take no risks. This remark actually reflects a lack of ambition.
We test for two reasons:
- to validate (or invalidate) a hypothesis,
- to measure the impact of a change.
If a test doesn't fit into either of these two categories, then it should not be tested.
Sometimes we already know a change will be beneficial because it addresses a known problem. Then, we can launch the change as a test in order to measure its impact.
Of course, we wish all tests delivered positive gains. But if we must deliver only positive gains, the only way to achieve that is to take no risks.
Tips to remember
#Subscription #RetentionMarketing #Retention #ABTesting