Basics and Example of A/B Test

Ankur Bhargava

AI Specialist | LLM | Gen AI

Published: February 21, 2023

In this article, we will be covering the basics of A/B testing.

Before understanding the basics and various aspects of A/B testing, we need to know why A/B testing is required.

A/B testing is required when we want to improve a particular aspect of a product, for example, changing the colour of the submit button on a form, or comparing a three-page form with a one-page form that collects the same information. The current version is called the control and the newer version is called the treatment.

Consider a website that plans to add a coupon code option and wants to test whether adding it leads to a reduction in revenue.

Before we design an experiment to test our hypothesis, let's look at the user journey for this typical ecommerce website.

It is important to finalize the metric that will be used to evaluate the success of the experiment. An appropriate metric, also known as the Overall Evaluation Criterion (OEC), for this experiment is revenue-per-user.

Step 1 - Users for our experiment:

We have the following three choices:

a) All the users who visited the site

b) Users who completed the purchase process

c) Users who start the purchase process

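Option C seems to be the best choice. Option A adds noise, because it includes users who never start the checkout process and so never see the change. Option B is not suitable because it assumes the change affects only the revenue generated, not the percentage of users who complete the purchase.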
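As a rough illustration of option C, the sketch below computes revenue-per-user over users who started the checkout process. The event records, the event names, and the function `revenue_per_user` are hypothetical and are not taken from the article.

```python
from collections import defaultdict

def revenue_per_user(events):
    """Average revenue over users who started checkout (option C).

    `events` is an iterable of (user_id, event_type, amount) tuples.
    The event names "visit", "start_checkout" and "purchase" are
    illustrative assumptions.
    """
    started = set()
    revenue = defaultdict(float)
    for user_id, event_type, amount in events:
        if event_type == "start_checkout":
            started.add(user_id)
        elif event_type == "purchase":
            revenue[user_id] += amount
    if not started:
        return 0.0
    # Users who started checkout but never purchased count as zero revenue.
    return sum(revenue[u] for u in started) / len(started)

# Made-up events: u1 only visits, u2 abandons checkout, u3 buys.
events = [
    ("u1", "visit", 0.0),
    ("u2", "visit", 0.0),
    ("u2", "start_checkout", 0.0),
    ("u3", "visit", 0.0),
    ("u3", "start_checkout", 0.0),
    ("u3", "purchase", 50.0),
]
print(revenue_per_user(events))  # 25.0
```

Here u1 never reaches checkout and is excluded from the denominator, which is the noise reduction that option C aims for.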

Step 2 - Hypothesis:

Adding a coupon code option to the checkout page will decrease the revenue-per-user for users who start the checkout process.

H0: There is no difference in the revenue-per-user between control and treatment

Ha: Revenue-per-user is lower in the treatment group (the one with the coupon code option)

Step 3 - Level of Significance (alpha):

It is the probability of rejecting the null hypothesis when it is actually true, that is, the probability of making a Type I error.

For this experiment, we will use 5%, the conventional choice. It could be raised to 10% to be less conservative or lowered to 1% to be more conservative.
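One way to see what alpha means is to simulate many A/A tests, where both groups come from the same distribution, and count how often the test rejects. The sketch below is only an illustration: the normally distributed data, the sample sizes, the two-sided z-test, and the function name `false_positive_rate` are all assumptions rather than part of the article's experiment.

```python
import random
from statistics import NormalDist, mean, variance

def false_positive_rate(runs=2000, n=200, alpha=0.05, seed=0):
    """Share of simulated A/A tests that (wrongly) reject H0."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    rejections = 0
    for _ in range(runs):
        # Both groups are drawn from the same distribution, so H0 is true.
        a = [rng.gauss(0.0, 1.0) for _ in range(n)]
        b = [rng.gauss(0.0, 1.0) for _ in range(n)]
        se = (variance(a) / n + variance(b) / n) ** 0.5
        z = (mean(b) - mean(a)) / se
        # Two-sided p-value using the normal approximation.
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
        if p_value < alpha:
            rejections += 1
    return rejections / runs

rate = false_positive_rate()
print(f"share of A/A tests rejected: {rate:.3f}")
```

With alpha set to 5%, the printed share should land close to 0.05, since every rejection in an A/A test is a Type I error.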

Step 4 - Power of test:

It is the probability of detecting a meaningful difference between the variants when one really exists. It is generally set between 80% and 90%, and we will use 80% for our experiment. The higher the power, the larger the required sample size.

Power is equal to 1 minus the probability of a Type II error.
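To make the link between power and sample size concrete, the sketch below approximates the power of a two-sided z-test for two proportions at several sample sizes. The 20% and 20.2% rates, the sample sizes, and the function name `power_two_proportions` are illustrative assumptions, and the formula is a normal approximation, not an exact calculation.

```python
from statistics import NormalDist

def power_two_proportions(p_control, p_treatment, n_per_variant, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two proportions."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    se = (p_control * (1 - p_control) / n_per_variant
          + p_treatment * (1 - p_treatment) / n_per_variant) ** 0.5
    # Probability that the test statistic lands beyond the critical value
    # on the side of the true effect (the opposite tail is negligible here).
    return std_normal.cdf(abs(p_treatment - p_control) / se - z_alpha)

# Power grows with the number of users in each variant.
for n in (100_000, 300_000, 630_000, 1_000_000):
    print(n, round(power_two_proportions(0.20, 0.202, n), 3))
```

Read the other way round, the same relationship says that asking for higher power at a fixed effect size requires more users.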

Step 5 - Lift/ Practical Significance:

It is the minimum lift the business wants to see for the investment in the change to be worthwhile.

For our experiment, we will use 1%.

Step 6 - Sample Size:

To calculate the sample size, we need to know the baseline conversion rate, that is, the conversion rate of the existing system (control). We will take it to be 20% for our case.

Using a sample size calculator, we get a sample size of about 628K.
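The article does not say which calculator was used. As a rough cross-check, the sketch below applies a standard normal-approximation formula for comparing two proportions. It assumes a two-sided test, a 1% relative lift (20% to 20.2%), and that the result is a per-variant sample size; the function name `sample_size_per_variant` is made up for this example.

```python
import math
from statistics import NormalDist

def sample_size_per_variant(baseline, relative_lift, alpha=0.05, power=0.80):
    """Users needed in each variant to detect the given relative lift."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = std_normal.inv_cdf(power)
    p1 = baseline
    p2 = baseline * (1 + relative_lift)
    # Sum of the binomial variances of the two groups.
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance_sum / (p2 - p1) ** 2)

n = sample_size_per_variant(baseline=0.20, relative_lift=0.01)
print(f"{n:,} users per variant")
```

Under these assumptions the result is on the order of 630K users per variant, in the same range as the figure quoted above. Small differences between calculators are expected because they use slightly different variance formulas.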

Step 7 - Number of days to run the test:

Dividing the required sample size by the average number of customers who initiate the checkout process each day, we need around 15 days for the experiment to run.

Points to note when deciding the number of days:

a) Day-of-Week Effect - People behave differently on weekdays and weekends, so it is advisable to run the test for at least one full week.

b) Seasonality - Avoid running the experiment during events such as Christmas or Diwali, as users behave differently during these times.

c) Primacy and Novelty Effects - Users may try a new feature right away simply because it is new (novelty effect), or they may need time to get used to a change (primacy effect). In the former case usage settles down over time, and in the latter case usage picks up over time.
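The duration arithmetic can be sketched as follows. The daily traffic figure of 84,000 users is purely hypothetical (the article does not state it) and was chosen only so the arithmetic is easy to follow; the sketch also assumes the 628K figure is per variant and that there are two variants. Rounding up to whole weeks is one simple way to respect the day-of-week effect.

```python
import math

def days_needed(sample_per_variant, daily_users, variants=2):
    """Days required to collect the full sample across all variants."""
    total = sample_per_variant * variants
    return math.ceil(total / daily_users)

def round_up_to_full_weeks(days):
    """Extend the run so every weekday is covered equally often."""
    return math.ceil(days / 7) * 7

# 84_000 checkout starts per day is a hypothetical traffic figure.
days = days_needed(628_000, daily_users=84_000)
print(days)                          # 15
print(round_up_to_full_weeks(days))  # 21
```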

Step 8 - Run the test and analyze the result

We can run a t-test and compare the p-value with alpha.

The p-value is the probability of observing a difference at least as extreme as the one seen between the treatment and control groups, given that the null hypothesis is true.

If p is less than alpha, we reject the null hypothesis; if p is greater than alpha, we fail to reject the null hypothesis.
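The decision rule above can be sketched in a few lines. To stay within the standard library, this example computes a Welch-style test statistic and uses a normal approximation to the t distribution for the p-value, which is reasonable only for large samples. The data are synthetic and are not the article's results, and the function names `one_sided_p_value` and `decide` are made up for this example.

```python
import random
from statistics import NormalDist, mean, variance

def one_sided_p_value(control, treatment):
    """p-value for Ha: mean(treatment) < mean(control)."""
    se = (variance(control) / len(control)
          + variance(treatment) / len(treatment)) ** 0.5
    t_stat = (mean(treatment) - mean(control)) / se
    # Normal approximation to the t distribution (large samples only).
    return NormalDist().cdf(t_stat)

def decide(p_value, alpha=0.05):
    """Compare the p-value with alpha, as in step 8."""
    return "reject H0" if p_value < alpha else "fail to reject H0"

# Synthetic revenue-per-user values; treatment is lower on average.
rng = random.Random(42)
control = [rng.gauss(3.2, 1.0) for _ in range(5000)]
treatment = [rng.gauss(3.0, 1.0) for _ in range(5000)]

p = one_sided_p_value(control, treatment)
print(decide(p))
```

The test is one-sided because Ha states that revenue-per-user is lower in the treatment group. A dedicated statistics library would normally be used in practice to get an exact t-based p-value.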

For our case, the results are below:

The p-value is less than alpha, so we reject the null hypothesis.

This means revenue-per-user is lower in the treatment group, so adding a coupon code option to the checkout page is not a good idea.

This post was created using material from the book Trustworthy Online Controlled Experiments by Ron Kohavi, Diane Tang, and Ya Xu.

Please feel free to correct me if there are any issues.

Cover Picture is taken from - https://www.techtarget.com/searchbusinessanalytics/definition/A-B-testing

Sumit Kumar

2 years ago

Good one Ankur! Looking forward to your posts on more topics.

Ron Kohavi

Vice President and Technical Fellow | Data Science, Engineering | AI, Machine Learning, Controlled Experiments | Ex-Airbnb, Ex-Microsoft, Ex-Amazon

2 years ago

Thanks, Ankur Bhargava. If you or others are interested in an interactive Zoom class I teach, see https://bit.ly/ABClassRKLI
