Explained: Hypothesis Testing for a Marketer (Inferential Statistics - Part II)

A new financial year has started. It's time we look at a hypothetical scenario.

Suppose you're a digital marketer. Generating quality leads is one of your media KPIs, and Cost Per Quality Lead (CPQL) is the metric that measures your success. You're responsible for delivering a healthy CPQL (say, below the industry average).

So here you are, presenting the CPQL reports for each of the client's sub-brands. You have done a great job: the CPQL for each sub-brand has gone down by 15%-20% compared to last year. You're on cloud nine and proud of your achievement, and your client is really happy with the performance. So it's a happy ending then?

Nope, a bit is still left. The client has a third-party auditor who'll validate these numbers. They'll compare your CPQLs against the market average, using a pool of prices paid to generate similar quality leads for selected brands with features similar to yours. To save time and money, they calculated the market average from sample data. There's nothing to worry about, though: they ensured that all the data points are independent and that the sample size is large enough to apply the Central Limit Theorem (CLT).

The auditors too are happy with your performance, as for most of the sub-brands, your CPQL is lower than the market average. However, the results differ for one sub-brand, where you paid a higher price to generate a quality lead compared to the market average. This is for the most expensive product which was launched last year and is under the scanner of your client. The imaginary numbers are:

Your CPQL for sub-brand X: INR 1500/-

Pooled market average CPQL for brands similar to sub-brand X: INR 1450/-

So you delivered a costlier CPQL: you paid INR 50 more than your competitors to generate a single quality lead.

The client was OK with the auditor's results, as this was for only one brand and overall you did a great job. However, you're not happy, and with your statistical knowledge you decide to challenge the status quo. You decide to perform a test.

So you have a question of interest: is the market average CPQL of all similar brands less than 1500/-? In other words, we're trying to figure out whether the market average CPQL for the entire population (the population mean) is greater than or equal to, or less than, the CPQL you achieved (1500/-).

Let's denote the market average CPQL for all similar brands i.e., our parameter of interest by μ.

Let's figure this out Step-by-Step.

Step 1- Forming Null and Alternative Hypothesis

Our null hypothesis is that the market average CPQL for the whole population is greater than or equal to 1500/-. In other words, the null says our CPQL is as good as, or better than, the market average. Because the alternative points in one direction only, this is a one-tailed test.

H0 : μ >=1500

Here goes the Alternative:

H1 : μ < 1500

The alternative states that we're less efficient than the market, i.e., the market average CPQL is lower than our CPQL (1500/-).

Failing to reject the null would be a result in our favor: it would mean the data does not show that the market average CPQL is lower than ours. (It would not prove the null is true, only that the evidence against it is insufficient.)

Before we look at the data, we need to establish how much evidence we require against the null before we reject it. It's time to set the significance level for the test. The standard significance level is 5%, and your client agrees to it.

What is the 5% significance level? It means we reject the null only when the results are so unusual that, if the null were true, we would see them (or something more extreme) no more than 5% of the time. In other words, it is the risk we're willing to tolerate of rejecting the null when it is in fact true.
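To make the "5% of the time" idea concrete, here is a small simulation sketch. It assumes, purely for illustration, that CPQLs are normally distributed with a true mean of exactly 1500 (so the null is true) and a standard deviation of 180, and that each audit draws 25 brands. None of this is the auditor's actual data; the seed and the number of trials are arbitrary choices.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)          # fixed seed so the run is repeatable
alpha, n, n_trials = 0.05, 25, 20_000   # illustrative settings, not real audit data

# Assumption for this sketch: the null is true (mean 1500) and the SD is 180
samples = rng.normal(loc=1500, scale=180, size=(n_trials, n))

means = samples.mean(axis=1)
sds = samples.std(axis=1, ddof=1)       # sample standard deviation of each trial
t_stats = (means - 1500) / (sds / np.sqrt(n))

# Left-tailed test: reject when T falls below the 5% cutoff of the t-distribution
rejection_rate = np.mean(t_stats < stats.t.ppf(alpha, n - 1))
print(f"Share of trials that wrongly reject the null: {rejection_rate:.3f}")
```

The share of wrong rejections should land close to 0.05, which is exactly the risk the significance level is meant to cap.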

Step 2- Checking Assumptions, Summarizing Data with a Test Statistic

Here you have a problem. You don't have access to the sample data points that the auditor used. However, they have assured you that the samples are random and independent. You're also happy to know that the sample size is fairly large, so, thanks to the Central Limit Theorem, the normality condition is met.

You do know the sample mean CPQL as well: 1450/-. This is also called the best estimate. It's lower than your CPQL by 50/-. But is that significantly lower? Since this is a sample mean, the number would vary with a different set of samples. So is this difference of INR 50 significant, or is it just down to the variability of the sample mean?

OK, it all boils down to variability, i.e., spread or standard deviation. You don't have the standard deviation of the population data, but you can ask the auditor for the standard deviation of the sample data. Say it's 180. Now you can calculate the Standard Error (SE).

Sample Size=25

Estimated Sample Mean=1450

Sample Standard Deviation=180

SE = Sample Standard Deviation / SQRT(Sample Size) = 180 / SQRT(25) = 36

Let's calculate the test statistic (T):

T = (Best Estimate - Hypothesized Mean) / Standard Error

T = (1450 - 1500) / 36 = -1.39

What does it mean? It tells us how our sample mean compares to our hypothesized mean, measured in units of the estimated standard error. Our sample mean is lower than 1500/-, but it's only 1.39 standard errors below it. Is that distance significant? To decide, let's convert the test statistic into a probability.
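If you'd like to check the Step 2 arithmetic yourself, a minimal sketch with the article's imaginary numbers could look like this (the variable names are just illustrative):

```python
import math

sample_size = 25
sample_mean = 1450.0         # best estimate of the market average CPQL (INR)
sample_sd = 180.0            # sample standard deviation shared by the auditor
hypothesized_mean = 1500.0   # your CPQL, the boundary value of the null

# Standard error: how much the sample mean is expected to vary between samples
standard_error = sample_sd / math.sqrt(sample_size)

# Test statistic: distance from the hypothesized mean in standard-error units
t_stat = (sample_mean - hypothesized_mean) / standard_error

print(f"SE = {standard_error:.2f}")
print(f"T  = {t_stat:.2f}")
```

This prints an SE of 36.00 and a T of -1.39, matching the hand calculation.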

Step 3- How unusual are the results? Determine p-value

What is the p-value? It's the probability of seeing a test statistic like ours (-1.39) or something more extreme, assuming the null hypothesis is true. How likely is it to get -1.39 or a more extreme value? To calculate that, we need to know which distribution our test statistic follows. We don't know the population variance and are estimating it from the sample, so the test statistic follows a t-distribution with n - 1 = 24 degrees of freedom.

OK. We calculated the p-value with the help of Python's SciPy, and it's 0.09. This means there's a 9% probability of obtaining a test statistic equal to or more extreme than ours under the null. So our result is not that unusual under the null; it's fairly likely.
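The article doesn't show the exact SciPy call that was used, so treat the following as one plausible way to get the number: since the alternative is "less than", the p-value is the area in the left tail of the t-distribution, which `scipy.stats.t.cdf` returns.

```python
from scipy import stats

t_stat = (1450 - 1500) / 36   # test statistic from Step 2
df = 25 - 1                   # degrees of freedom = sample size - 1

# Left-tailed test: probability of a T value at or below the observed one
p_value = stats.t.cdf(t_stat, df)
print(f"p-value = {p_value:.2f}")
```

Rounded to two decimals this gives 0.09, in line with the figure quoted above.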

Step 4- Making Decisions with Sufficient Statistical Evidence

Now let's look at the fun rules of p-value to make a decision:

- If p-value > significance level, we don't have enough evidence to reject the null.

- If p-value < significance level, we reject the null.

[Image: table of null and alternative hypotheses with their rejection regions]

The table above (source: study.com) shows each of the three possible pairs of null and alternative hypotheses that can be tested in an independent t-test, along with their rejection regions.

In our case, the p-value is higher than the significance level (0.09 > 0.05), so we fail to reject the null hypothesis.
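The same decision can be reached through the rejection region instead of the p-value. The sketch below, using the article's numbers, looks up the cutoff for a left-tailed test at the 5% level and checks whether our test statistic falls below it.

```python
from scipy import stats

alpha = 0.05
df = 24                       # sample size 25, minus 1
t_stat = (1450 - 1500) / 36   # test statistic from Step 2

# Left-tailed test: the rejection region is everything below this cutoff
t_critical = stats.t.ppf(alpha, df)
reject_null = bool(t_stat < t_critical)

print(f"critical value = {t_critical:.2f}, T = {t_stat:.2f}, reject null: {reject_null}")
```

Our T of -1.39 sits to the right of the cutoff of about -1.71, so it is outside the rejection region and the two approaches agree.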

BINGO! There's insufficient evidence to conclude in favor of the alternative hypothesis, i.e., that the market average CPQL is lower than the CPQL we achieved (1500/-). In other words, we fail to reject the null, which states that the market average CPQL is equal to or greater than the CPQL you achieved.

Finally, you can construct a 95% confidence interval around the sample mean (confidence level = 1 - significance level) to further support your statement.
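The article doesn't spell out the interval, so here is a rough sketch of how it could be computed from the same numbers. It shows both the usual two-sided 95% interval and a one-sided 95% upper bound, which is the closer match to a left-tailed test at the 5% level; which one to present is a judgment call.

```python
from scipy import stats

sample_mean, standard_error, df = 1450.0, 36.0, 24

# Two-sided 95% interval: cut 2.5% from each tail
t_two_sided = stats.t.ppf(0.975, df)
ci_low = sample_mean - t_two_sided * standard_error
ci_high = sample_mean + t_two_sided * standard_error

# One-sided 95% upper bound: cut 5% from the upper tail only
upper_bound = sample_mean + stats.t.ppf(0.95, df) * standard_error

print(f"95% CI: ({ci_low:.1f}, {ci_high:.1f})")
print(f"95% upper bound: {upper_bound:.1f}")
```

Both the interval and the upper bound reach above 1500, so your CPQL of 1500/- is still a plausible value for the market average, which is consistent with failing to reject the null.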

One thing to note: the decision might vary with the significance level. If we raised our significance level from 5% to 10%, we'd conclude in favor of the alternative hypothesis, as then the p-value (0.09) < significance level (0.10). However, since you and your client agreed on a 5% significance level up front, you're going to stick with your results at that level.
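A quick way to see this sensitivity is to run the decision rule at both levels. This is only a sketch with the article's numbers; the `decisions` dictionary is an illustrative helper.

```python
from scipy import stats

# Left-tailed p-value for T = (1450 - 1500) / 36 with 24 degrees of freedom
p_value = stats.t.cdf((1450 - 1500) / 36, 24)

decisions = {}
for alpha in (0.05, 0.10):
    # Same rule as in Step 4: reject only when the p-value is below alpha
    decisions[alpha] = "reject the null" if p_value < alpha else "fail to reject the null"
    print(f"alpha = {alpha:.2f}: p-value = {p_value:.2f} -> {decisions[alpha]}")
```

The verdict flips between the two levels, which is why the significance level has to be agreed before anyone looks at the data.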

Almost felt like a lawyer fighting her case in the high court?

Where would you use hypothesis testing? To get actionable insights from:

- A change in the CTA button or ad copy

- Revamping pages like Product/Cart/Checkout, etc., of your e-commerce site

- Email vs. in-app notifications

- and so on... Well, in this world of A/B testing, almost everywhere!

#data #dataanalytics #statistics #python #digital #abtesting
