Bayesian A-B Testing

… or how we were able to make decisions about price tests in half the time...

The Bank I worked for offers small and medium-sized unsecured loans to consumers. These loans are priced in 500 discrete segments according to loan size, loan term, and credit risk. The Bank tries to optimise loan pricing by running A-B tests in various segments, where one part of the segment population is offered the original price and another is offered a test price. If the Bank offers a lower test price, the question is whether volume will increase sufficiently to offset the lower margin. Similarly, a higher test price asks whether the higher margin will offset lower volume. This is very standard price optimisation practice.
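To make the trade-off concrete, here is a back-of-the-envelope sketch; every number in it (take-up rates, margins, arm sizes) is hypothetical, purely to illustrate the decision:

```python
# Back-of-the-envelope arithmetic for the volume/margin trade-off.
# All numbers are made up for illustration, not the Bank's figures.
offers = 1000                                 # customers offered a loan per arm
control_rate, control_margin = 0.50, 200.0    # take-up rate, margin per loan (£)
test_rate, test_margin       = 0.56, 170.0    # lower price: more take-up, thinner margin

profit_control = offers * control_rate * control_margin   # £100,000
profit_test    = offers * test_rate * test_margin         # £95,200

# The lower price wins only if the volume uplift outweighs the margin give-up;
# with these numbers a 6-point rise in take-up is not enough.
print(f"control: £{profit_control:,.0f}  test: £{profit_test:,.0f}")
```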

However, the Bank has a problem. Consumer lending is a highly regulated industry, and every A-B test must be approved by the regulator. This has two important implications. Firstly, the Bank is able to run few price tests compared to an online retailer, which means we have too little data to estimate the elasticity of demand at each price point. Secondly, because every pricing decision has a large impact on profitability, we need to be sure that it is right. In practice this meant running each price test for a long time, 3 months, in order to get statistically significant results, with each test costing hundreds of thousands of pounds in lost revenue even when it was successful. If we could reduce the time needed to reach a statistically significant result, this would have a very positive impact on the profitability of the consumer loans business.

We built a hierarchical Bayesian model both to design pricing experiments and to evaluate the results. We were able to use previous experience across many segments to build the prior distribution of the hyperparameters, and we modelled profitability explicitly through the network, rather than modelling take-up, loan size, loan term and price independently and then combining them afterwards. The modelling exercise turned out to be quite hard and forced us to think very deeply about the assumptions underlying a price test. Through this process we realised that much of the conventional frequentist statistical testing that had been done in the past was not valid, even though at first sight it looked reasonable. We estimated the coefficients of the model using standard MCMC libraries, which performed well, and we presented our results using visualisation techniques that our stakeholders found intuitive and very helpful for decision making.
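To give a flavour of the approach, here is a minimal sketch of a partially pooled take-up model, written in PyMC as a stand-in for the "standard MCMC libraries" we used. It is not the Bank's actual model: the data, the segment structure, the 6% reference rate and the crude unit margin are all assumptions for the example.

```python
# A minimal sketch of a hierarchical (partially pooled) take-up model.
# All data below are made up; PyMC is an assumed stand-in library.
import numpy as np
import pymc as pm

# Hypothetical observations: one row per (segment, price-arm) cell.
segment = np.array([0, 0, 1, 1, 2, 2])               # segment index
price   = np.array([5.9, 6.4, 5.9, 6.4, 5.9, 6.4])   # APR offered (%)
offers  = np.array([400, 410, 380, 395, 420, 405])   # loans offered
taken   = np.array([210, 180, 170, 150, 260, 220])   # loans taken up

with pm.Model() as model:
    # Hyperpriors: in the real exercise these encoded experience
    # from earlier tests across many segments.
    mu_a  = pm.Normal("mu_a", 0.0, 1.0)        # pooled baseline take-up (logit)
    mu_b  = pm.Normal("mu_b", -1.0, 0.5)       # pooled price sensitivity (< 0)
    sig_a = pm.HalfNormal("sig_a", 1.0)
    sig_b = pm.HalfNormal("sig_b", 0.5)

    # Segment-level coefficients, partially pooled toward the hyperpriors.
    a = pm.Normal("a", mu_a, sig_a, shape=3)
    b = pm.Normal("b", mu_b, sig_b, shape=3)

    # Take-up probability per cell through a logistic link,
    # with price centred at an assumed 6% reference rate.
    p = pm.Deterministic(
        "p", pm.math.invlogit(a[segment] + b[segment] * (price - 6.0))
    )
    pm.Binomial("obs", n=offers, p=p, observed=taken)

    # Carry profitability through the model rather than bolting it on
    # afterwards: expected margin per offer, with a crude unit margin
    # of (rate - assumed 4% cost of funds), purely for illustration.
    pm.Deterministic("margin_per_offer", p * (price - 4.0))

    idata = pm.sample(2000, tune=1000, target_accept=0.9, random_seed=1)
```

Decisions then read straight off the posterior, for example the probability that the test arm's expected margin per offer exceeds the control arm's.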

One problem we came up against while prototyping was how to answer the question “why is your model better than ours?”, given that both models were trying to come up with the same answer and that, in the limit, the Bayesian approach converges to the frequentist one. We tried all sorts of rather abstract arguments, but at the end of the day we simply had to assert that in general it would reach an answer much more quickly, although it is hard to say this with certainty. We were, however, able to use predictive power analysis to estimate the number of samples we would need to collect in order to have a given degree of confidence in our recommendation, though again we found it difficult to present this information in a way that was convincing to our commercial stakeholders.
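As a rough illustration of what such a predictive power analysis can look like, here is a self-contained simulation sketch. It uses a simple Beta-Binomial model of take-up rather than our full hierarchical model, and the true rates, decision threshold and arm sizes are all assumptions for the example:

```python
# Simulated power analysis: how many offers per arm before the posterior
# gives a confident recommendation? All parameters are illustrative.
import numpy as np

rng = np.random.default_rng(42)

def prob_test_beats_control(k_c, n_c, k_t, n_t, draws=4000):
    """Posterior P(test take-up > control take-up) under flat Beta(1, 1) priors."""
    p_c = rng.beta(1 + k_c, 1 + n_c - k_c, draws)
    p_t = rng.beta(1 + k_t, 1 + n_t - k_t, draws)
    return (p_t > p_c).mean()

def power(n, true_c=0.50, true_t=0.54, threshold=0.95, sims=500):
    """Fraction of simulated tests of size n per arm that reach the threshold."""
    hits = 0
    for _ in range(sims):
        k_c = rng.binomial(n, true_c)   # simulated control take-ups
        k_t = rng.binomial(n, true_t)   # simulated test take-ups
        if prob_test_beats_control(k_c, n, k_t, n) >= threshold:
            hits += 1
    return hits / sims

for n in (250, 500, 1000, 2000):
    print(f"n per arm = {n:>4}: power ≈ {power(n):.2f}")
```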

It turns out that, depending upon the prior used, Bayesian A-B testing allowed us to make a decision in around half the time, reducing the length of the test from 3 months to 1.5 months. This represented an enormous saving for the Bank. On the first test run (which rejected the test price in favour of the original price) we were able to finish the exercise early, and this saved the Bank £300,000. Of course we still had to explain to some of the stakeholders why this was a great result, given that our statistics rejected their idea!

One of the nice things about this methodology is that we have been able to hand over all the models and libraries that we built to the team that was originally doing the optimisation. After making a couple of hires and doing some training, that team is now fully able to implement and reason about a Bayesian approach to price testing. This is a real success because my team was not resourced to maintain and run models in production in the long term, and we relied upon other parts of the organisation to adopt our technologies.

We learned an enormous amount from this project.

The most important learning was realising how much probability and statistics people forget, even people with good, recent quantitative degrees. They seem to lose the knowledge virtually as soon as they have taken their final-year exams. Even the basic assumptions of statistical tests are often forgotten: for example, it is rare for anyone to question whether the underlying distribution assumed in a test is in fact normal. I found this quite concerning, and in fact put in place a statistical training programme for the analytics teams to remind them of the importance of good methodology when analysing data. All of this can be quite tricky, given that questioning an analyst’s statistical skill set can make them defensive. In order to drive change one needs, on the one hand, to be insistent and, on the other, to be kind. Analysts only forget statistical skills because the organisations they work for do not always value them.
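As a small, concrete illustration of the kind of check that was routinely skipped, here is a self-contained example (the lognormal loan sizes are made up) of testing a normality assumption before reaching for a t-test:

```python
# Before trusting a t-test, check whether the data plausibly come from a
# normal distribution. Loan sizes, for example, are typically right-skewed;
# the data here are simulated for illustration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
loan_sizes = rng.lognormal(mean=9.0, sigma=0.6, size=500)  # hypothetical, skewed

stat, p_value = stats.normaltest(loan_sizes)  # D'Agostino-Pearson test
print(f"normality test p-value: {p_value:.2g}")  # tiny p-value: normality rejected
```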

The other interesting thing about this project is that it is often assumed that statistics don’t matter much in the era of Big Data. The reasoning goes that the law of large numbers and the central limit theorem make anything other than the t-test unnecessary. However, what we have found is that we now have the data and computational power to personalise, and so while we have a huge dataset in aggregate, we need to make decisions about individuals, on whom we may have very limited data. Good analytics will always be pushing the limits of what can be inferred from a dataset, and this will very often require a deep understanding of statistical methodologies.

Ali F.

SVP, AI Research | Capital Markets

3y

Hi Harry, thanks a lot for sharing! One question: did this sort of model fall within the model validation scope at the bank you worked for? Cheers!

Jose Parreño Garcia

Data Science Manager at Skyscanner | Medium writer | Problem before Data, Data before ML

3y

From the outside, setting priors seems easy, but oh my the implications they have on the posteriors. Super cool use case. Thanks for sharing!
