Exploring Interpretable Scorecard Boosting

Credit scorecards provide lenders with a standardized and objective method to assess credit risk and make informed, automated lending decisions. They play a crucial role in streamlining the lending strategy, minimizing the potential for human bias, and enabling effective risk management.

One of the key methods utilized in scorecard development is the Weight-of-Evidence Logistic Regression (WOE LR) model design. These "glass-box" models have been in use for decades and are known for their ease of interpretation. What is often overlooked is that they have also proven reliable in production and can be deployed using SQL alone.
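As a brief illustration, here is a minimal sketch of the WOE transformation behind such models (the column names are hypothetical; the WOE-encoded features would then feed a logistic regression):

```python
import numpy as np
import pandas as pd

def weight_of_evidence(df: pd.DataFrame, bin_col: str, target: str) -> pd.Series:
    """WOE per bin: ln(share of goods / share of bads)."""
    stats = df.groupby(bin_col)[target].agg(total="count", bads="sum")
    goods = stats["total"] - stats["bads"]  # target == 1 flags a bad loan
    return np.log((goods / goods.sum()) / (stats["bads"] / stats["bads"].sum()))
```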

However, due to their relative simplicity, a single such scorecard cannot describe the risk attributes of different segments and products. Maintaining numerous scorecards, in turn, has several drawbacks: processes such as beta tuning and retraining, validation, versioning, deployment, and maintenance require significant manual effort and subject matter expertise.

In this post, building upon previous work by Scotiabank's ML team, I will follow their credit risk modeling technique based on scorecard boosting. This approach to good-bad analysis can enhance scorecard performance without compromising model interpretability, which is crucial for explaining credit decisions to model users, customers, and regulators alike.

Scorecard boosting

Credit scoring is a high-stakes field where accuracy and interpretability are crucial for real-world usability. As a result, many complex "black-box" algorithms have not gained widespread adoption in lending. An excellent example highlighting this is the outcome of the FICO Explainable Machine Learning Challenge, where a team from Duke University emerged as the winner by building a fully transparent "glass-box" risk model. Their model not only outperformed more complex alternatives in terms of interpretability but also exhibited superior predictive power.

The need for reliable scoring models is also highlighted by regulation and model risk management best practices. These requirements run up against what is known as the "accuracy-explainability" trade-off: it may not be feasible to improve one dimension without sacrificing the other.

In his lecture Machine Learning for Retail Credit Risk for NVIDIA, Paul Edwards presented a well-known accuracy-explainability diagram that specifically addressed the intricacies of the credit risk use-case:

[Image: Adapted from "Machine Learning in Retail Credit Risk: Algorithms, Infrastructure, and Alternative Data — Past, Present, and Future" - NVIDIA]

As a challenger approach to WOE LR, Scotiabank's ML team proposed a boosting technique for credit scorecard development. Such models, as demonstrated in the presentation, have the same positive properties as a linear model, yet rely on more advanced tree-based estimation, which yields substantial gains in scoring accuracy.

In the next sections, we will explore how a prototype of a boosted scorecard can be built on the FICO xML Challenge dataset.

How does it work?

A boosted scorecard can be described as an ensemble of decision trees trained iteratively using the gradient boosting algorithm. In the context of credit decisioning, specific constraints are often applied: monotonicity constraints (to control the relationship with the default rate) and interaction constraints (to eliminate unwanted feature interactions).
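As a sketch, such constraints can be expressed directly in XGBoost (X_train and y_train are assumed placeholders for the training features and the good/bad flag; the parameter values are illustrative, not the exact configuration used below):

```python
import xgboost as xgb

# With max_depth=1 every tree is a single split (a "stump"), so the
# ensemble is purely additive and feature interactions cannot occur.
# Monotone constraints force each feature's relationship with the
# predicted log odds to be one-directional (-1: non-increasing,
# +1: non-decreasing), listed in column order.
model = xgb.XGBClassifier(
    objective="binary:logistic",
    n_estimators=45,
    max_depth=1,
    learning_rate=0.3,
    monotone_constraints=(-1, -1),  # illustrative: one entry per feature
)
model.fit(X_train, y_train)
```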

Since we're working with a binary target variable, the leaf predictions generated by each consecutive tree can be interpreted as log odds. Consequently, they can be converted into scorecard points, similar to the WOE LR approach, as Weights & Biases' notebook and model card have shown.
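The conversion typically follows the standard points-to-double-the-odds (PDO) scaling; a minimal sketch under that convention (the calibration constants are illustrative assumptions):

```python
import numpy as np

pdo, base_score, base_odds = 20.0, 600.0, 50.0  # illustrative calibration

factor = pdo / np.log(2)                 # points per doubling of the odds
offset = base_score - factor * np.log(base_odds)

def leaf_to_points(leaf_log_odds: float, n_trees: int) -> float:
    # Each tree receives an equal share of the offset; its leaf value
    # (log odds of being bad) is scaled and subtracted so that lower
    # risk translates into a higher score.
    return offset / n_trees - factor * leaf_log_odds
```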

To demonstrate the application of boosted scorecards, we fit a model using a pre-defined set of features with monotonic trends taken from the OptBinning library's example. Our simple boosted scorecard with a maximum tree depth of 1 consists of 45 individual trees. Below we can see the first five trees:

[Image: Boosted scorecard: first five trees]
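A table like this can be reproduced by dumping the fitted booster into a flat frame; a minimal sketch, continuing from the hypothetical model above:

```python
# One row per node: the split feature and threshold for internal nodes;
# for leaf nodes the "Gain" column holds the leaf value (raw log odds,
# before conversion to points).
trees = model.get_booster().trees_to_dataframe()
print(trees.loc[trees["Tree"] < 5, ["Tree", "Node", "Feature", "Split", "Gain"]])
```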

It can be seen from the table above that customers with a credit bureau score below 74 are assigned 0 points, while those at 74 or above receive 24 points in the first iteration (Tree 0).

We validate our scorecard by checking it against the underlying XGBoost model's first and second tree plots:

[Image: Boosted scorecard: first tree plot]
[Image: Boosted scorecard: second tree plot]
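Such plots can be drawn with XGBoost's built-in plotting utility (it requires the graphviz package); a minimal sketch:

```python
import matplotlib.pyplot as plt
import xgboost as xgb

# Draw the first two stumps of the fitted booster for a visual check
# against the scorecard table.
fig, axes = plt.subplots(1, 2, figsize=(14, 4))
xgb.plot_tree(model, num_trees=0, ax=axes[0])
xgb.plot_tree(model, num_trees=1, ax=axes[1])
plt.show()
```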

After fitting the scorecard, we can generate model predictions in scorecard points on the test data. The ultimate goal of scorecard development is to determine a cut-off point, a "sweet spot" where most high-risk applicants are rejected while the majority of low-risk applicants are retained.

We further visualize this threshold in the following chart:

[Image: Boosted scorecard: cut-off]

As expected, a higher credit score corresponds to lower risk. By setting a cut-off at roughly 30 points for our known good-bad population (rejecting applicants who score below it), we can reject approximately 80% of bad risk.
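A minimal sketch of how such a cut-off could be evaluated on a scored test set (score_points and y_test are assumed placeholders, with y_test == 1 marking bad loans):

```python
import numpy as np

def rejection_stats(scores, y_true, cutoff):
    """Share of bads rejected and goods retained at a given cut-off."""
    rejected = scores < cutoff
    bad_rejected = (rejected & (y_true == 1)).sum() / (y_true == 1).sum()
    good_retained = (~rejected & (y_true == 0)).sum() / (y_true == 0).sum()
    return bad_rejected, good_retained

# Scan candidate cut-offs to locate the "sweet spot".
for cutoff in np.arange(10, 60, 10):
    bad, good = rejection_stats(score_points, y_test, cutoff)
    print(f"cut-off {cutoff}: {bad:.0%} of bads rejected, {good:.0%} of goods retained")
```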

Model interpretability

A key advantage of boosted scorecards is their global and local interpretability. Since the final score is the sum of points assigned to each feature in our scorecard, feature importances are straightforward to calculate.
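One possible definition, given that each depth-1 tree splits on exactly one feature, is to aggregate the range of points each feature can contribute; a sketch building on the hypothetical tree dump and factor scaling above:

```python
trees = model.get_booster().trees_to_dataframe()

# Map every tree to the single feature it splits on.
split_feature = trees.loc[trees["Feature"] != "Leaf"].set_index("Tree")["Feature"]

# Range of leaf values (log odds) per tree, scaled to points and
# aggregated per feature.
leaf_range = (
    trees.loc[trees["Feature"] == "Leaf"]
    .groupby("Tree")["Gain"]
    .agg(lambda v: v.max() - v.min())
    .mul(factor)
)
importances = leaf_range.groupby(split_feature).sum().sort_values(ascending=False)
print(importances)
```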

We can visualize the scorecard's custom feature importances below:

[Image: Boosted scorecard: feature importances]

We can observe that the external score has the highest impact on predictions, followed by delinquency and utilization features.

To validate our results, we can further look at a similar diagram using the SHAP global importance plot, which is a common model-agnostic method for interpreting model results:

[Image: Boosted scorecard: SHAP feature importances]
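A minimal sketch of this computation (X_test is an assumed placeholder for the test features):

```python
import shap

# TreeExplainer computes exact SHAP values for tree ensembles.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)

# Global importance: mean absolute SHAP value per feature, as a bar chart.
shap.summary_plot(shap_values, X_test, plot_type="bar")
```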

As we can see from the list of top features selected by SHAP, both methods yield highly consistent results, confirming the reliability and interpretability of the boosted scorecard model.

Concluding remarks

Explainability is a crucial element in credit decisioning, as consumers have the right to understand why their applications were rejected and to question the data and methods behind the decision. While scorecards based on linear models have traditionally been considered the gold standard for standardized and reliable credit risk assessment, a more advanced technique called scorecard boosting has the potential to improve the predictive power of models without sacrificing their explainability.

--

I hope you have enjoyed reading this post!

The technical appendix with the code can be found in my GitHub.

All views expressed are my own.
