14 Risk Clinic - Upgrading MRM from Logistic to ML

When a question comes up twice in a fortnight, it’s worth addressing. What extra tests should be done when machine learning models are used instead of classical logistic regressions?

A tradition originating in the nineties, when computing power was limited, and subsequently ingrained and canonised through regulatory rulebooks, has made the logistic regression the de facto standard that everyone working in credit risk modelling is deeply familiar with. Over time, a shared understanding has developed of what good validation and monitoring look like for such models. Increasingly, data science or machine learning (ML) models are being used for the steps in the credit life cycle that are less constrained by the regulatory canvas of Basel IRB or IFRS 9, such as underwriting decisions, marketing decisions, early warning signals and collateral valuation, leveraging both the wealth of available data and readily available machine learning code libraries. Other tasks that are in essence classification processes are increasingly supported by similar models, including transaction monitoring, fraud detection, …

Given this ubiquity of more complex models, it is worth setting out a few model features that everybody should be aware of when moving from “old school” classifiers to big-data-driven ML ones. Where do we need to pay extra attention to ensure the model continues to operate as intended? Models are ideally built to support business decisions that matter for the organisation, a principle also recognised in SS1/23 to signal the scope of its application. The use of (gen)AI to make all sorts of processes more efficient also quickly leads to the need to understand the false positive/false negative trade-off at the different decision and branching points of a process flow (see Risk Clinic 2). The good news is that the outcome risk profile, as defined by the use case or business decision, is not changed by the use of a different, more complex algorithm, even if the process itself becomes more risky. It does mean though that the control environment needs to be scaled up, and the below sets out some of the things to consider.

By using the notion of a classifier, I limit myself here to settings where there is a binary fork in the process flow: underwrite the customer or not, autoclose the transaction alert or not, involve a human operator in the customer request or stay in automated straight-through mode, …

What stays the same?

· The essential trade-off remains the balance between false positives and false negatives, traditionally captured via the confusion matrix for a classifier, or via the ROC curve and Gini coefficient for a continuous variable (score) that is turned into a classifier by the choice of a threshold (see the sketch after this list).

· Understanding the model’s behaviour on different segments of the population remains as important as ever. Machine learning models often accommodate non-linearities (different behaviour in different segments) more readily within one model, but that does not exonerate the user from understanding model performance at the segment level. This is particularly relevant if one needs to understand potential decision biases (gender, ethnicity, …) in the portfolios.

· Business decisions typically use a model’s output; the model’s output is not the decision itself. In credit modelling, this is called the use-test. Understanding the “decision accuracy” before and after interpretation and potential overriding of the model output by the human-in-the-loop, and understanding the interaction between model and human expert, is key. Notice that deciding to use a model’s output for straight-through automated decisions is also a business decision.

· When calibrating the threshold for a continuous score to be used as a classifier, the imbalance between the cost of false positives and the cost of false negatives determines the theoretically optimal threshold (the sketch below also illustrates this). Of course, those costs may not be accurately known. For example, in transaction monitoring within financial crime procedures, a false negative may carry a multiple of the cost of a false positive, which may only trigger some extra procedures and operational expense.
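
To make the two points above concrete, here is a minimal sketch in Python (scikit-learn on synthetic data). The dataset, the logistic model and the 10:1 cost ratio between false negatives and false positives are illustrative assumptions only; the point is simply that the Gini follows directly from the AUC, and that a cost ratio pins down a threshold via the confusion matrix.

```python
# Minimal sketch: confusion matrix, ROC/Gini, and a cost-based threshold choice.
# The data, the model and the unit costs are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5000, n_features=10, weights=[0.9, 0.1],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          stratify=y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
scores = model.predict_proba(X_te)[:, 1]        # continuous score in [0, 1]

auc = roc_auc_score(y_te, scores)
gini = 2 * auc - 1                              # Gini coefficient from the AUC
print(f"AUC = {auc:.3f}, Gini = {gini:.3f}")

# Assumed unit costs: a missed 'bad' (false negative) is taken to be ten times
# as expensive as a false alert that merely triggers extra operational work.
COST_FN, COST_FP = 10.0, 1.0

def expected_cost(threshold):
    pred = (scores >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_te, pred).ravel()
    return COST_FP * fp + COST_FN * fn

best = min(np.linspace(0.01, 0.99, 99), key=expected_cost)
print("cost-minimising threshold:", round(float(best), 2))
print("confusion matrix at that threshold:")
print(confusion_matrix(y_te, (scores >= best).astype(int)))
```

In a real setting the cost figures would come from the business (the expected loss of a missed bad versus the operational cost of an unnecessary review), and the threshold search would be run on a representative, ideally out-of-time, sample.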

What requires extra attention and a reinforced control environment?

On top of the tests and considerations that are familiar from the logistic model world, there are several characteristics of machine learning models that require the control framework to be reinforced, both in a formal validation phase and for ongoing monitoring during use. For each of the below, model development and model validation documentation should provide adequate understanding, justification and challenge:

· Learning approaches (the recipe used to determine the model’s parameters) are much more varied for ML models than for logistic regression. ML models will tend to exploit the available big data, using multitudes of features that are not a priori down-selected on the basis of their individual predictive power. A technique called “regularisation”, which introduces a penalty if the classifier uses too many feature variables, is often used to keep the models focused on the most information-rich variables (see the first sketch after this list). Moreover, training data may be used as they arise (get generated) through time (e.g. click-through rates in a recommender model), leading to potentially constantly updated and evolving models.

· The “penalty” parameter used in the regularisation is but one example of what are called “hyperparameters” in a machine learning model. These can also include such things as the depth of a modelling tree, or the number of simple models in an ensemble approach, … Each of these parameters provides a control knob that can impact model performance. In theory they too should be calibrated using a two-stage procedure, for example tuned on held-out data or via cross-validation, separate from the data used for the final performance assessment (the first sketch after this list illustrates this).

· Explainability. From the outset, there should be an approach identified to understand the model outcomes as well as possible. This can be at the aggregate level, but for certain applications, legislation (like GDPR) may require the explanation to be available at the level of each individual model application/decision. It is beyond the scope of this text to explore the different techniques for explainability, but they go under names such as Shapley values, partial dependence plots, feature importance analysis, and surrogate models which approximate an ML model locally with a linear (logistic) model, … (a sketch further below shows two of these tools in action).

· It is necessary to understand well how a classifier’s output should be interpreted. In particular, whether the score can be read as a calibrated probability of belonging to one class or the other requires careful consideration (see the calibration sketch below).

· ML models and ML modellers typically seem less focused on data quality and data curation than we are accustomed to in the regulated credit modelling world. While this is partly driven by the volume, velocity and variety of the data, it also relies on some theoretical insights. Regularisation, the recipe used to limit the number of features used in the modelling, can be shown (in least-squares settings) to be equivalent to having noise in the training data. Intuitively, if one starts heating up a magnet (introducing noise), it at some point becomes a demagnetised piece of iron (the predictive power of the aligned micro-magnets is lost). The idea is that some degree of noise in the training data is manageable without losing too much model performance, but at some point the signal is swamped by the noise (a crude illustration is sketched below).

· Code implementation. Many ML models are intricate algorithms, for which different packages may be available. Benchmarking the package used against other, similar ones on the same data is a must (see the benchmarking sketch below).
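
The following sketches illustrate several of the bullets above. All of them use scikit-learn on synthetic data; every dataset, parameter grid and cut-off is an illustrative assumption rather than a recommendation. First, regularisation and hyperparameters: an L1-penalised logistic regression in which the penalty strength C is itself treated as a hyperparameter, calibrated by cross-validated grid search before the final assessment on data the search never saw (one version of the two-stage procedure).

```python
# Sketch: regularisation as a feature-selection device, with the penalty strength
# treated as a hyperparameter and tuned by cross-validation (the "two-stage" step).
# Data and grid values are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Many candidate features, only a handful genuinely informative.
X, y = make_classification(n_samples=4000, n_features=50, n_informative=5,
                           n_redundant=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

pipe = make_pipeline(
    StandardScaler(),
    LogisticRegression(penalty="l1", solver="liblinear", max_iter=2000),
)

# Stage 1: cross-validated search over the penalty strength C
# (smaller C means a stronger penalty, hence fewer features retained).
grid = GridSearchCV(
    pipe,
    {"logisticregression__C": [0.01, 0.03, 0.1, 0.3, 1.0, 3.0]},
    cv=5, scoring="roc_auc",
)
grid.fit(X_tr, y_tr)

# Stage 2: assess the selected model on data the search never saw.
best_lr = grid.best_estimator_.named_steps["logisticregression"]
n_kept = int(np.sum(best_lr.coef_ != 0))
print("best C:", grid.best_params_)
print(f"features kept by the L1 penalty: {n_kept} of {X.shape[1]}")
print("held-out AUC:", round(grid.score(X_te, y_te), 3))
```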
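
Next, explainability. This sketch shows two of the model-agnostic tools mentioned above, permutation-based feature importance and a partial dependence plot, as available in scikit-learn; Shapley values would require an additional library such as shap and are not shown.

```python
# Sketch: two model-agnostic explainability tools available in scikit-learn.
# The data and the model are illustrative; Shapley values are not shown here.
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import PartialDependenceDisplay, permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=3000, n_features=12, n_informative=4,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

model = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)

# Global view 1: how much does shuffling each feature hurt held-out performance?
imp = permutation_importance(model, X_te, y_te, scoring="roc_auc",
                             n_repeats=10, random_state=0)
ranked = sorted(enumerate(imp.importances_mean), key=lambda t: -t[1])[:5]
for idx, drop in ranked:
    print(f"feature {idx}: mean AUC drop when permuted = {drop:.3f}")

# Global view 2: partial dependence of the prediction on the most important feature.
PartialDependenceDisplay.from_estimator(model, X_te, features=[ranked[0][0]])
plt.show()
```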
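
Then the interpretation of the output as a probability. This sketch compares the Brier score and calibration curve of a model's raw scores with an isotonically recalibrated version of the same model; only when the calibration curve hugs the diagonal should the score be read as a probability.

```python
# Sketch: can the classifier's score be read as a probability?
# The data and the models are illustrative assumptions.
from sklearn.calibration import CalibratedClassifierCV, calibration_curve
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import brier_score_loss
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=6000, n_features=15, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

raw = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
cal = CalibratedClassifierCV(
    RandomForestClassifier(n_estimators=200, random_state=0),
    method="isotonic", cv=5,
).fit(X_tr, y_tr)

for name, model in [("raw score", raw), ("isotonic recalibration", cal)]:
    p = model.predict_proba(X_te)[:, 1]
    frac_pos, mean_pred = calibration_curve(y_te, p, n_bins=10)
    print(name, "- Brier score:", round(brier_score_loss(y_te, p), 4))
    # frac_pos vs mean_pred should lie close to the diagonal before the score
    # is interpreted as a probability of default/fraud/churn.
```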
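
The noise intuition can be made tangible with a crude experiment: flip an increasing share of the training labels (one possible form of noise) and watch the held-out performance degrade gently at first, then collapse towards coin-tossing. This is purely illustrative and not the least-squares equivalence result referred to above.

```python
# Sketch: a crude illustration of the "heating the magnet" intuition.
# Flipping a growing share of training labels erodes, then destroys, performance.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5000, n_features=20, n_informative=5,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

rng = np.random.default_rng(0)
for flip_rate in [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]:
    y_noisy = y_tr.copy()
    flip = rng.random(len(y_noisy)) < flip_rate
    y_noisy[flip] = 1 - y_noisy[flip]          # flip a fraction of training labels
    model = LogisticRegression(max_iter=1000).fit(X_tr, y_noisy)
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    print(f"label-flip rate {flip_rate:.0%}: held-out AUC = {auc:.3f}")
```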
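
Finally, benchmarking the implementation. This sketch compares two gradient-boosting implementations on the same data; in practice one would also bring in external packages (XGBoost, LightGBM, …), but two scikit-learn variants keep the sketch self-contained.

```python
# Sketch: benchmarking two gradient-boosting implementations on the same data.
# A large gap between supposedly equivalent implementations warrants investigation.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, HistGradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5000, n_features=20, n_informative=6,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

candidates = {
    "GradientBoosting": GradientBoostingClassifier(random_state=0),
    "HistGradientBoosting": HistGradientBoostingClassifier(random_state=0),
}
for name, model in candidates.items():
    model.fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    print(f"{name}: held-out AUC = {auc:.3f}")
```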

Where does that leave us from a business decision perspective?

It should be clear from the above that ML recipes and modelling approaches will typically not give “the one true model” that one is used to from credit logistic model applications (or so we are led to believe). There is a cloud of models, neighbours in the model space parametrised by the model’s parameters, from which one cannot readily pick a single preferred model. Yet, even if a model provides a continuous variable as output (e.g. a score), such a variable is usually mapped to a discrete variable (e.g. a credit rating grade) that drives the actual business decision (e.g. reject underwriting below rating X, light-touch follow-up investigation for a medium-risk transaction alert, …). It is this ultimate output/decision that matters, and a slightly different model (a neighbour in the model cloud) may lead to the same business output; a small sketch below makes this concrete. This is the level that determines whether a technical modelling issue or choice is actually a business issue. Understanding the impact of the modelling choices on the final business decision outcomes is key to focusing the technical validation activities on the most relevant aspects of the model development process.
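
A minimal sketch of that last point, under the same kind of illustrative assumptions as before: two models trained with the same recipe but different random seeds (neighbours in the model cloud) produce slightly different scores, yet put the vast majority of cases in the same decision grade once the score is bucketed using hypothetical cut-offs.

```python
# Sketch: two neighbouring models may disagree slightly on the score yet agree
# almost entirely on the decision grade. Models and cut-offs are illustrative.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5000, n_features=15, n_informative=5,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Two "neighbours" in the model cloud: same recipe, different random seeds.
m1 = RandomForestClassifier(n_estimators=300, random_state=1).fit(X_tr, y_tr)
m2 = RandomForestClassifier(n_estimators=300, random_state=2).fit(X_tr, y_tr)
s1 = m1.predict_proba(X_te)[:, 1]
s2 = m2.predict_proba(X_te)[:, 1]

# Map the continuous score to coarse decision grades (hypothetical cut-offs),
# e.g. auto-accept / light-touch review / escalate / reject.
cut_offs = [0.2, 0.5, 0.8]
g1 = np.digitize(s1, cut_offs)
g2 = np.digitize(s2, cut_offs)

print("mean absolute score difference:", round(float(np.mean(np.abs(s1 - s2))), 3))
print("share of cases in the same decision grade:", round(float(np.mean(g1 == g2)), 3))
```

The relevant question for validation is then not whether the two score vectors coincide, but whether the disagreement ever changes the grade, and hence the business decision.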


References and further reading

1. Bank of England PRA, Model Risk Management Principles for Banks, Supervisory Statement SS1/23, May 2023. It states that “Model use is defined as using a model’s output as a basis for informing business decisions”. Moreover, “Business decisions should be understood as all decisions made in relation to the general business and operational banking activities, (…)”.

2. For my pragmatic approach to Fair Algorithmic Decision making, see the first of a thread of 8 posts here: https://www.dhirubhai.net/pulse/fair-algorithmic-decisions-18-frank-de-jonghe/?trackingId=ELtt2utettJYSWJsCjkPGA%3D%3D

3. For a very good Review into Bias in Algorithmic Decision-making by the Centre for Data Ethics and Innovation, see https://assets.publishing.service.gov.uk/media/60142096d3bf7f70ba377b20/Review_into_bias_in_algorithmic_decision-making.pdf
