Fair Algorithmic Decisions (7/8)
Frank De Jonghe (c)


This is part 7/8 of a series of posts on Fair Algorithmic Decisions. To find back all of them, please go to Posts on my LinkedIn profile. You can also contact me via LinkedIn messages to get the full document.


Manage: Understand the root cause

As said earlier, two dimensions need to be considered in careful detail before making a decision with respect to algorithmic fairness: the context of the decision process, and the root causes that led to the observed behaviour. We now turn to the latter. Given the multitude of fairness definitions, there are numerous potential approaches to determining whether an algorithm poses fairness risks. The situation becomes even more unwieldy when considering the potential root causes of issues with algorithmic decisions, and the ways to address them. We refer to the literature for further guidance and limit ourselves to some pointers here.

The data collection process behind the training data may itself be skewed and introduce an incomplete view of the world. Think for example of election polls that often get it wrong: the channel by which people are contacted for such surveys, and even more their propensity to answer honestly, bias the political opinions prevalent in the sample of answers. A good understanding of how the data was collected is therefore key, which is obviously a bigger challenge with purchased data (sample selection - completeness). Similarly, the decision in the UK to use algorithms to assign A-level scores to students during the COVID pandemic unwittingly introduced a bias in favour of richer kids, given that schools that historically had better success rates were also the more expensive ones (sample selection – understanding reality).

To train a model for a decision process, one needs labelled data: the outcome target that the algorithm tries to predict. Sometimes historical labels are objective (did someone ever go 90 days past due on their loan, did a student obtain a certain minimum grade within x years), but sometimes they incorporate a less accurate assessment of the underlying truth, either because it is not known with certainty (was an aborted transaction really a fraud?) or because the human assessor was biased (surely people from that community would not integrate easily as employees?) (data labeling). Not only does the mis-labeling issue potentially make historical unfairness persist, it also harms classifier performance relative to the true underlying classification.
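To make that last point concrete, here is a minimal sketch, on purely synthetic data, of how group-specific label noise in the historical labels hurts a classifier when measured against the true outcome. Everything in it (the data-generating process, the 20% flip rate, the use of logistic regression) is an illustrative assumption, not a real case.

```python
# Sketch: biased label noise in the training set degrades accuracy
# measured against the true underlying labels (synthetic data only).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n = 20_000
X = rng.normal(size=(n, 5))
group = rng.integers(0, 2, size=n)  # hypothetical protected attribute (0/1)
y_true = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n) > 0).astype(int)

# Historical labels: assessors flip 20% of the positive outcomes for group 1 only
y_hist = y_true.copy()
flip = (group == 1) & (y_true == 1) & (rng.random(n) < 0.20)
y_hist[flip] = 0

X_tr, X_te, yh_tr, _, yt_tr, yt_te = train_test_split(
    X, y_hist, y_true, test_size=0.3, random_state=0)

clean = LogisticRegression(max_iter=1000).fit(X_tr, yt_tr)  # trained on true labels
noisy = LogisticRegression(max_iter=1000).fit(X_tr, yh_tr)  # trained on historical labels

print("accuracy vs true labels, trained on clean labels:",
      accuracy_score(yt_te, clean.predict(X_te)))
print("accuracy vs true labels, trained on biased labels:",
      accuracy_score(yt_te, noisy.predict(X_te)))
```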

Another source of fairness breaches could be that the algorithm unwittingly found a proxy for a protected variable: some combination of the different features in the model, not obvious during a casual model analysis, is well correlated with a protected variable, and this combination de facto drives the algorithm's outcome. This can lead to indirect discrimination. The more complex the model (e.g. deep learning), the harder it will be to detect that such a mechanism is at work.
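One practical check is to ask how well the protected attribute can be recovered from the model's own scores, or from the feature set as a whole: an out-of-sample AUC well above 0.5 suggests a proxy is at work. Below is a minimal sketch with scikit-learn; the function name proxy_audit and the arrays in the usage comment (X_valid, gender_valid) are hypothetical.

```python
# Sketch: flag potential proxies by checking how well the model scores,
# or the full feature set, predict the protected attribute left out of the model.
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

def proxy_audit(X_features, scores, protected):
    """Return cross-validated AUCs for predicting the protected attribute.
    Values close to 0.5 mean little proxy signal; high values warrant scrutiny."""
    auc_from_scores = cross_val_score(
        GradientBoostingClassifier(), scores.reshape(-1, 1), protected,
        cv=5, scoring="roc_auc").mean()
    auc_from_features = cross_val_score(
        GradientBoostingClassifier(), X_features, protected,
        cv=5, scoring="roc_auc").mean()
    return auc_from_scores, auc_from_features

# Example usage with arrays one would already have in a validation exercise:
# auc_s, auc_f = proxy_audit(X_valid, model.predict_proba(X_valid)[:, 1], gender_valid)
# print(f"protected attribute recoverable from scores: AUC={auc_s:.2f}, "
#       f"from feature set: AUC={auc_f:.2f}")
```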

Furthermore, it may be that the model is trained for the wrong objective using the wrong features. The choice of the objective function, and the universe of features considered for explaining the outcome, may reflect inherent bias in the world view of those building the model. Consider for example a model or scorecard that, in the context of a recruitment process, tries to assess whether a candidate would be a good fit for the team. What constitutes loyal team membership, and what characteristics contribute to it, is likely very gender-, age- and culture-dependent. So the model design may reflect the views of the developers, and hence the composition of the developer team. On the other hand, this is also the type of case where one should really question whether it is ethical to develop such an algorithm at all, given that one is on scientifically extremely thin ice in translating such “decisions” into an algorithm. Algorithms are not value-free.

Once one has identified these root causes, one can obviously start addressing the issue. This is, of course, a delicate endeavour, both technically (how to achieve the stated fairness goal) and content-wise (what is the fairness we are aiming to achieve). The same level of rigorous governance that one applies to algorithm development should therefore be applied in this part of the management process too (documentation, who signs off, …). We discuss some of the options next.


Manage: Some technical options

Identifying the root cause of the algorithmic fairness problem should give a hint as to what can be done to address it. The literature often speaks of pre-training, in-training and post-training mitigation actions. We will briefly look at them in this order.

Pre-training remediation aims to address problems with the original data set. This can range from redesigning the sample collection process or the target labeling process, to ensuring that all segments are adequately represented through some resampling or reweighting technique or the creation of synthetic data. As said, all of this should be done with the utmost care and within strict approval governance.
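As one illustration of such a pre-training option, here is a minimal sketch of the classic reweighing idea (in the spirit of Kamiran & Calders): each (group, label) combination receives the weight that would make group membership and outcome statistically independent in the weighted training set. The DataFrame train_df and the column names "gender" and "default" in the usage comment are illustrative assumptions.

```python
# Sketch: reweighing weights so that protected group and outcome
# become independent in the weighted training sample.
import pandas as pd

def reweighing_weights(df: pd.DataFrame, group_col: str, label_col: str) -> pd.Series:
    """Per-row weight = P(group) * P(label) / P(group, label)."""
    n = len(df)
    p_group = df[group_col].value_counts(normalize=True)
    p_label = df[label_col].value_counts(normalize=True)
    p_joint = df.groupby([group_col, label_col]).size() / n
    weight_map = {
        (g, l): p_group[g] * p_label[l] / p_joint[(g, l)]
        for (g, l) in p_joint.index
    }
    return df.apply(lambda row: weight_map[(row[group_col], row[label_col])], axis=1)

# weights = reweighing_weights(train_df, "gender", "default")
# model.fit(X_train, y_train, sample_weight=weights.values)
```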

Careful feature vetting is an important safeguard in any procedure aiming to control the risk of algorithmic unfairness. For some applications, correlation may be good enough (e.g. marketing recommender systems), while for others it is important to understand whether a feature used in a scorecard is causally related to the outcome one tries to measure. Correlation is necessary, but it is the causal understanding that makes a feature appropriate for modelling (in effect a two-step test). Insights as to what constitutes a genuine causal link can evolve over time, and require a deep understanding of the business context of the problem being modelled. This is where it becomes relevant to remain on one’s guard when it comes to proxy variables that the algorithm may have uncovered unwittingly. One powerful graphical means is plotting the Information Value of a feature relative to both the target (say default) and the protected variable (say gender). One prefers features that are information-rich on the target and information-poor on the protected variable.
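A minimal sketch of that Information Value computation with pandas follows. Computing it twice per feature, once against the target and once against the protected variable, gives the two coordinates of the plot described above. The binning choice and the smoothing constant are simplifications, and the column names in the usage comment ("income", "default", "gender") are assumptions.

```python
# Sketch: Information Value of a numeric feature against a binary (0/1) outcome.
import numpy as np
import pandas as pd

def information_value(feature: pd.Series, outcome: pd.Series, bins: int = 10) -> float:
    """IV = sum over bins of (dist_event - dist_nonevent) * WOE, outcome coded 0/1."""
    binned = pd.qcut(feature, q=bins, duplicates="drop")
    tab = pd.crosstab(binned, outcome)
    dist_event = (tab[1] + 0.5) / (tab[1] + 0.5).sum()        # +0.5 smoothing
    dist_nonevent = (tab[0] + 0.5) / (tab[0] + 0.5).sum()     # avoids log(0)
    woe = np.log(dist_event / dist_nonevent)
    return float(((dist_event - dist_nonevent) * woe).sum())

# iv_target    = information_value(df["income"], df["default"])
# iv_protected = information_value(df["income"], df["gender"])
# One would keep features with high iv_target and low iv_protected.
```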

A more complex in-training technique to address imbalance is adversarial learning. During training, one keeps adjusting the predictor of the target for as long as its output remains a good predictor of the protected variable. This makes very explicit that de-biasing may come at a price in terms of predictive accuracy.
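A minimal PyTorch sketch of this idea (in the spirit of Zhang, Lemoine & Mitchell, 2018): an adversary tries to recover the protected attribute from the model's score, and the predictor is trained to fit the target while fooling that adversary. The data here is synthetic, and the shapes and network sizes (10 features, y and z as 0/1 column vectors) are assumptions for illustration only.

```python
# Sketch: adversarial debiasing with alternating predictor/adversary updates.
import torch
import torch.nn as nn

torch.manual_seed(0)
n = 5000
X = torch.randn(n, 10)                                # synthetic features
z = (torch.rand(n, 1) > 0.5).float()                  # protected attribute (0/1)
# target correlated with both the features and the protected attribute
y = ((X[:, :1] + 0.8 * z + 0.3 * torch.randn(n, 1)) > 0.4).float()

predictor = nn.Sequential(nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 1))
adversary = nn.Sequential(nn.Linear(1, 8), nn.ReLU(), nn.Linear(8, 1))
opt_pred = torch.optim.Adam(predictor.parameters(), lr=1e-3)
opt_adv = torch.optim.Adam(adversary.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()
lam = 1.0  # fairness/accuracy trade-off knob

for epoch in range(300):
    # 1) adversary step: learn to recover the protected attribute from the score
    adv_loss = bce(adversary(predictor(X).detach()), z)
    opt_adv.zero_grad()
    adv_loss.backward()
    opt_adv.step()

    # 2) predictor step: fit the target while making the adversary's job harder
    score = predictor(X)
    pred_loss = bce(score, y) - lam * bce(adversary(score), z)
    opt_pred.zero_grad()
    pred_loss.backward()
    opt_pred.step()
```

Raising lam pushes harder on de-biasing at the cost of fit on the target, which is exactly the trade-off mentioned above.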

Provided that the rank ordering of the algorithm is comparable between the different segments, and that the problem is “merely” that the score distributions differ across segments, by far the simplest approach to re-equilibrate the segments is a post-training model calibration, where (for instance) a linear transformation is applied to the score, i.e. to the logarithm of the odds of the positive outcome.
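A minimal sketch of such a per-segment recalibration, using a Platt-style logistic fit on the log-odds within each group; the function and argument names are illustrative, and in practice one would estimate the transformation on a hold-out set rather than in-sample.

```python
# Sketch: per-segment recalibration via a linear transformation of the log-odds.
import numpy as np
from sklearn.linear_model import LogisticRegression

def calibrate_by_group(scores, y, group):
    """Return recalibrated probabilities; scores are model probabilities in (0, 1)."""
    log_odds = np.log(scores / (1 - scores)).reshape(-1, 1)
    calibrated = np.empty_like(scores, dtype=float)
    for g in np.unique(group):
        mask = group == g
        # Platt-style scaling per segment: logistic regression on the log-odds,
        # i.e. sigmoid(a * log_odds + b) fitted within the segment
        lr = LogisticRegression().fit(log_odds[mask], y[mask])
        calibrated[mask] = lr.predict_proba(log_odds[mask])[:, 1]
    return calibrated
```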
