An opinion on two topics of AIML…

Like other technological advancements throughout human history, Artificial Intelligence and Machine Learning (AIML) is here to stay, and its development and use will only grow. And like any other change that augments the way we live and work, we as consumers and users tend to hold misconceptions about it.

After I present the business benefits that AIML can generate for a firm, attendees often approach me for my opinion on two topics: 1) loss of control and 2) the impact on jobs. In this article I want to share my opinion on each.

Loss of control: how ML (Machine Learning) algorithms produce their outputs.

Of course, questions about AIML such as “How far do we let it evolve?”, “Will it always rely on human programming?” and “Where is the AI we already take for granted leading us?” are serious ones (Eggleton, 2017). But there is a renewed focus on both:

· Interrogating an AIML system, which means understanding the inherent statistical mechanics of how it derived its conclusions. This involves an analyst directly interpreting the monotonicity, additivity, sparsity, near-orthogonality, linearity, smoothness, and the mathematical function itself, so that they can understand how the model uses the inputs to derive the outputs.

· Explaining the important inputs of the AIML model that act as drivers of the output predictions, as a post-hoc exercise using model-agnostic measures.

By design, certain AIML algorithms are not meant to be interpretable or explainable, because their internal workings are machine-readable rather than human-readable. A key example is neural networks: these are mathematical constructs whose versatility lets them model any non-linear association between the input variables and the target variable, including variable interactions. Many neural network architectures have been proposed, differing in complexity through their activation functions and layers, but essentially, to mechanically interpret how outputs are generated, one would need to interpret how each neuron in each layer processes its inputs and where and how it transmits an output value to each neuron in the subsequent layer. An analyst would need to trace how numbers behave inside the architecture. The trouble with such an exercise is that it is difficult to interpret the architecture and neuron behavior to a meaningful enough extent to understand why a neural network has classified, say, a photo or a spoken word in a certain way (Brynjolfsson & McAfee, 2017). Again, by design, the architecture and neuron behavior are machine-readable, not human-readable.
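To make that tracing exercise concrete, here is a minimal sketch of a two-input, two-hidden-neuron network (the weights are illustrative, not from any trained model). Even at this toy scale, the printed trace of each neuron's value carries no human-readable meaning; real networks have thousands of neurons per layer.

```python
import math

# Illustrative (untrained) weights for a tiny 2-input, 2-hidden-neuron,
# 1-output network, to show the "tracing exercise" described above.
W1 = [[0.5, -1.2], [0.8, 0.3]]   # input -> hidden weights
b1 = [0.1, -0.4]                 # hidden biases
W2 = [1.5, -0.7]                 # hidden -> output weights
b2 = 0.2

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, trace=False):
    # Trace every intermediate number as it flows through the network.
    hidden = []
    for j in range(2):
        z = sum(x[i] * W1[i][j] for i in range(2)) + b1[j]
        h = sigmoid(z)
        hidden.append(h)
        if trace:
            print(f"hidden neuron {j}: pre-activation={z:.3f}, activation={h:.3f}")
    out = sigmoid(sum(h * w for h, w in zip(hidden, W2)) + b2)
    if trace:
        print(f"output: {out:.3f}")
    return out

forward([1.0, 0.5], trace=True)
```

The trace tells us exactly *what* each neuron computed, but nothing about *why* those numbers should mean "photo of a cat" or "spoken word"; that is the human-readability gap the article describes.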

However, despite these limitations on human readability, there are ways that AIML models like neural networks can be both interpreted and explained. Below I detail selected methods that, importantly, do not perturb the network to the point of making it linear, nor force conditional logic onto how weights are assigned:

· Interpretation.

o Activation maximization: where the analyst searches for the inputs that return an output with the highest confidence. This involves an iterative search for input patterns that produce the maximum model response for a given quantity of interest, and is restricted to neurons in the top layer of the network (Berkes & Wiskott, 2006; Montavon, et al., 2018).

o Sensitivity analysis: which involves analyzing the gradient of the model's prediction to generate information on how the input variables affect the target variable. Methods developed to perform the analysis include:

§ The Neural Interpretation Diagram (NID), which changes the color and thickness of the connections between neurons based on the sign and size of their weights (Özesmi & Özesmi, 1999).

§ Garson's (Garson, 1991) and Olden's (Olden, et al., 2004) methods for variable importance, which sum the product of the absolute value (the former) or real value (the latter) of the weights that connect each input variable to the target variable through the hidden layer. Garson's method then scales the result relative to all other input variables, while Olden's does not.

§ Methods that change the value of one input variable while the other variables are held constant. Examples include:

· The Input Perturbation method (Scardi & Harding, 1999), in which an algorithm adds slight changes to each variable, one at a time, while the others stay the same. The input variables whose changes affect the output variable the most are the ones with the most influence.

· The Profile method (Lek, et al., 1996), in which an algorithm constructs a fictitious matrix covering a range of equal intervals between the minimum and maximum values of each input variable, with the chosen number of intervals set as a scale. For each point on the scale, the algorithm generates five values: 1) the minimum, 2) the first quartile, 3) the median, 4) the third quartile and 5) the maximum. The algorithm then reduces the five values to their median, and the profile of the output variable is plotted against the scale's values for each input variable in turn. The resulting curve profiles the variation of the target variable as each input variable increases.

o Influential instances. Here, an influence function removes one of the instances of the training data and assesses the impact of that removal on the parameters or predictions. Conceptually, finding influential instances is like finding confounding in training data: a third variable (or observational segment) that is independently associated with both the independent variables and the target variable. When this occurs, the confounder can mask an actual association or falsely imply that a relationship exists between the input variables and the target.

Coming back to influential instances, it is important to understand that the influence function does not manipulate the input variables that the learner of a neural network has used (via backpropagation) to find the best weights. Rather, the function removes instances from the training data so that one can understand which instances most influence the model's parameters or predictions. Examples of statistical methods for finding influential instances include influence functions that approximate how the model changes when an instance is up-weighted in the sum of loss over the training data, which requires the loss gradient with respect to the model parameters (Koh & Liang, 2017).
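As a concrete illustration of the input-perturbation idea from the sensitivity-analysis methods above, the following sketch bumps one input at a time and ranks variables by how much the output moves. The "model" is a toy nonlinear function standing in for a trained network, and the data are simulated; both are assumptions for illustration only.

```python
import random

# Toy stand-in for a trained black-box model: x0 matters a lot
# (nonlinearly), x1 a little, x2 almost not at all.
def model(x):
    return 3.0 * x[0] ** 2 + 0.5 * x[1] + 0.01 * x[2]

def perturbation_importance(model, data, eps=0.1):
    # For each observation, bump each input by eps (others held fixed)
    # and accumulate the absolute change in the model's output.
    n_vars = len(data[0])
    importance = [0.0] * n_vars
    for x in data:
        base = model(x)
        for j in range(n_vars):
            bumped = list(x)
            bumped[j] += eps
            importance[j] += abs(model(bumped) - base)
    return [imp / len(data) for imp in importance]

random.seed(0)  # simulated inputs in [-1, 1]
data = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(200)]
scores = perturbation_importance(model, data)
print(scores)  # x0 ranks highest, x2 lowest
```

Because the method only queries the model's inputs and outputs, the same loop works for a neural network of any architecture.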

· Explainability.

Model-agnostic measures can explain the predictions of any algorithm, including a neural network. Examples include:

o Partial Dependence (PD), which explores the functional relationship between individual model inputs and the model's predictions. It shows how the model's predictions depend on the values of the input variables of interest, averaging out the influence of all other features.

o Individual Conditional Expectation (ICE), which shows how the inputs (independent variables) relate to the outcome (dependent variable) for a single observation. Another way to understand ICE is as a simulation of what would happen to the model's prediction if one independent variable of a single observation were changed.

o Local Interpretable Model-Agnostic Explanations (LIME), which explains the predictions of any classifier by fitting a linear regression to the original model inputs using the prediction probability as the target. The linear regression here has the prediction probability, not the target variable, as its dependent variable; this means LIME does not represent the variables that were important to the learning outcome. LIME graphs show the coefficients (parameter estimates) of the localized linear regression model: as a variable increases, it may have either a positive or negative effect on the prediction for that local cluster.
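The partial-dependence calculation described above can be sketched in a few lines. This is a minimal illustration assuming a generic `predict` function (here a toy stand-in for any fitted model); ICE is the same computation without the final averaging step, one curve per observation.

```python
# Toy stand-in for any fitted model's prediction function.
def predict(x):
    return x[0] ** 2 - 0.5 * x[1]

def partial_dependence(predict, data, feature, grid):
    # For each grid value of the feature of interest, substitute it into
    # every observation, predict, and average over the dataset.
    pd_curve = []
    for value in grid:
        preds = []
        for x in data:
            modified = list(x)
            modified[feature] = value  # fix the feature of interest
            preds.append(predict(modified))
        pd_curve.append(sum(preds) / len(preds))
    return pd_curve

data = [[0.2, 0.1], [0.8, -0.3], [-0.5, 0.7]]
grid = [-1.0, 0.0, 1.0]
curve = partial_dependence(predict, data, feature=0, grid=grid)
print(curve)  # U-shaped: the quadratic effect of x0 is recovered
```

Because the toy model is quadratic in the first feature, the curve is symmetric around zero, which is exactly the kind of nonlinear shape a coefficient table from a linear model would hide.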

One more important note. The above explanation of how firms can interpret and explain AIML models like neural networks means that firms can be more confident in using them to uncover nonlinear relationships between the input variables, and between the input variables and the target variable. This can lead to more predictive forecasting. Just as importantly, AIML models do not need to make the assumptions made by linear/logistic regression, or "white box", models, namely that:

· a linear combination of variables should be present

· the target variable is normally distributed

· the average error is the same for each variable

· each instance/observation is independent

· variables do not have measurement error

· there is a complete absence of correlation between variables/absence of multicollinearity
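To make the first assumption concrete, here is a toy illustration (simulated data, not from any of the cited studies) of what happens when the linear-combination assumption fails: on a purely quadratic relationship with symmetric inputs, ordinary least squares finds a slope and R² of essentially zero, even though the association is perfect and a flexible AIML model could capture it.

```python
# Simulated data with a perfect but purely nonlinear relationship.
xs = [x / 10.0 for x in range(-10, 11)]   # -1.0 ... 1.0
ys = [x ** 2 for x in xs]                  # y = x^2 exactly

# Ordinary least squares slope and intercept, computed by hand.
n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

# R^2 of the linear fit: how much variance the line explains.
ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
ss_tot = sum((y - mean_y) ** 2 for y in ys)
r_squared = 1.0 - ss_res / ss_tot
print(f"slope={slope:.4f}, R^2={r_squared:.4f}")  # both ~0
```

The linear model reports "no relationship" on data where the relationship is deterministic; this is the gap that nonlinear AIML models, interpreted with the methods above, can close.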

Impact on the jobs front.

On the jobs front, 30-40% of jobs that exist today will not exist by 2030 if the world rapidly adopts AI (Manyika, et al., 2017; Price Waterhouse Coopers, 2018; McClean, 2020). Such a dramatic predicted change is no great surprise when you consider that 88% of job losses in the US manufacturing sector between 2000 and 2010 were due largely to productivity growth associated with advancements in robotics and automation (Hicks & Devaraj, 2017). The good news is that, in the wake of the replacement of manual jobs and cumbersome processes, there will be a need for jobs requiring skill in understanding both the mechanics of how AI works and the outputs of AI-based decisions and models (Bughin, et al., 2018). Consider this as an example of change rather than of the predicted huge job losses: although 88% of those manufacturing jobs no longer exist in America due to robotics and automation, there has been a gain of 9.7 million jobs in other industries such as telecommunications and insurance (Hicks & Devaraj, 2017). In addition, Accenture, one of the largest global professional services and technology companies, has managed to automate 17,000 jobs without a single redundancy by creating roles that combine technical skills with a continuous-learning mindset (Brinded, 2017).

If we take the experience of the United States manufacturing sector as a litmus test, the number of jobs replaced by AI-enabled automation will not outpace the number created (Hicks & Devaraj, 2017). Another good indicator that large-scale AI integration is unlikely to create massive unemployment is a McKinsey Global Institute study which concluded that, although about half of most work activities can be automated by AI, less than 5% of jobs can be entirely replaced (Bughin, et al., 2017).

Despite the likelihood that massive job losses will not be experienced, a senate inquiry has been created in Australia to assess the impacts on the estimated 3.5 million Australian workers who will have their jobs impacted in some way by AI and digital disruption, and to form guidelines for future government policy (Smith, 2018).

Current and future jobs created in the wake of further AI adoption will require more skill in persuasion, emotional intelligence and teaching others, and less in narrow technical skills (World Economic Forum, 2016); surely a large win for most institutions? Individuals who can merge the required analytic and programming skills with those considered softer will surely be the transformed workers of the next industrial age. Thus, although in the age of AI we will not be facing the 'end of work', the transformational impact on work sectors and the economy at large will be profound (Brynjolfsson & Mitchell, 2017). It is also worth noting that, in addition to the transformational impact on jobs, AI may increase global GDP (Gross Domestic Product) by 14% by 2030, adding an equivalent of 15.7 trillion dollars (about $48,000 per person in the US) to the global economy; this makes AI one of the biggest commercial opportunities (Rao, et al., 2017).

Concluding remarks.

Here I have shared my opinion on two questions attendees ask after my presentations: one on the loss of control of AIML, and one on jobs. To sum up: 1) certain AIML algorithms are not meant to be interpretable or explainable, but there are statistical measures that can both interpret and explain them, such as those documented here for neural networks; and 2) AIML will augment our jobs by automating the more mundane manual parts of them that create cumbersome processes.


Bibliography

Berkes, P. & Wiskott, L., 2006. On the analysis and interpretation of inhomogeneous quadratic forms as receptive fields. Neural Computation, 18(8), pp. 1868-1895.

Brinded, L., 2017. Automation killed 17,000 roles at a huge tech and services firm- but no one actually lost their job., New York, NY, USA: Business Insider.

Brynjolfsson, E. & McAfee, A., 2017. The Business of Artificial Intelligence- What it can and cannot do for your organisation., Boston, MA, USA: Harvard Business Review.

Brynjolfsson, E. & Mitchell, T., 2017. What can machine learning do? Workforce implications. Science, 358(6370), pp. 1530-1534.

Bughin, J. et al., 2017. Artificial Intelligence: The Next Digital Frontier, London, United Kingdom: McKinsey Global Institute.

Bughin, J. et al., 2018. Notes From The AI Frontier Modeling The Impact Of AI On The World Economy, London, United Kingdom: McKinsey Global Institute.

Eggleton, M., 2017. Setting the framework for effective regulation, Melbourne, Australia: Australian Financial Review.

Garson, G., 1991. Interpreting Neural-Network Connection Weights. AI Expert, 6(4), p. 46–51.

Hicks, M. & Devaraj, S., 2017. The Myth and Reality of Manufacturing in America, Muncie, Indiana, USA: Ball State University Centre for Business and Economic Research.

Koh, P. & Liang, P., 2017. Understanding Black-box Predictions via Influence Functions. Proceedings of the 34th International Conference on Machine Learning, Proceedings of Machine Learning Research, pp. 1885-1894.

Lek, S. et al., 1996. Role of some environmental variables in trout abundance models using neural networks.. Aquatic Living Resources, Volume 9, pp. 23-29.

Manyika, J. et al., 2017. Jobs lost, jobs gained: What the future of work will mean for jobs, skills, and wages, New York, NY, USA: McKinsey Global Institute.

McClean, T., 2020. Are Robots Eating Our Jobs? Not According To AI, Jersey City, New Jersey, USA: Forbes.

Montavon, G., Samek, W. & Müller, K.-R., 2018. Methods for interpreting and understanding deep neural networks. Digital Signal Processing, Volume 73, pp. 1-15.

Olden, J., Joy, M. & Death, R., 2004. An Accurate Comparison of Methods for Quantifying Variable Importance in Artificial Neural Networks Using Simulated Data.. Ecological Modelling, 178(3-4), p. 389–397.

Özesmi, S. & Özesmi, U., 1999. An Artificial Neural Network Approach to Spatial Habitat Modelling with Interspecific Interaction. Ecological Modelling, 116(1), pp. 15-31.

Price Waterhouse Coopers, 2018. UK Economic Outlook, London, United Kingdom: PricewaterhouseCoopers LLP.

Rao, A., Verweij, G. & Cameron, E., 2017. Sizing the Prize- What’s the real value of AI for your business and how can you capitalise?., London, United Kingdom: Price Waterhouse Coopers.

Scardi, M. & Harding, L., 1999. Developing an empirical model of phytoplankton primary production: a neural network case study.. Ecological Modelling, 120(2-3), pp. 213-223.

Smith, P., 2018. Big four banks pledge to step up on tech jobs threat and help plot the future of work., Melbourne, Victoria, Australia: Australian Financial Review.

World Economic Forum, 2016. The Future of Jobs- Skills Stability. , Cologny, Switzerland: World Economic Forum.
