Algorithms and black boxes
IMAGE: Ilya Akinshin - 123RF


Some US banks report problems with their machine learning algorithms, particularly those that decide which customers should be granted loans: as the complexity of the models increases, the interpretability of those models decreases.

Algorithms have given banks efficient models that reduce their exposure to unpaid loans, but those models are essentially black boxes, and that opacity creates problems of its own. Hari Gopalkrishnan, technology manager at Bank of America, notes in an article:

“In banking, [w]e’re not fans of lack of transparency and black boxes, where the answer is just ‘yes’ or ‘no’. We want to understand how the decision is made, so that we can stand behind it and say that we’re not disfavoring someone.”

The problem, which I wrote about a little over a year ago after a series of conversations with my friends at BigML, a company that employs me as a strategic advisor, is one that many companies will face as machine learning matures: we tend to reach for the most sophisticated models within our reach without considering how to interpret increasingly complex models trained on larger and larger amounts of data.

Early phases and rapid prototyping of machine learning projects, where a reasonable result is enough, tend to rely on logistic regression or decision trees, which are relatively simple to interpret and do not require lengthy periods of training. As we move towards the intermediate phases, where we are looking for optimized and proven results, we tend to evolve toward more complex models with better representation, such as those based on decision forests. And when we reach the final phases, where an algorithm’s performance proves critical, we lean towards boosted trees and deepnets, or deep learning. The appeal of this progression is evident, given that it tends to improve the representation, fit and performance of the model, but the downside is clear: these improvements require longer training times and, most importantly, are less open to interpretation. Once a model reaches a certain level of complexity, the chances of correlating a result with its input variables drop significantly, interpreting causality becomes harder, and demonstrating that a decision was not taken on the basis of potentially discriminatory criteria becomes problematic; given legislation such as the Equal Credit Opportunity Act, aimed at preventing discrimination based on variables such as race, religion, origin, sex, marital status or age, that can lead to lawsuits.
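As a rough illustration of that trade-off, here is a minimal sketch of my own (not the banks’ or BigML’s pipelines), using scikit-learn on synthetic data with made-up feature names: the coefficients of a logistic regression can be read off directly, while a boosted ensemble usually scores better but offers no single weight to point to when explaining an individual decision.

```python
# Illustrative sketch only: synthetic data and invented feature names,
# not a real credit dataset or any bank's actual model.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

feature_names = ["income", "debt_ratio", "years_employed", "late_payments"]
X, y = make_classification(n_samples=2000, n_features=4, n_informative=4,
                           n_redundant=0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Early phase: a logistic regression whose coefficients can be inspected,
# so each feature's direction and weight in the decision is explicit.
simple = LogisticRegression(max_iter=1000).fit(X_train, y_train)
for name, coef in zip(feature_names, simple.coef_[0]):
    print(f"{name:>15}: {coef:+.3f}")

# Later phase: a boosted ensemble tends to score better, but there is no
# per-feature coefficient to cite when justifying a single rejection.
boosted = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)
print("logistic accuracy:", simple.score(X_test, y_test))
print("boosted accuracy: ", boosted.score(X_test, y_test))
```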

The value of machine learning lies not in building ever more complex models, but in making it easier to use. Businesses are complex processes, despite our efforts in the 150 years since the Industrial Revolution to reduce them to simple rules. The black box issue matters: it calls for mechanisms that add transparency and try to explain the predictions that algorithmic models make, and it is a constraint companies must take into account when scaling their machine learning initiatives; a long and complex process in which 90% of the effort goes into tasks such as defining objectives, managing and transforming data, and feature engineering, with only the final 10% applied to what we traditionally consider the result: predictions and impact measurement.
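One common transparency mechanism, sketched below purely as an illustration (the data and model are assumptions, not anything from the article), is model-agnostic permutation importance: shuffle each input of a trained black-box model and measure how much its held-out score drops, which at least reveals which variables the model leans on.

```python
# Illustrative sketch of permutation importance on a synthetic problem;
# the "black box" here is just a boosted ensemble standing in for any
# model whose internals are hard to read.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

black_box = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

# Shuffle each feature several times and measure the drop in held-out
# accuracy; a large drop means the model relies heavily on that feature.
result = permutation_importance(black_box, X_test, y_test,
                                n_repeats=10, random_state=0)
for i in np.argsort(result.importances_mean)[::-1]:
    print(f"feature {i}: {result.importances_mean[i]:.3f} "
          f"+/- {result.importances_std[i]:.3f}")
```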

The issue for most companies nowadays is not that machine learning doesn’t work, but that they struggle to actually use it. It’s going to be a long, hard process, but a worthwhile one: it will separate companies capable of using predictive models from those that are not, which remain limited to decision-making based on arbitrary criteria, intuition or unscientific rules; and machine learning will move from being seen as a service to being a commodity or a utility… for those that have done their homework.



(In Spanish, here)

Bernardo Riveira

Director of Operations & Development (COO) at Cinfo Company

6 years ago

And now in Europe, under GDPR, the owner of the data can ask for an explicit explanation of why she has been automatically categorized this or that way, not only in credit and loan check applications, but in all the other uses of algorithms that take our data and generate new information. Something quite difficult to do with a black box algorithm implemented with a deep neural net, for example.
