A perfect storm
Aonghus McGovern, PhD.
Using data and analytics to help keep HubSpot and its customers safe.
In AI design it’s important to be aware that one fault often creates others
A couple of weeks ago Algorithm Watch published an article about the algorithmic allocation of teachers in Italy. For any vacancy, the algorithm ranks candidates on factors like their CV, location and availability, and puts forward the most suitable ones. This was meant to be faster and more efficient than the original process of interviewing candidates in person. Algorithm Watch found that the algorithm has harmed thousands of teachers, for example by ranking qualified candidates below less qualified ones. I think the noteworthy aspect of this case is that the harm is produced by the interaction of multiple distinct yet related components.
Let’s start with the algorithm. As well as the ranking issue, the article explains that candidates who are unsuccessful for a role are not released back into the pool of available teachers, so they cannot be considered for other roles. On top of this, the algorithm is tuned differently by region, so different areas get different behaviour. But before we even consider algorithmic fixes we need to look at the data. The article quotes a teachers’ trade union representative who states that teachers may input their data incorrectly because the tool’s interface is difficult to use. So even if the algorithm were functioning perfectly, it would still produce bad rankings because its input data is wrong. And we’re still not at the deepest problem. The same trade union representative argued that the entire process of hiring teachers needs to be revisited.
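To make the compounding concrete, here is a minimal, purely hypothetical Python sketch. It is not the real Italian system: the scores, the noise model, the shortlist size of three and the lock-out rule are all invented for illustration. It simply shows how noisy input data and a pool that never releases unsuccessful candidates interact: early vacancies can go to the wrong people, and later vacancies find the pool empty even though suitable teachers exist.

```python
import random

# Purely hypothetical sketch -- not the real Italian allocation system.
# It models two of the faults described above and shows how they compound:
#   fault 1: noisy input data (a hard-to-use interface corrupts some scores)
#   fault 2: candidates considered for a vacancy are never released back
#            into the pool, so they are locked out of later vacancies.

random.seed(0)

candidates = []
for i in range(10):
    true_score = random.uniform(0, 1)            # the candidate's real suitability
    entry_error = random.choice([0.0, -0.4])     # fault 1: some data is entered wrongly
    candidates.append({
        "id": i,
        "true": true_score,
        "reported": max(0.0, true_score + entry_error),
        "locked": False,
    })

for vacancy in range(5):
    available = [c for c in candidates if not c["locked"]]
    if not available:
        print(f"vacancy {vacancy}: nobody left to consider - the pool is exhausted")
        break

    # The ranking only ever sees the noisy, self-reported scores (fault 1).
    shortlist = sorted(available, key=lambda c: c["reported"], reverse=True)[:3]
    winner = shortlist[0]

    # Fault 2: everyone on the shortlist is locked afterwards, winner or not.
    for c in shortlist:
        c["locked"] = True

    best_available = max(available, key=lambda c: c["true"])
    print(f"vacancy {vacancy}: hired candidate {winner['id']} "
          f"(true suitability {winner['true']:.2f}); "
          f"best available was candidate {best_available['id']} "
          f"({best_available['true']:.2f})")
```

Fixing only the ranking logic would not help much here: the corrupted scores and the lock-out rule would still produce bad outcomes, which is exactly the point.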
So there we have it: a malfunctioning algorithm built on incorrect data from a broken process. In this single example we can see multiple key factors interacting to produce a harmful result. The teacher allocation system is far from alone in facing these problems; many of the applications that make the news for causing harm suffer from one or more of them. Examining applications like this is useful because it exposes the relationships between the different problems. Bad processes and tools yield bad data. Bad algorithms yield bad outcomes. We often hear the adage that a whole is more than the sum of its parts. The logic also applies when the parts are harmful.
Incorrect model outputs are often the tip of the iceberg. The true problems lie deep beneath the surface.