Criticism of AI in hiring

In this article I'd like to comment on the criticism of AI, and of modern assessment tooling, in hiring. I welcome criticism and we need it, but for some reason it's almost always about the wrong things. We keep asking the wrong questions, and because journalists and policy makers aren't asking the right ones, the industry simply ignores them. In this article I'll try to explain which questions are being asked, why they are the wrong ones, and which questions we should be asking instead.

I don't write this because I dislike being closely watched by the media. We have such a long way to go, and there is so much risk in getting AI and testing in hiring wrong, that I welcome the scrutiny. All of it. We just focus on the wrong issues.

So should we have lots of debate about hiring and the use of technology in it? Hell YES. Let's question future practices. But let's also question current practices, whose outcomes (massive labour market discrimination) we can already see, and compare the two, shall we? By current practices I mean pre-selection based on CVs, and phone screens and interviews done by people with no psychological training.

And if you are asking questions and criticising, I believe it's best to know what you are talking about, both when it comes to AI and machine learning AND when it comes to hiring.

Critique from applicants and journalists

This post was inspired by, among other things, the podcast series In Machines We Trust from MIT Technology Review, a publication I hold in high regard. The series raised a lot of criticism of AI in hiring, but very little of it was valid. Let me break the points down one by one before we get to the heart of the issue.

  1. They question why so many (or all) of the different tools use the 'balloon blowing and popping game'. In this game you inflate balloons, and every pump earns you an extra 10 cents, but a balloon can pop at any time, at random. You can bank your money and start a new balloon whenever you like; the game measures your risk aversion. It is probably the best and most academically researched test of risk aversion there is, which is exactly why it's used so often. If you want to question it, go and find flaws in the academic literature. I haven't heard a single argument for why the many academic studies are questionable; the game gets questioned even though it's probably the best-researched test out there.
  2. The journalist also said she found that the blue balloons popped a lot less often than the other two colours, so she was able to bank on that. As I don't know which tool she tried, I can't tell whether this was a glitch in that system. But I also wonder: how many times did she play? Was it really a pattern, or just luck? And since she didn't pump up the yellow and red balloons as much, how could she know those wouldn't have let her bank just as much later in the game? Furthermore, she obviously didn't talk to the scientists about the science behind the game, because as far as I know it's not about 'banking as much money as you can'. One of the data points measured, and there are more, is how you respond when a balloon does pop: how much risk you will still take when a risk you just took went wrong. Many more scientifically validated data points are measured; see the sketch right after this list. So the journalist didn't inform herself about what the test actually measures, didn't call an IO psychologist at a university; she just assumed, and assumed wrong.
  3. They have a woman saying over and over again: I graduated as a developer, so I'm qualified. Wait, are you saying there are no bad programmers with degrees? That they are all good? A degree actually tells you very, very little about a person's capabilities. Do I know whether this was the right test for her? No, I don't. Should she perhaps have gotten a coding test? Maybe. Without knowing what these companies try to measure, and why, there is no way of knowing whether she was validly rejected or not.
  4. We have someone saying that not knowing what's being measured is a bad thing. Why is that a bad thing? Because you would like to know? Do you have any science proving that the results are more accurate when candidates know what's measured? I'm pretty sure you won't find any academic evidence for that, and you might find some pointing in the other direction. I do know that if it were common knowledge how the balloon test works, people would be able to game it and it would lose its validity. The same goes for several other tests. Then we could revert to questionnaires and ask someone 'are you tenacious?'. Do you think anybody ever answered no to that?
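
The balloon game described here matches the Balloon Analogue Risk Task (BART) from the academic literature. To make the point about data points concrete, here is a minimal sketch of my own, not any vendor's scoring code; the session numbers are made up, and only 'adjusted average pumps' is a standard metric name from the BART literature:

```python
def bart_metrics(pumps, popped):
    """A few illustrative BART-style data points from one play session.

    pumps[i] is how many times balloon i was pumped; popped[i] is
    whether it burst. Hypothetical shorthand: real instruments define
    these (and many more) formally.
    """
    # Headline score in the literature: "adjusted average pumps",
    # the mean number of pumps on balloons that did NOT pop.
    banked = [p for p, burst in zip(pumps, popped) if not burst]
    mean = lambda xs: sum(xs) / len(xs) if xs else 0.0

    # How risk-taking changes right after a pop vs. right after a
    # successful bank -- the point made in item 2 above.
    after_pop = [pumps[i + 1] for i in range(len(pumps) - 1) if popped[i]]
    after_bank = [pumps[i + 1] for i in range(len(pumps) - 1) if not popped[i]]

    return {
        "adjusted_average_pumps": mean(banked),
        "mean_pumps_after_pop": mean(after_pop),
        "mean_pumps_after_bank": mean(after_bank),
        "total_banked_eur": sum(banked) * 0.10,  # 10 cents per pump
    }

# Hypothetical session: pumping less right after a pop is one kind of
# response to a risk that just went wrong.
print(bart_metrics(
    pumps=[12, 5, 18, 20, 8, 15],
    popped=[True, False, False, True, False, False],
))
```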

Am I saying there isn't anything wrong with all of this? Absolutely not. But here we have journalists who aren't talking to any independent neuro- or IO psychologists (the people who test these games at an academic level), who make incorrect assumptions about how the games work, and who base their conclusions on their own feelings rather than science. They are doing exactly what this technology is trying to eliminate from the selection process.

As for the applicant who is mentioned often: I can't tell whether she was given tests that fit the job. There is no way for me to know, as the podcast doesn't mention which suppliers were used, and it didn't contact the companies that use these tools to ask the only questions that matter:

  1. Can you give me the job analysis showing which traits are important for this job?
  2. Can you show me the science behind the tests, so we can see how they measure those traits?

Critique from AI scientists

I've also heard lots of criticism from AI scientists, none of whom ever seem to find it relevant to look at the science of hiring. Now, I love talking to AI scientists, as they know so much more about their field than I ever will. I love having good, deep conversations with them, and I've had several in the past, each of us enlightening the other on our respective fields: AI and the science of hiring. But the ones I talk to never go to the media about AI in hiring, because they know they don't know enough about the science of hiring. And AI in selection sits at the crossroads of these two fields of expertise.

You need to understand both AI and the science of hiring to understand AI in hiring

I see a lot of criticism from AI scientists that would be valid if AI, or even machine learning, were implemented the way they think it is. The problem is, it almost never is. That doesn't mean it isn't sometimes done terribly, but if you criticise something that wasn't actually done, people will say: oh, we didn't do that, so we're fine. While what they did do might have been worse.

Not questioned

What's more interesting to watch is what gets taken as 'truth' and is never questioned as a predictor of success, or of problems, in hiring:

  • CVs
  • (unstructured) interviews
  • unstructured references (in many countries it's legally forbidden to be negative in them)

Every time I hear someone say 'just look at my CV', I ask: can you show me a single academic paper that shows even a correlation between a CV and job performance? I can tell you: after many years I'm still looking, and not a single person who advocates CVs has sent me one.

I'm still waiting for anyone to send me a single academic paper showing resumes have any predictive value

And if CVs and interviews worked, that would mean we had no bias or discrimination in the labour market until the 2000s, as we've only started using tools in the past two decades. Oh wait, aren't most positions held by white men? Isn't there a huge gender pay gap? Don't Muslims have a 30% to 50% lower chance of getting a callback on an application in Europe? Don't African Americans have a much lower chance of getting a job in the USA? Why? Because of CVs and interviews. Because of humans. So stop saying that's nirvana.

What should we question?

Is this new technology amazing and flawless? Oh Hell NO

The problem with the critique we're getting now is that nobody is listening, because it's the wrong critique. Questioning what the balloon game measures without knowing how it measures it is like questioning vaccines because you don't understand them. It doesn't help address the real (potential) problems.

What questions should we be asking? There are three categories where stuff can go wrong.

First: the science behind the tests.

Second: the traits that are measured for the job at hand.

Third: the tests that are chosen to measure those traits.

Let me show what could (and does) go wrong in all three categories.

The science behind the tests

The science behind the tests differs a lot between test suppliers. They all claim scientific validation, but only if you know what to look for can you see whether their claims are true.

From what I've seen, the quality varies a lot, and you need someone who knows their stuff to check it. Some are good, some are less good, and some are just total crap.

There are some very bad actors in the world, as a documentary on the German company Retorio showed: changing your background to a bookcase gained you 'intelligence points', and the company even said: of course it does, that's how humans perceive it.

The traits measured for the job at hand

The biggest issue is: what do we actually measure, and is it important? Answering those questions takes a deep understanding of the job at hand, but it is essential.

How do we do a job analysis? How do we decide which qualities are necessary for the job? Who decides which traits are and aren't relevant? What data do we base this on?

If we are using data to do the job analysis, how do we prevent past bias from getting into the algorithm?
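
The article leaves the 'how' open, but one widely used sanity check here is the four-fifths rule from the US EEOC's Uniform Guidelines: compare selection rates across groups in the historical data before (and after) you train on it. A minimal sketch, with hypothetical numbers:

```python
def adverse_impact_ratios(outcomes):
    """Four-fifths rule check: a group whose selection rate falls below
    80% of the highest group's rate is a common red flag for adverse
    impact in the data you're about to learn from.

    outcomes maps group label -> (selected, applicants).
    """
    rates = {g: sel / total for g, (sel, total) in outcomes.items()}
    top = max(rates.values())
    return {g: {"ratio": r / top, "flagged": r / top < 0.8}
            for g, r in rates.items()}

# Hypothetical historical hiring data, for illustration only.
print(adverse_impact_ratios({
    "group_a": (40, 100),  # 40% selection rate -> benchmark
    "group_b": (24, 100),  # 24% -> ratio 0.6, flagged
}))
```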

What tests do we choose to measure the traits?

And finally, there are the tests we use to measure these traits. Much of the criticism in the podcast discussed above was about this point: whether the tests were valid. The questions that should have been asked are whether the traits measured were valid, AND whether the tests measured those traits, AND whether they did so in an unbiased way.

To show how this last piece can go terribly wrong, let me recount a conversation I recently had with a recruiter. She had no budget to hire an independent expert like me to help her (according to senior leadership it was her task: she was the recruiter, so she should know assessments...) but had no clue what she was doing.

She asked me what I thought of supplier X. I told her I knew X, that they weren't bad, and that they had some great case studies on selecting lawyers. She told me she loved their product and wanted to use it for her business: security guards. I asked whether she had done a job analysis for her security guard role; the answer was no. So I asked why she thought her security guards shared important traits with lawyers, and she said: the supplier said the technology is job agnostic. Technically the supplier didn't lie, but whether it's valid depends on what you want to measure. From what I know of the job, for security guards it's probably more about cognitive traits like attention to detail, while for lawyers it's more about psychometric traits, and you measure those in different ways with different tests. So the technology is job agnostic only for jobs whose primary differentiators are psychometric.

I then asked her if she realised their assessment was based on questionnaires, and she said yes. I then asked if she realised their questionnaire was written at what we Dutch call B3 level, the language level correlating with a master's degree, while her current team includes lots of people from bi-cultural backgrounds who speak Dutch as a second language, most of them at A3 level (three levels down, correlating with finishing primary school). They probably wouldn't understand the questions and would just fill something out. She told me the job only requires A3-level Dutch. When I asked if she had ever questioned the supplier on whether their test was at A3 level, it turned out she had never thought of that...
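
An aside of my own, not part of that conversation: even a crude automated proxy can surface this kind of mismatch before candidates get rejected by questions they can't parse. A real language-level check needs a proper instrument or an expert review, but here is a sketch of the idea, with made-up example sentences:

```python
import re

def crude_readability(text):
    """Very rough readability proxy: average sentence length and word
    length. This is NOT a real CEFR/taalniveau assessment -- it only
    illustrates the kind of check to demand from a supplier before
    rolling a questionnaire out to your actual audience.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-zÀ-ÿ']+", text)
    return {
        "words_per_sentence": len(words) / max(len(sentences), 1),
        "avg_word_length": sum(map(len, words)) / max(len(words), 1),
    }

# Compare a typical questionnaire item against the simpler phrasing
# your actual audience reads every day.
print(crude_readability(
    "I conscientiously persevere notwithstanding considerable adversity."
))
print(crude_readability("I keep going when things get hard."))
```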

So in this last mile there's a lot to be gained as well. Even if the test is valid, if you're using the wrong instrument for your audience, you will still go wrong. Again: questions that aren't asked.

Conclusion

Getting AI or testing in hiring right is a science. I believe in criticising it, and I love people asking questions and poking at it. But please, ask the right questions. And ask them of the right people.

  1. How did you ascribe the traits to the job? How was the job analysis done?
  2. What's the science behind the tools? Was the science done right?
  3. Does the tool actually measure the traits, and does it take into account all the other factors that might be relevant?

And last but not least: does it lead to a higher quality of hire and a fairer process? Because everywhere I've seen quality of hire increase, diversity has increased as well.

Start asking the right questions



Bas is a professional snoop in the world of recruitment and HR technology. He's tested dozens of different modern assessment tools and is willing to test any that will let him. He's also been researching the digital candidate experience in the Netherlands for over 15 years.

He organises events like Digitaal-Werven (Dutch) and webinars.

Marcel Leeman

Co-Founder & CEO @ Textmetrics

9 months ago

I am indeed often wondering if we are asking the right questions when it comes to this topic. This was an interesting read and I am curious what future policies are awaiting us.

Brett Morris

Cofounder at WhoHire SaaS | Strategic advisor on spatial software to Stonepunk Studios, a global innovator in VR.

3 years ago

Lots of good thinking to consider here, Bas. Central to it ALL is your question: what do we actually measure and is it important, i.e. is it actually relevant to performance in the job an individual is being hired for? Many, probably most, of the activities, measures, and assessments used in hiring processes have zero correlation with actual performance in the job being hired for, irrespective of whether the underlying tech supporting the tool has an AI component or not.

Robert Newry

Uncovering Human Potential with Cognitive Neuroscience | Scrap the CV advocate | AI in Assessment | Social Mobility champion | Psychometric testing

3 年

Bas - I love your very open and frank challenge to those that criticise AI, or in fact any new approach in hiring, to think more carefully about whether they are asking the right questions - which you then define very clearly. Above all I appreciate your openness to innovation as long as it's done in the right way. It's great to have another perspective on this important issue from someone who has no vested interest in a particular test or approach.
