Artificial intelligence, or AI, tools are great for handling tedious tasks, such as going through thousands of job applications and pulling out the most promising ones. So companies often use AI apps to shortlist candidates. But these tools tend to show bias in which applicants they rank as the best fit for a job, a new study finds.
In the study, scientists presented AI models with résumés. A résumé (REZ-oo-may) is a document that lists a person’s education, training and job history. It’s often the first thing a company reviews when judging a job applicant — long before meeting them face-to-face.
Scanning résumés for traits that might identify top candidates can be very time-consuming. That’s led some companies to turn the first stage of review over to AI. Specifically, they use a type of AI known as a large language model, or LLM. That can save companies time and money. Some people also say it makes the résumé reviews less biased, because the only data an AI app can consider are a candidate’s qualifications.
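To picture what that first-stage review might look like, here is a minimal sketch in Python. The prompt wording and the `query_llm` helper are hypothetical stand-ins, since real screening tools are closed-source and their actual prompts and code aren't public.

```python
# Hypothetical sketch of how an LLM-based screener might be prompted.
# query_llm stands in for a call to a real LLM API.

def query_llm(prompt: str) -> str:
    """Stand-in for a real LLM API call."""
    return "Candidate A"  # dummy reply so the sketch runs

def screening_prompt(job_description: str, resumes: dict[str, str]) -> str:
    """Bundle a job description and labeled résumés into one prompt."""
    parts = [f"Job description:\n{job_description}\n"]
    for label, text in resumes.items():
        parts.append(f"Résumé of {label}:\n{text}\n")
    parts.append("Which candidate best fits this job? Reply with one label.")
    return "\n".join(parts)

reply = query_llm(screening_prompt(
    "Sales manager needed; 5+ years of experience.",
    {"Candidate A": "8 years in retail sales.",
     "Candidate B": "6 years in business-to-business sales."},
))
print(reply)
```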
But a team at the University of Washington wondered if that was really true.
“We wanted to know whether LLMs are fair when they evaluate people’s résumés,” says Kyra Wilson. She is a researcher who studies the performance of AI systems. She and her colleagues asked three state-of-the-art AI tools to review résumés and pick the best candidates for different jobs.
The team collected more than 500 publicly available résumés and 500 job descriptions for the experiment. These jobs were spread across nine categories. They included heads of companies, sales managers, teachers, designers and more. Altogether, it was possible to make more than 3 million comparisons between these résumés and jobs.
For each job, the researchers gave an identical set of résumés to the AI models. The only differences between the documents were the names on them. When asked to rank the résumés by who was most suited for a job, all three models showed a bias in whom they preferred. It came down to the candidate’s name.
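To make that setup concrete, here is a minimal sketch of such a name-swap audit in Python. The `pick_best` function is a hypothetical stand-in for querying a real AI model, and the names below are placeholder examples, not ones from the study. The rest is just the bookkeeping: one résumé text, several names, and a tally of who gets ranked first.

```python
import random
from collections import Counter

def pick_best(job: str, labeled_resumes: list[tuple[str, str]]) -> str:
    """Hypothetical stand-in for the AI model under audit. A real audit
    would prompt an LLM with the job description and the name-labeled
    résumés, then parse which name it ranks first. Here we pick at
    random so the sketch runs."""
    return random.choice(labeled_resumes)[0]

def audit(job: str, resume_text: str, names: list[str]) -> Counter:
    """Show the model the SAME résumé under different names and tally
    how often each name is ranked first. An unbiased model should
    favor each name about equally often."""
    wins = Counter()
    for _ in range(1000):
        # Shuffle the presentation order so position in the prompt
        # can't masquerade as name bias.
        ordering = random.sample(names, len(names))
        labeled = [(name, resume_text) for name in ordering]
        wins[pick_best(job, labeled)] += 1
    return wins

# Example run: one résumé, four placeholder names.
tallies = audit(
    job="Sales manager",
    resume_text="10 years of sales experience; led a team of 12.",
    names=["Emily", "Lakisha", "Greg", "Jamal"],
)
print(tallies)  # Roughly even counts would suggest no name bias.
```

In the study's real version of this test, the tallies were far from even.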
AI biases
Names that sounded like those common among white people got top rankings 85 percent of the time. Names that sounded like those common among Black people were favored less than 10 percent of the time.
Nearly 90 percent of the time, the AI models preferred résumés from people with male-sounding names. Unless, that is, those male names were associated with Black people. These candidates were never favored.
Men and white people are generally preferred by the models, Wilson says. “So, one theory is that Black women would be the most harmed by them,” she adds. “However, we found that Black men actually have the worst outcomes when these models are used.” This means that future work to reduce these biases should especially focus on that group.
Most LLM-based AI tools for résumé screening are “closed-source.” This means the public, including researchers, has no access to the computer code used to create them. So outsiders can’t see how they work.
“It is well-established that language models exhibit societal biases,” says Marzyeh Ghassemi. She’s a computer scientist and engineer at the Massachusetts Institute of Technology in Cambridge. This paper, she says, identifies some patterns that could be used as a first step in checking the quality of LLM-based screening tools.
Clearly, there’s a need for more work in this space.
When it’s time for you to apply for a job, there’s a good chance that AI will play some role in screening whether you’d be a good fit. Here’s how that’s been changing, why AI is playing such a big role, and what it means to be evaluated by AI trained on biased materials.