At a time of such profound change, diversity and inclusion matter more than ever. Human decisions about how AI systems are designed and deployed will shape access to work for years to come.
Now is not the time to erode progress. Employers need to do both: rapidly adjust to a changing employment landscape and ensure equality and fairness in recruitment.
Here are the four steps that HR professionals need to take to make sure that equality and fairness are maintained as AI systems are adopted in haste in response to the pandemic.
Step 1: Be clear on the top priorities
Hiring is a series of small, sequential decisions, and technologies play different roles at each point in the process. Automated predictions can steer ads, personalized recommendations can be targeted at particular demographics, initial assessments can be automated, candidates can be screened and prioritized, and chatbots, virtual interviews and game-based assessments can all be used.
AI can reduce reliance on traditional screening, potentially sidelining people’s stereotypes, but this is not guaranteed. Ethical evaluation and human-centered implementation are key, as is deciding the relative priority of your goals: reducing bias, increasing efficiency or decreasing risk.
Use this checklist to clarify your goals and priorities.
Step 2: Understand the risk of perpetuating bias from your side
AI learns from data, and data about the world is biased. If historical human bias is reflected in a dataset used to train AI, the AI will likely exhibit the same bias. And if the dataset isn’t representative, or is simply unbalanced, the AI will learn to predict things about some groups better than others.
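One way to make the last point concrete is to break a model's accuracy down by group instead of looking at a single overall figure. The following is a minimal sketch, not any vendor's method: the group labels, the record format and the numbers are all hypothetical, chosen only to show how an under-represented group can be predicted less reliably.

```python
from collections import defaultdict

def accuracy_by_group(records):
    """Return {group: (n_examples, accuracy)} for (group, actual, predicted) records."""
    counts = defaultdict(int)
    correct = defaultdict(int)
    for group, actual, predicted in records:
        counts[group] += 1
        if actual == predicted:
            correct[group] += 1
    return {g: (counts[g], correct[g] / counts[g]) for g in counts}

# Hypothetical evaluation records: (group, actual outcome, model prediction).
# Group A has 100 examples, group B only 10.
records = (
    [("A", 1, 1)] * 45 + [("A", 0, 0)] * 45 + [("A", 1, 0)] * 10
    + [("B", 1, 1)] * 6 + [("B", 0, 1)] * 4
)

report = accuracy_by_group(records)
for group, (n, acc) in sorted(report.items()):
    print(f"group {group}: n={n}, accuracy={acc:.2f}")
```

In this toy data the overall accuracy looks healthy because group A dominates, while group B, with a tenth of the examples, is predicted noticeably worse. A breakdown like this is a starting point for questions, not a verdict.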
As an HR professional who is looking to use AI for hiring, it’s your responsibility to deeply understand the data that are relevant to your choices – historical data, new data requirements, the groups within the data and the consequences of different decisions on different groups. The process of mitigating bias shouldn’t be left up to a vendor – it needs to be actively managed and governed by the people who know the most about your unique data.
Some hiring algorithms assess candidates’ suitability and personality based on videos and games, and by analyzing body language, speech patterns, mouse movements, eye tracking, tonality, emotional engagement and expressiveness. Hundreds of thousands of data points are gathered in a half-hour interview or online game-playing exercise.
This means that it’s important to understand how the AI learns and on what basis it classifies different behaviors, measures, data points and features. There have been serious questions about the science behind many systems – while AI is excellent at finding correlation, it isn’t able to establish causality. Many such associations have questionable causal links to actual job performance. As an HR professional, you should expect to have to explain and justify the basis for decisions. This means understanding how the AI makes decisions and how people make decisions as a result. Human factors such as automation bias come into play. People can be unduly influenced by computerized recommendations and a false sense of precision and objectivity. Even small differences in score, for example, can make a big difference in how people perceive a ranked list.
Use this tool to screen how much bias your company may bring to the AI development and implementation process and to assess the risk.
Step 3: Assess vendors
The most common assessment types are questions, algorithmic video analysis and game play. Video and gaming capture data outside of human perception and provide rich data sources for AI analysis.
We keep track of more than a dozen vendors, such as HireVue and pymetrics. We focus most on those whose novel methods rely on AI operating outside of human awareness, such as video interviews and gaming.
Many vendors claim to help companies reduce bias by promising precision and specificity. Paradoxically, these claims have resulted in people believing vague claims of de-biasing and fairness. Without access to models or the data that the models operate with, it’s very difficult to assess vendor claims.
Almost all vendors are less than clear on their validation methods. It will be up to you to ask the right questions – how is data selected for validation, how does validation stay valid over time, how will validation be tailored for your unique needs?
Bias testing in hiring tools is almost always opaque and performed internally by companies. There is no external validation, which means it’s very difficult to verify any vendor’s claims. As a customer, it’s therefore important to pressure vendors to test and validate more explicitly and to develop enhanced de-biasing features over time. But don’t be fooled into treating “de-biasing” as a purely technical step. Because so many protected features are entwined with predictors of success, even the best technical de-biasing may not be enough to promote equity.
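When asking a vendor how they test for bias, it helps to know what a basic check looks like. A common first-pass test compares selection rates across groups. The sketch below assumes hypothetical group names and counts, and uses the 0.8 cut-off from the "four-fifths" rule of thumb in US employment guidelines purely as an illustration. Passing such a check does not show that a tool is fair.

```python
def selection_rates(outcomes):
    """outcomes: {group: (n_selected, n_applicants)} -> {group: selection rate}."""
    return {g: selected / total for g, (selected, total) in outcomes.items()}

def impact_ratios(outcomes):
    """Ratio of each group's selection rate to the highest group's rate."""
    rates = selection_rates(outcomes)
    best = max(rates.values())
    if best == 0:
        raise ValueError("no candidates were selected in any group")
    return {g: rate / best for g, rate in rates.items()}

# Hypothetical screening outcomes: (selected, applicants) per group.
outcomes = {"group_x": (50, 100), "group_y": (30, 100)}

ratios = impact_ratios(outcomes)
# Illustrative threshold only (assumption): flag ratios below 0.8.
flagged = [g for g, ratio in ratios.items() if ratio < 0.8]

for group, ratio in ratios.items():
    print(f"{group}: impact ratio {ratio:.2f}")
print("flagged for review:", flagged)
```

Here group_y is selected at 60% of the rate of group_x and would be flagged for closer review. Useful vendor questions follow directly: which groups were compared, on what data, and how often the check is rerun.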
The key factors to consider are the type of assessment offered, the training data and prediction targets, the validation claim and references to bias or de-biasing.
Use this list of questions to assess vendors.
Step 4: Organize for audit
AI can be used to source candidates, to screen candidates, to determine which candidates to invite for an interview and to inform selection. AI systems used in any one of these four stages can introduce bias, unfairness, and discrimination. For this reason many auditing tools are designed to present findings which are easy for a human to interpret. In the context of AI, this is called interpretability or explainability.
The good news is that there are tools designed specifically for this task.
The bad news is that they are not easy to use unless you’re a data scientist and, even then, are easily misused. This means that, as the leader responsible, you will be flying blind unless you prepare for audit from the start.
There are more than a dozen different auditing tools, some provided by vendors and integrated into their systems. They require expertise to use and interpret.
The worse news is that there is no way to automate fairness. There are several crucial limitations to existing tools. Some of these are about the definitions applied by the tools themselves, but the biggest gap is the limit of auditing without a broader ethical framework for evaluating equality. There are upwards of 20 different statistical definitions of fairness and many are mutually incompatible; that is, you can’t satisfy them all at once.
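A small worked example shows why fairness definitions can collide. The sketch below uses made-up numbers for two groups that differ in the share of qualified applicants, and compares three commonly cited measures: selection rate, true-positive rate (qualified candidates who are selected) and false-positive rate (unqualified candidates who are selected). The function and scenario names are hypothetical.

```python
def fairness_measures(sel_qualified, sel_unqualified, qualified, unqualified):
    """Return (selection rate, true-positive rate, false-positive rate) for one group."""
    selection_rate = (sel_qualified + sel_unqualified) / (qualified + unqualified)
    true_positive_rate = sel_qualified / qualified
    false_positive_rate = sel_unqualified / unqualified
    return selection_rate, true_positive_rate, false_positive_rate

# Hypothetical pools of 100 applicants each:
# group A has 50 qualified applicants, group B has 20.

# Scenario 1: a screener that selects exactly the qualified applicants.
a_exact = fairness_measures(50, 0, 50, 50)
b_exact = fairness_measures(20, 0, 20, 80)

# Scenario 2: force group B's selection rate up to match group A's.
# Only 20 qualified applicants exist, so 30 unqualified ones must be selected.
b_parity = fairness_measures(20, 30, 20, 80)

print("A, exact screener: ", a_exact)
print("B, exact screener: ", b_exact)
print("B, equal selection:", b_parity)
```

In scenario 1 both groups have identical true-positive and false-positive rates, but the selection rates differ (0.5 versus 0.2). In scenario 2 the selection rates match, but group B's false-positive rate rises to 0.375 while group A's stays at zero. With these numbers no screener can equalize all three measures, so someone has to decide which measure matters most.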
But back to the good news.
It is possible to both be fast in deployment and fair in implementation. With the right group of people and the right process, important choices can be made quickly and efficiently. These include decisions around:
- data: for example, what limits do you want to place on data gathering?
- model training: for example, how frequently do you want to retrain, or do you want to use a pre-trained model?
- target selection: for example, how do you want to define “cultural fit”?
- acceptable failure modes: for example, would you rather miss some good candidates or be inundated with too many bad ones?
- fairness criteria: for example, do you want to test sensitive groups by group fairness or treatment equality?
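The choice about acceptable failure modes can be made tangible with a simple threshold exercise. The sketch below is illustrative only: the scores, the "good candidate" labels and the thresholds are invented, and real systems rarely have ground-truth labels this clean.

```python
def failure_counts(candidates, threshold):
    """Count (missed good candidates, bad candidates passed) at a score threshold."""
    missed_good = sum(1 for score, good in candidates if good and score < threshold)
    passed_bad = sum(1 for score, good in candidates if not good and score >= threshold)
    return missed_good, passed_bad

# Hypothetical (score, is_good_candidate) pairs.
candidates = [
    (0.95, True), (0.80, True), (0.75, False), (0.60, True),
    (0.55, False), (0.40, False), (0.35, True), (0.20, False),
]

results = {t: failure_counts(candidates, t) for t in (0.3, 0.5, 0.7)}
for threshold, (missed_good, passed_bad) in results.items():
    print(f"threshold {threshold}: missed good={missed_good}, passed bad={passed_bad}")
```

Raising the threshold lets fewer bad candidates through but misses more good ones, and lowering it does the reverse. Where to set it is a business and ethical decision, and the same exercise can be repeated per group to see who bears the cost of each choice.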
Where we can help
We help with vendor assessment, selection and adoption of AI-powered recruitment systems with a focus on bias, fairness and equality. Our expertise is based on years of experience in understanding how AI works, both at a technical level and in terms of the human response.