Intro to Bias in AI
In jobdescription.ai [0] I have the challenge of making job descriptions gender-neutral. I tested ten job descriptions with the Jobvite tool, and the results showed zero bias, but then I started researching gender bias tools further and found a study and an article about gender bias [1,2].
Study [1] suggests that men and women decode wording differently. For instance, women felt that job adverts with masculine-coded language were less appealing and that they belonged less in those occupations. Some masculine-coded words are "challenging" and "lead", while some feminine-coded words are "support" and "commitment".
That is not to imply that men lack the ability to be supportive or collaborative, or that women lack leadership or the ability to take on challenges. "But, based on data analytics on the kinds of jobs men and women apply for, research shows that the adjectives matter."
Article [2] supports the study and adds that "Many women won’t apply for a job unless they meet almost all of the listed requirements", so the list of requirements matters as well.
I plan to research more to better understand gender bias in terms of wording before implementing tools to create a feedback loop that improves the algorithm.
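As a rough sketch of what such a tool's word-coding check might look like: the stems below are a tiny subset I made up for demonstration, not the study's actual word lists, and a real tool would need the full lists plus smarter matching.

    import re

    # Tiny illustrative subsets only -- not the study's actual word lists.
    # Stems are used so that e.g. "challeng" matches "challenge"/"challenging".
    MASCULINE_CODED = ["challeng", "lead", "compet", "decisi", "independen"]
    FEMININE_CODED = ["support", "commit", "collaborat", "interperson", "understand"]

    def count_coded(text, stems):
        words = re.findall(r"[a-z]+", text.lower())
        return sum(1 for w in words if any(w.startswith(s) for s in stems))

    def coding_summary(job_ad):
        masc = count_coded(job_ad, MASCULINE_CODED)
        fem = count_coded(job_ad, FEMININE_CODED)
        tone = ("masculine-coded" if masc > fem
                else "feminine-coded" if fem > masc
                else "neutral or balanced")
        return {"masculine": masc, "feminine": fem, "tone": tone}

    ad = ("We need a competitive, decisive leader who thrives on "
          "challenging deadlines and independent work.")
    print(coding_summary(ad))
    # -> {'masculine': 5, 'feminine': 0, 'tone': 'masculine-coded'}

A feedback loop could then suggest balanced alternatives for whichever side dominates, rather than flagging one kind of wording as inherently wrong.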
[0] https://www.jobdescription.ai
[1] http://gender-decoder.katmatfield.com/static/documents/Gauch...
[2] https://www.forbes.com/sites/hbsworkingknowledge/2016/12/14/...
edit: to provide more information instead of links with no context
In the current age of post-truth this may be considered a heresy, but how about, you know...
Telling the truth instead of manipulating people? Say that it is challenging if it is. Say that it is supportive if it is. List the requirements that are actually required?
Who said I want to manipulate the truth? "Challenging" does not mean the job requires a man; the study was about interpretation, not intention.
When I mentioned gender-neutral job descriptions, I meant ads that balance this gender-coded language rather than eliminate either kind of it. Bias towards either gender is gender bias.
And I agree with your point regarding requirements; that's why the article recommends writing the actual requirements instead of a long list of things that are not part of the job.
Maybe it helps to not think about people as statistics. You could write something like "Do you doubt you are right for this job? Please apply anyways, we'd like to know about you!". The problem with that, of course, is that you cannot meet every applicant. So after all, maybe it's not the wording but the position's description that's the problem.
Isn't the real problem in recruiting that a ton of people who have nothing to do with the job requirements apply?
"Farmer, 45 years old, never finished high school" applies to image processing engineer position.
Is this a common issue for you?
In my experience, the most common 'doesn't match the job requirements' issue is a lack of experience (e.g. straight out of school and applying for a senior position).
And in that case I don't really blame them for trying.
I was exaggerating, but yes, there are also big mismatches. I think this tends to happen more with big job sites.
This is more of a problem with men than women, in my experience.
I've even toyed with the idea of compensating for this by having two sets of job ads -- one with very high requirements, and one with just the bare minimum. Then accept applicants for interview from each based on the applicant's statistical propensity to exaggerate their abilities.
What do you mean by "statistical propensity to exaggerate their abilities"? How can you measure that without already knowing enough to just select applicants directly without considering which ad they responded to or who exaggerated more?
There will be a great number of well-qualified women who do not respond at all to more demanding ads. Less demanding ads catch these, but also catch a large number of under-qualified men.
It's a precision-recall problem, and by selecting differently in the two situations I speculate that one could get more well-qualified candidates with less effort.
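Purely to illustrate the precision/recall framing (every number below is invented, not from any study), the tradeoff between the two ads might look something like this:

    # Hypothetical applicant pools for the two ads -- all numbers are made up.
    pools = {
        "demanding ad":    {"qualified": 20, "unqualified": 10},
        "bare-minimum ad": {"qualified": 35, "unqualified": 120},
    }
    qualified_in_audience = 60  # hypothetical qualified people who saw either ad

    for name, pool in pools.items():
        applicants = pool["qualified"] + pool["unqualified"]
        precision = pool["qualified"] / applicants          # how little screening effort is wasted
        recall = pool["qualified"] / qualified_in_audience  # how many qualified people we reach
        print(f"{name}: precision={precision:.2f}, recall={recall:.2f}")

    # demanding ad:    precision=0.67, recall=0.33 (little wasted effort, misses many)
    # bare-minimum ad: precision=0.23, recall=0.58 (reaches more, much more screening)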
But how will you distinguish the well-qualified female applicants from the large number of under-qualified men? Surely not by their sex, which is probably illegal and certainly unethical.
The legal question is definitely interesting, but I don't see how it would be unethical to use the most powerful set of signals available. The situation seems to me a bit similar to how grocery stores and cafes package what's essentially the same product under a large range of price points to let the more price-sensitive customers self-select.
If I find anything unethical, it's knowingly continuing the practice of writing job postings that put one group of people at a clear and documented disadvantage.
Or you could put out an honest job listing based on what you want and stop bamboozling yourself with sexist assumptions.
And what is the difference between the wording and the position's description?
Wording, as it is described here, seems to be meaningless.
HR people in general cannot understand the position's requirements. That's why they cling so much to technical details. "What tools do you use? Clang 10.0.1? Ok!" - "Position requires at least 5 years of experience with Clang 10.0.1."
Words like "challenging", "commitment", etc. are just meaningless, interchangeable filler words. Otherwise, no HR person would dare change them. I tend to think about them the same way as fonts.
Is this work for some sort of legal compliance reason, or just to be more fair or effective? If the latter, perhaps defining the groups as genders is handicapping yourself. A lot of the things you mentioned are more like personality differences that happen to be correlated with gender. I guess it's not up to you to change the scope of your work, but grouping by more accurate people-types than genders sounds like it would be more ethical.
This comment is motivated by personal experience. I'm a man but I used to very much "won’t apply for a job unless [I] meet almost all of the listed requirements". It felt great early on because I almost always got hired for any job I applied for! I'd guess that the vast majority of jobs I've applied for in my entire life have at least reached interview stage. But I also haven't got far in my career and fear it may be over for good now because I was too cautious and underconfident. Only recently, I've learned to disregard all the "preferred" criteria and apply to interesting jobs if I have all of the "required" or "must" criteria. But now I wonder if even that's being too strict.
I am doing it to be fair, to be effective, and to help companies improve their job descriptions. But I haven't decided yet which approach to take, mainly because, as you mentioned, these points could just be personality differences, and I appreciate that the study I mentioned might not have the complete picture, so I need to research more.
As for your point regarding "meeting all requirements", I can relate to it. It supports the case for listing only the absolute requirements to get more applications, instead of copy/pasting everything from other job descriptions. That improvement in itself makes it a win-win.
Bias of various forms in the datasets we use can absolutely be a big issue, and this is a pretty good summary of some of those areas. However, I think it's important to look beyond just the data and also look into the assumptions and choices we make regarding models, performance metrics, etc.
I came across a good Twitter thread[1] explaining some of these other types of bias -- a lot of them come down to various ways in which model decisions end up impacting performance on the "long tail" of data (i.e., the less frequent categories and groups) long before they impact the bulk of the distribution. This means overall performance may be minimally impacted (or even improved), but performance for subgroups can be drastically reduced.
Anyway, the thread is definitely worth a read, and it links to many sources for further reading.
[1] https://twitter.com/sarahookr/status/1361373527861915648
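As a toy illustration of that last point (entirely synthetic data and invented accuracy numbers), overall accuracy can stay flat or even improve while a small subgroup gets much worse:

    import numpy as np

    # Synthetic example: a majority group dominates, a minority group sits in the long tail.
    rng = np.random.default_rng(0)
    groups = np.array(["majority"] * 950 + ["minority"] * 50)

    # Model A: ~90% accurate on everyone.
    correct_a = rng.random(groups.size) < 0.90
    # Model B: ~93% accurate on the majority, ~60% on the minority.
    correct_b = rng.random(groups.size) < np.where(groups == "majority", 0.93, 0.60)

    for name, correct in [("Model A", correct_a), ("Model B", correct_b)]:
        overall = correct.mean()
        minority = correct[groups == "minority"].mean()
        print(f"{name}: overall={overall:.1%}, minority subgroup={minority:.1%}")

    # Model B tends to look slightly better overall (~91% vs ~90%) while being
    # far worse on the minority subgroup (~60% vs ~90%).

The aggregate metric alone would pick Model B, which is exactly the failure mode the thread describes.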
Completely agree. There are three primary forms of bias: Human Bias, Data Bias, and Algorithmic Bias. A better solution is to improve existing machine learning models with established techniques (e.g., domain adaptation, domain generalization, discovering latent domains), which improves overall performance.
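For concreteness, one common flavour of domain adaptation is importance weighting for covariate shift: train a classifier to distinguish source from target data, then upweight the source examples that look target-like. This is only a minimal sketch on synthetic data, not the specific methods mentioned above:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    # Synthetic source data (labelled) and target data (unlabelled, shifted distribution).
    rng = np.random.default_rng(0)
    X_source = rng.normal(loc=0.0, scale=1.0, size=(1000, 5))
    y_source = (X_source[:, 0] + 0.5 * X_source[:, 1] > 0).astype(int)
    X_target = rng.normal(loc=0.5, scale=1.0, size=(500, 5))

    # 1. Train a classifier to tell source (0) from target (1) examples.
    X_domain = np.vstack([X_source, X_target])
    d_domain = np.concatenate([np.zeros(len(X_source)), np.ones(len(X_target))])
    domain_clf = LogisticRegression(max_iter=1000).fit(X_domain, d_domain)

    # 2. Importance weight for each source example: p(target|x) / p(source|x).
    p_target = domain_clf.predict_proba(X_source)[:, 1]
    weights = p_target / np.clip(1.0 - p_target, 1e-6, None)

    # 3. Train the task model on source data, weighted toward target-like examples.
    task_clf = LogisticRegression(max_iter=1000).fit(X_source, y_source, sample_weight=weights)

The same per-subgroup evaluation from the thread above should still be applied afterwards, since reweighting alone does not guarantee the long tail is served well.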