The authors also ran their own test, using ChatGPT to grade several thousand LinkedIn profiles of people who had practiced on their interviewing.io platform. They found that while ChatGPT’s verdicts correlated with how those candidates performed in real technical screens, the tool was only slightly better than a random guess. They concluded that while ChatGPT does not appear to have racial bias when judging engineers’ profiles, it’s not particularly good at the task either. They warn that AI is not a magic pill for effective candidate filtering: these tools are flawed in the same ways humans are, and they just do the wrong thing faster and at scale.
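The article doesn’t spell out the scoring math, but one standard way to quantify “only slightly better than a random guess” is ROC AUC, where 0.5 is coin-flip performance. A minimal sketch, using invented stand-in data (interviewing.io hasn’t published its raw numbers):

```python
# Hypothetical sketch: measuring whether a grader's scores beat random guessing.
# All data here is invented; this is not interviewing.io's actual methodology.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Stand-in data: 1 = passed a real technical screen, 0 = failed, plus a
# 1-10 "profile grade" that is only weakly related to the outcome.
passed = rng.integers(0, 2, size=2000)
grades = np.clip(rng.normal(5 + 0.4 * passed, 2.0), 1, 10)

# ROC AUC: 0.5 means the grades are no better than a coin flip at ranking
# passing candidates above failing ones; 1.0 means perfect ranking.
auc = roc_auc_score(passed, grades)
print(f"AUC = {auc:.3f}  (0.5 = random guessing)")
```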
Key takeaways:
- The article discusses a Bloomberg study that claimed OpenAI’s GPT showed racial bias when reviewing resumes. The authors found that Bloomberg’s study did not run statistical significance testing, and their own re-analysis of the data showed no racial bias (see the significance-testing sketch after this list).
- Despite showing no racial bias, ChatGPT is not good at judging resumes, the authors found. It overestimates the importance of candidates’ pedigrees, such as whether they’ve worked at a top company or attended a top school, which can disadvantage candidates from non-traditional backgrounds.
- The authors ran their own test, using ChatGPT to grade LinkedIn profiles of people who had practiced on their platform, interviewing.io. ChatGPT’s grades turned out to be only slightly better than random guessing.
- ChatGPT consistently overestimated the pass rates of engineers with top schools and top companies on their resumes, and underestimated the performance of candidates without those elite credentials. That is a bias against non-traditional candidates, and the authors argue ChatGPT’s judgment is nowhere near accurate enough to be codified into AI hiring tools (a subgroup calibration sketch follows the list).
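On the significance-testing point: the article doesn’t reproduce Bloomberg’s tables, but the kind of check the authors say was missing looks roughly like the following. All counts are invented for illustration:

```python
# Hypothetical sketch of the significance test the authors say Bloomberg's
# study lacked: given top-rank counts per demographic group, test whether
# the differences could plausibly be due to chance. Counts are invented.
from scipy.stats import chi2_contingency

# Rows: demographic groups; columns: [ranked top, not ranked top].
counts = [
    [270, 730],
    [255, 745],
    [248, 752],
    [245, 755],
]

chi2, p_value, dof, _ = chi2_contingency(counts)
print(f"chi2 = {chi2:.2f}, p = {p_value:.3f}")
# A large p-value (e.g. > 0.05) means the observed spread in rates is
# consistent with random noise, i.e. no statistically detectable bias.
```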
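And on the pedigree bias: a hedged sketch of a subgroup calibration check that would surface it, comparing ChatGPT’s predicted pass rates against actual outcomes split by credential group. Column names and data are invented, not the authors’ actual dataset:

```python
# Hypothetical calibration check: does the grader systematically over-predict
# for elite-pedigree candidates and under-predict for everyone else?
import pandas as pd

df = pd.DataFrame({
    "elite_pedigree": [True, True, True, False, False, False],
    "predicted_pass": [0.80, 0.75, 0.85, 0.30, 0.25, 0.35],  # grader's estimate
    "actually_passed": [1, 0, 1, 1, 0, 1],                   # real screen outcome
})

calibration = df.groupby("elite_pedigree").agg(
    mean_predicted=("predicted_pass", "mean"),
    actual_rate=("actually_passed", "mean"),
)
# A pedigree-biased model shows mean_predicted > actual_rate for the elite
# group and mean_predicted < actual_rate for non-traditional candidates.
print(calibration)
```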