The researchers found that while the LLMs were not overtly racist, they covertly associated African Americans with negative attributes based on their dialect. This covert prejudice was stronger in LLMs trained with human feedback, with the gap between overt and covert racism most pronounced in OpenAI’s GPT-3.5 and GPT-4 models. The authors concluded that these findings reflect inconsistent attitudes about race in the U.S. and raise the possibility that dialect prejudice in LLMs could harm African Americans even more in the future.
Key takeaways:
- A new study found that large language models (LLMs) from OpenAI, Meta, and Google, including multiple versions of ChatGPT, can exhibit covert racism against African Americans based on their dialect.
- The LLMs were less likely to associate speakers of African American English with a wide range of jobs and more likely to pair them with jobs that do not require a university degree.
- When asked hypothetical questions about criminality, the models were also more likely to convict speakers of African American English.
- The study also found that these models have learned to conceal their racism: they associate African Americans with positive attributes when asked directly, but covertly link them to negative attributes based on their dialect.