The study also revealed that some models refuse to answer "sensitive" questions more often than others, suggesting that these refusals are influenced by the implicit values of the models and the explicit values and decisions made by the organizations developing them. The researchers found that different models expressed opposing views on topics such as immigrant asylum in Germany and LGBTQ+ rights in Italy, possibly due to biased annotations. The study underscores the importance of rigorously testing AI models for cultural biases before deployment.
Key takeaways:
- Researchers from Carnegie Mellon, the University of Amsterdam and AI startup Hugging Face found that generative AI models respond inconsistently to questions relating to sensitive topics, reflecting biases embedded in the training data.
- The study tested five models, including Meta’s Llama 3 and Alibaba’s Qwen, using a dataset of questions and statements spanning multiple topic areas and languages, and measured how often each model refused to respond (a rough sketch of how such refusal rates might be tallied follows this list).
- Questions about LGBTQ+ rights triggered the most “refusals” from the models, though questions and statements about immigration, social welfare and disability rights also drew refusals at a high rate.
- The researchers call for more rigorous testing of AI models for cultural biases and the implementation of more comprehensive social impact evaluations that go beyond traditional statistical metrics.
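To make the evaluation idea above concrete, here is a minimal, hypothetical sketch of how per-topic refusal rates could be tallied. It is not the researchers’ actual harness: the `query_model` stub and the keyword heuristic are illustrative assumptions only.

```python
# Illustrative sketch only (not the study's evaluation harness): count the
# fraction of prompts a model refuses, grouped by topic. `query_model` is a
# hypothetical placeholder for whichever API the model under test exposes,
# and the keyword heuristic is a deliberate simplification.

REFUSAL_MARKERS = (
    "i can't answer",
    "i cannot answer",
    "i'm not able to help with",
)


def query_model(prompt: str) -> str:
    """Hypothetical stand-in: swap in a real call to the model under test."""
    raise NotImplementedError


def looks_like_refusal(response: str) -> bool:
    """Crude keyword check; a real evaluation would use a stronger classifier."""
    lowered = response.lower()
    return any(marker in lowered for marker in REFUSAL_MARKERS)


def refusal_rates(prompts_by_topic: dict[str, list[str]]) -> dict[str, float]:
    """Return the share of prompts refused for each topic."""
    return {
        topic: sum(looks_like_refusal(query_model(p)) for p in prompts) / len(prompts)
        for topic, prompts in prompts_by_topic.items()
    }
```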