The study also revealed that some models refuse to answer "sensitive" questions more often than others, suggesting that these refusals are influenced by the implicit values of the models and the explicit values and decisions made by the organizations developing them. The researchers found that different models expressed opposing views on topics such as immigrant asylum in Germany and LGBTQ+ rights in Italy, possibly due to biased annotations. The study underscores the importance of rigorously testing AI models for cultural biases before deployment.
Key takeaways:
- Researchers from Carnegie Mellon, the University of Amsterdam and AI startup Hugging Face found that generative AI models respond inconsistently to questions relating to sensitive topics, reflecting biases embedded in the training data.
- The study tested five models, including Meta’s Llama 3 and Alibaba’s Qwen, using a dataset of questions and statements spanning multiple topic areas and languages, and measured how often each model refused to respond (a rough sketch of how such refusal rates might be tallied follows this list).
- Questions about LGBTQ+ rights triggered the most “refusals” from the models, though questions and statements about immigration, social welfare and disability rights also drew refusals at a high rate.
- The researchers call for more rigorous testing of AI models for cultural biases and the implementation of more comprehensive social impact evaluations that go beyond traditional statistical metrics.
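To make the evaluation idea above concrete, here is a minimal, hypothetical sketch of how per-topic refusal rates could be tallied. It is not the researchers’ actual harness: the `query_model` stub and the keyword heuristic are illustrative assumptions only.

```python
# Illustrative sketch only (not the study's evaluation harness): count the
# fraction of prompts a model refuses, grouped by topic. `query_model` is a
# hypothetical placeholder for whichever API the model under test exposes,
# and the keyword heuristic is a deliberate simplification.

REFUSAL_MARKERS = (
    "i can't answer",
    "i cannot answer",
    "i'm not able to help with",
)


def query_model(prompt: str) -> str:
    """Hypothetical stand-in: swap in a real call to the model under test."""
    raise NotImplementedError


def looks_like_refusal(response: str) -> bool:
    """Crude keyword check; a real evaluation would use a stronger classifier."""
    lowered = response.lower()
    return any(marker in lowered for marker in REFUSAL_MARKERS)


def refusal_rates(prompts_by_topic: dict[str, list[str]]) -> dict[str, float]:
    """Return the share of prompts refused for each topic."""
    return {
        topic: sum(looks_like_refusal(query_model(p)) for p in prompts) / len(prompts)
        for topic, prompts in prompts_by_topic.items()
    }
```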