The study also found that ChatGPT stopped explaining its reasoning process, a behavior that had been present in March but had disappeared by June. This loss of transparency was also observed when the chatbot was asked to answer sensitive questions. The researchers concluded that while the technology may have become safer, it also provided less rationale for its answers. Given these unpredictable changes, they emphasized the importance of continuously monitoring the models' performance over time.
Key takeaways:
- A Stanford University study found that the June version of the high-profile A.I. chatbot ChatGPT performed worse on certain tasks than the March version.
- The study found wild fluctuations, known as drift, in the technology's ability to perform certain tasks, with the most notable results involving GPT-4's ability to solve math problems.
- James Zou, a Stanford computer science professor and one of the study's authors, noted that changing one part of the model can have unpredictable effects on other parts, and stressed the need to continuously monitor the models' performance over time.
- ChatGPT also stopped explaining its reasoning when answering sensitive questions, making the technology less transparent, according to the researchers.