The study also found that users often failed to identify errors in the bot's answers, or underestimated their severity, especially when the error was not readily verifiable or required an external IDE or documentation to check. The researchers attribute this to ChatGPT's authoritative style and its polite, comprehensive, textbook-like language. The authors suggest that Stack Overflow could improve by detecting toxicity and negative sentiment in comments and answers, improving the discoverability of its answers, and providing more specific guidelines to help answerers structure their responses.
Key takeaways:
- A study from Purdue University found that OpenAI's ChatGPT produces incorrect answers to software programming questions more than half the time, but its comprehensive and well-articulated responses still manage to convince a third of participants.
- The study also found that users often fail to identify errors in ChatGPT's answers, or underestimate their degree, unless the error is glaringly obvious.
- According to the study's linguistic and sentiment analysis, ChatGPT's answers are more formal, express more analytic thinking, show greater drive toward achieving goals, and exhibit less negative emotion than Stack Overflow answers.
- Stack Overflow's traffic has been affected by the surge of interest in ChatGPT, with an above-average traffic decrease observed in April, possibly attributable to developers trying GPT-4 after its release in March.