AI Models Like ChatGPT Can’t Analyze SEC Filings, New Study Finds

A study by the startup Patronus AI has found that AI models, including the popular ChatGPT, struggle to accurately analyze Securities and Exchange Commission (SEC) filings. Even the best-performing model, OpenAI's GPT-4-Turbo, achieved only 79% accuracy. The study highlights the challenges AI models face in regulated industries like finance, where accuracy and reliability are crucial.

Patronus AI developed a comprehensive test, FinanceBench, consisting of over 10,000 questions and answers drawn from SEC filings. The study evaluated four language models and found significant inaccuracies. Despite some models performing relatively well, the co-founders emphasized the need for continuous improvement in AI models to meet the required standards of accuracy and reliability, especially in regulated industries.

Key takeaways:

AI models struggle to accurately analyze Securities and Exchange Commission (SEC) filings, according to a study by Patronus AI. Even the best performer, OpenAI's GPT-4-Turbo, achieved only 79% accuracy.
The study highlights the challenges of using AI models in regulated industries like finance, where accuracy and reliability are crucial. The models often refused to answer or gave inaccurate answers containing information not present in the SEC filings.
Patronus AI developed a comprehensive test, FinanceBench, consisting of over 10,000 questions and answers from SEC filings, to set a "minimum performance standard" for language AI in the financial sector.
Despite the potential of language models like GPT in the finance industry, the co-founders of Patronus AI stress the importance of continuous improvement in AI models to meet the required standards of accuracy and reliability.
