
Patronus AI secures $17M to tackle AI hallucinations and copyright violations, fuel enterprise adoption

May 22, 2024 - venturebeat.com
San Francisco-based startup Patronus AI has raised $17 million in a Series A funding round to develop its automated evaluation platform for large language models (LLMs). The platform uses proprietary AI to identify errors in LLM outputs, such as hallucinations, copyright infringement, and safety violations. The funding round was led by Notable Capital and included participation from Lightspeed Venture Partners, former DoorDash executive Gokul Rajaram, Factorial Capital, and several unnamed tech executives.

Patronus AI's research has revealed significant deficiencies in leading LLMs' ability to answer fact-based questions accurately. The company's "FinanceBench" benchmark found that the best-performing model answered only 19% of financial queries correctly after reading an entire annual report. A separate experiment with the company's "CopyrightCatcher" API found that open-source LLMs reproduced copyrighted text verbatim in 44% of outputs. With the new funding, Patronus AI plans to expand its research, engineering, and sales teams and develop additional industry benchmarks.

Key takeaways:

  • Patronus AI, a San Francisco startup, has raised $17 million in Series A funding to develop an automated evaluation platform that can detect errors in large language models (LLMs).
  • The platform uses proprietary AI to identify issues such as hallucinations, copyright infringement, and safety violations in LLM outputs.
  • Patronus AI's research has revealed significant deficiencies in leading models' ability to answer fact-based questions accurately, with the best-performing model answering only 19% of financial queries correctly.
  • With the new funding, Patronus plans to expand its research, engineering, and sales teams and develop additional industry benchmarks, aiming to make automated evaluation of LLMs a standard requirement for enterprises deploying the technology.