The company has also created an evaluation scorecard for the most used models, detailing their hallucination rates. The scorecard will be updated regularly as new information becomes available and as LLMs are updated. Vectara plans to incorporate the capabilities of the HEM into its platform, attaching factual consistency scores to the answers it returns. The company also plans to develop its own summarization models that further reduce hallucination rates.
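To make the idea of a factual consistency score concrete, here is a toy sketch of how such an evaluator slots into a RAG pipeline: it takes a (source passage, generated answer) pair and returns a score in [0, 1], which can then be thresholded to flag likely hallucinations. The token-overlap heuristic and function names below are placeholders for illustration only; Vectara's HEM uses a trained model, not this heuristic.

```python
def consistency_score(source: str, answer: str) -> float:
    """Fraction of the answer's content words that also appear in the source.
    A real evaluator (like HEM) would use a learned classifier instead of
    this toy token-overlap heuristic."""
    source_tokens = set(source.lower().split())
    answer_tokens = [t for t in answer.lower().split() if len(t) > 3]
    if not answer_tokens:
        return 1.0
    supported = sum(1 for t in answer_tokens if t in source_tokens)
    return supported / len(answer_tokens)

def flag_hallucination(source: str, answer: str, threshold: float = 0.5) -> bool:
    """Flag an answer whose consistency score falls below the threshold."""
    return consistency_score(source, answer) < threshold

source = "Vectara released an open-source hallucination evaluation model for RAG systems."
grounded = "Vectara released an open-source evaluation model for RAG systems."
ungrounded = "The company raised funding from several venture investors yesterday."

print(flag_hallucination(source, grounded))    # answer is supported by the source
print(flag_hallucination(source, ungrounded))  # answer is not supported
```

However crude the scoring function, the surrounding shape is the same as in a production system: score every generated answer against its retrieved context, and surface or suppress answers based on a configurable threshold.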
Key takeaways:
- Vectara has launched an open-source Hallucination Evaluation Model (HEM) to assess how often a generative LLM hallucinates in Retrieval Augmented Generation (RAG) systems.
- The HEM can help enterprises evaluate the trustworthiness of their RAG systems and identify the best LLMs for their specific use cases.
- Vectara has also created an evaluation scorecard for the most used models, providing a FICO-like score for hallucinations in RAG systems.
- In the future, Vectara plans to integrate the HEM's capabilities into its platform and develop its own summarization models to further reduce hallucination rates.