The company has also created an evaluation scorecard for the most used models, detailing their hallucination rates. The scorecard will be updated regularly as new information becomes available and as LLMs are updated. Vectara plans to incorporate the capabilities of the HEM into its platform, attaching factual consistency scores to the answers it returns. The company also plans to develop its own summarization models that further reduce hallucination rates.
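To make the idea of a factual consistency score concrete, here is a toy sketch of how such an evaluator slots into a RAG pipeline: it takes a (source passage, generated answer) pair and returns a score in [0, 1], which can then be thresholded to flag likely hallucinations. The token-overlap heuristic and function names below are placeholders for illustration only; Vectara's HEM uses a trained model, not this heuristic.

```python
def consistency_score(source: str, answer: str) -> float:
    """Fraction of the answer's content words that also appear in the source.
    A real evaluator (like HEM) would use a learned classifier instead of
    this toy token-overlap heuristic."""
    source_tokens = set(source.lower().split())
    answer_tokens = [t for t in answer.lower().split() if len(t) > 3]
    if not answer_tokens:
        return 1.0
    supported = sum(1 for t in answer_tokens if t in source_tokens)
    return supported / len(answer_tokens)

def flag_hallucination(source: str, answer: str, threshold: float = 0.5) -> bool:
    """Flag an answer whose consistency score falls below the threshold."""
    return consistency_score(source, answer) < threshold

source = "Vectara released an open-source hallucination evaluation model for RAG systems."
grounded = "Vectara released an open-source evaluation model for RAG systems."
ungrounded = "The company raised funding from several venture investors yesterday."

print(flag_hallucination(source, grounded))    # answer is supported by the source
print(flag_hallucination(source, ungrounded))  # answer is not supported
```

However crude the scoring function, the surrounding shape is the same as in a production system: score every generated answer against its retrieved context, and surface or suppress answers based on a configurable threshold.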
Key takeaways:
- Vectara has launched an open-source Hallucination Evaluation Model (HEM) to assess how often a generative LLM hallucinates in Retrieval Augmented Generation (RAG) systems.
- The HEM can help enterprises evaluate the trustworthiness of their RAG systems and identify the best LLMs for their specific use cases.
- Vectara has also created an evaluation scorecard for the most used models, providing a FICO-like score for hallucinations in RAG systems.
- In the future, Vectara plans to integrate the HEM's capabilities into its platform and develop its own summarization models to further reduce hallucination rates.