Launch HN: Relari (YC W24) – Identify the root cause of problems in LLM apps

The founders of Relari have developed continuous-eval, an evaluation framework for GenAI systems that allows for component-level testing. The tool was created in response to the complexity they encountered while building a banking copilot, where each added component increased the difficulty of ensuring reliability. Continuous-eval allows users to programmatically describe their pipeline and modules, and select metrics for each module. The tool has been used by various companies to test complex pipelines such as finance copilots and enterprise search.
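A rough illustration of what component-level evaluation can look like is sketched below. It does not use the continuous-eval API; `Module`, `evaluate_pipeline`, `retrieval_recall`, and `exact_match` are hypothetical names invented for this example, and the two-stage retriever/generator pipeline is an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# A metric compares one module's output with its reference and returns a score.
Metric = Callable[[object, object], float]


def retrieval_recall(retrieved: List[str], expected: List[str]) -> float:
    """Fraction of expected documents that the retriever returned."""
    expected_set = set(expected)
    if not expected_set:
        return 1.0
    return len(expected_set & set(retrieved)) / len(expected_set)


def exact_match(answer: str, expected: str) -> float:
    """1.0 if the answer matches the reference (case-insensitive), else 0.0."""
    return 1.0 if answer.strip().lower() == expected.strip().lower() else 0.0


@dataclass
class Module:
    """Hypothetical description of one pipeline stage and its metrics."""
    name: str
    metrics: Dict[str, Metric] = field(default_factory=dict)


def evaluate_pipeline(modules, outputs, references):
    """Score every module separately so a failure can be traced to its stage."""
    report = {}
    for module in modules:
        report[module.name] = {
            metric_name: metric(outputs[module.name], references[module.name])
            for metric_name, metric in module.metrics.items()
        }
    return report


pipeline = [
    Module("retriever", {"recall": retrieval_recall}),
    Module("generator", {"exact_match": exact_match}),
]

# Illustrative data: the final answer is wrong and the retriever missed a document.
outputs = {"retriever": ["doc1", "doc4"], "generator": "The fee is $30."}
references = {"retriever": ["doc1", "doc2"], "generator": "The fee is $25."}

report = evaluate_pipeline(pipeline, outputs, references)
print(report)
```

Scoring each stage separately is what makes root-cause analysis possible: here the wrong final answer coincides with a retriever that found only half of the expected documents, which points to retrieval as the first place to look.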

Continuous-eval also features ensemble metrics that predict user feedback, giving developers a feedback loop from production data back to offline testing and development. The founders emphasize the importance of evaluating on a diverse dataset, and offer a synthetic data generation pipeline to help users get started quickly. They are seeking feedback on three areas: the modular framework, the use of user feedback, and testing with synthetic data.
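The summary does not say how the ensemble metrics are built. One plausible reading, sketched here purely as an assumption, is a small model that learns to combine several metric scores into a predicted thumbs-up probability from labeled production feedback. The metric names (faithfulness, relevance), the data, and the functions `fit_ensemble` and `predict_feedback` are hypothetical.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def fit_ensemble(rows, labels, lr=0.5, epochs=2000):
    """Fit logistic-regression weights over metric scores with batch gradient descent."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


def predict_feedback(weights, bias, scores) -> float:
    """Predicted probability that a user would rate this response positively."""
    return sigmoid(bias + sum(w * s for w, s in zip(weights, scores)))


# Made-up production samples: [faithfulness, relevance] scores and user feedback
# (1 = thumbs up, 0 = thumbs down).
rows = [[0.9, 0.8], [0.8, 0.9], [0.95, 0.7], [0.2, 0.3], [0.3, 0.1], [0.1, 0.4]]
labels = [1, 1, 1, 0, 0, 0]

weights, bias = fit_ensemble(rows, labels)

# Offline, the fitted ensemble can score new test outputs without waiting for users.
good = predict_feedback(weights, bias, [0.9, 0.9])
bad = predict_feedback(weights, bias, [0.1, 0.1])
print(f"predicted approval: good={good:.2f}, bad={bad:.2f}")
```

In this reading, the feedback loop is that the weights are refit as new production feedback arrives, so the offline score stays aligned with what users actually report.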

Key takeaways:

Relari has developed continuous-eval, an evaluation framework that allows for testing of GenAI systems at the component level, making it easier to identify and address issues.
Continuous-eval allows users to programmatically describe their pipeline and modules, and select metrics for each module, with 30+ metrics developed to cover various aspects of GenAI pipelines.
Relari's system also includes ensemble metrics that predict user feedback, providing developers with a feedback loop from production data to offline testing and development.
Relari also offers a synthetic data generation pipeline to help users get started quickly and get the most out of evaluation, emphasizing that a diverse dataset is needed for comprehensive and consistent assessment.
