GitHub - relari-ai/continuous-eval: Open-Source Evaluation for GenAI Application Pipelines

Feb 25, 2024 - github.com
The article introduces `continuous-eval`, an open-source package for granular, comprehensive evaluation of GenAI application pipelines. The package offers modularized evaluation, a comprehensive metric library, user feedback integration, and synthetic dataset generation. It can be installed as a PyPI package or from source, and requires at least one LLM API key to run LLM-based metrics.
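
As a rough sketch of that setup (the PyPI package name matches the repository; the Poetry step and the `OPENAI_API_KEY` variable name are assumptions, since any supported provider's key should work):

```bash
# Install the released package from PyPI
pip install continuous-eval

# Or install from source
git clone https://github.com/relari-ai/continuous-eval.git
cd continuous-eval
poetry install

# LLM-based metrics expect at least one LLM API key, e.g. in .env:
echo "OPENAI_API_KEY=sk-..." >> .env
```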

The article provides a detailed guide on how to use `continuous-eval`, including running a single metric, defining custom metrics, and running evaluation on pipeline modules. It also lists off-the-shelf metrics available for different modules and categories. The article concludes by providing resources for further learning and information about the project's license and usage-tracking policy.
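
As an illustration of running a single metric, here is a minimal sketch in the style of the repository's examples; the import path `continuous_eval.metrics.retrieval`, the `PrecisionRecallF1` metric, and the sample datum are taken from the project's documentation but may vary by version:

```python
from continuous_eval.metrics.retrieval import PrecisionRecallF1

# A single datum: retrieved chunks vs. the ground-truth context.
datum = {
    "question": "What is the capital of France?",
    "retrieved_context": [
        "Paris is the capital of France and its largest city.",
        "Lyon is a major city in France.",
    ],
    "ground_truth_context": ["Paris is the capital of France."],
    "answer": "Paris",
    "ground_truths": ["Paris"],
}

metric = PrecisionRecallF1()
print(metric(**datum))  # prints a dict of retrieval precision/recall/F1 scores
```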

Key takeaways:

  • `continuous-eval` is an open-source package for detailed, comprehensive evaluation of GenAI application pipelines.
  • It offers modularized evaluation, a comprehensive metric library, user feedback integration in evaluation, and synthetic dataset generation.
  • The package is distributed on PyPI, and running LLM-based metrics requires at least one LLM API key in `.env`.
  • You can define your own metrics by extending the `Metric` class and implementing the `__call__` method, as sketched after this list.
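
A hedged sketch of such a custom metric, assuming the `Metric` base class is importable from `continuous_eval.metrics.base` (the import path, the `AnswerLengthMetric` name, and the scoring logic are all illustrative, not the library's own example):

```python
from continuous_eval.metrics.base import Metric


class AnswerLengthMetric(Metric):
    """Hypothetical custom metric that rewards concise answers."""

    def __call__(self, answer: str, **kwargs):
        # Mirror the built-in metrics by returning a dict of name -> value;
        # **kwargs absorbs any other fields present in the datum.
        num_words = len(answer.split())
        return {"answer_num_words": num_words, "answer_is_concise": num_words <= 50}


# Usage on a single datum:
metric = AnswerLengthMetric()
print(metric(answer="Paris is the capital of France."))
```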