Arthur releases open source tool to help companies find the best LLM for a job

Arthur, a machine learning monitoring startup, has launched Arthur Bench, an open-source tool designed to help users identify the most suitable large language model (LLM) for their specific data set. The tool allows users to test and measure the performance of different prompts against various LLMs, enabling them to make informed decisions about which model best suits their needs.

Alongside the open-source version, Arthur will offer a SaaS version for customers with larger test requirements or those who prefer not to manage the open-source tool themselves. The launch follows the May release of Arthur Shield, an LLM firewall designed to detect model hallucinations and protect against toxic information and data leaks.

Key takeaways

Arthur, a machine learning monitoring startup, is releasing Arthur Bench, an open-source tool designed to help users find the best LLM for a specific set of data.
The tool lets users test and measure how different prompts perform against different LLMs, potentially helping them make better decisions about which model is best for their particular use case.
Arthur Bench will also be available as a SaaS version for customers who have larger test requirements or don't want to manage the open-source version.
The launch follows the May release of Arthur Shield, a tool designed to detect hallucinations in models while protecting against toxic information and private data leaks.

Arthur releases open source tool to help companies find the best LLM for a job | TechCrunch
