Running the benchmark involves cloning the repository, preparing test data, setting the API keys for the models under test, and executing the benchmark to generate results. Supported models include both closed-source and open-source LLMs as well as cloud OCR providers, each of which requires specific environment variables. The project also offers a benchmark dashboard for visualizing test results and is licensed under the MIT License.
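As a hedged illustration of the setup step, a small pre-flight script like the one below could confirm that the environment variables for the providers you intend to test are set before a run. The variable names shown (`OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, `GOOGLE_GENERATIVE_AI_API_KEY`) are common provider defaults and are assumptions here, not the repository's confirmed list; consult the README for the authoritative names per model.

```python
import os
import sys

# Hypothetical mapping of providers to the environment variables they need.
# The exact names are illustrative assumptions, not the repo's official list.
REQUIRED_KEYS = {
    "openai": ["OPENAI_API_KEY"],
    "anthropic": ["ANTHROPIC_API_KEY"],
    "gemini": ["GOOGLE_GENERATIVE_AI_API_KEY"],
}

def check_keys(providers):
    """Return the env vars that are missing for the selected providers."""
    missing = []
    for provider in providers:
        for var in REQUIRED_KEYS.get(provider, []):
            if not os.environ.get(var):
                missing.append(f"{provider}: {var}")
    return missing

if __name__ == "__main__":
    missing = check_keys(sys.argv[1:] or list(REQUIRED_KEYS))
    if missing:
        print("Missing environment variables:")
        print("\n".join(f"  {m}" for m in missing))
        sys.exit(1)
    print("All required API keys are set.")
```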
Key takeaways:
- The Omni OCR Benchmark is a tool for comparing OCR and data extraction capabilities of various large multimodal models, focusing on text and JSON extraction accuracy.
- The evaluation datasets and methodologies are open source, and contributions that expand coverage to additional providers are encouraged.
- JSON accuracy is measured using a modified json-diff, while text similarity is evaluated using Levenshtein distance (see the sketch after this list).
- The benchmark supports both open-source and closed-source LLMs, as well as cloud OCR providers, with specific models and required environment variables listed for each.
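To make the two metrics concrete, here is a minimal Python sketch of how they can be computed. The Levenshtein similarity is the standard edit-distance formulation; the `json_accuracy` function is a simplified field-by-field comparison that only approximates what a modified json-diff would do, so treat both as illustrations rather than the benchmark's actual scoring code.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def text_similarity(expected: str, actual: str) -> float:
    """Normalize edit distance into a 0..1 similarity score."""
    if not expected and not actual:
        return 1.0
    return 1.0 - levenshtein(expected, actual) / max(len(expected), len(actual))

def json_accuracy(expected: dict, actual: dict) -> float:
    """Simplified stand-in for a json-diff score: the fraction of expected
    top-level fields whose values match exactly in the actual output."""
    if not expected:
        return 1.0
    matches = sum(1 for k, v in expected.items() if actual.get(k) == v)
    return matches / len(expected)

# Example usage with toy ground truth and model output.
truth = {"invoice_number": "INV-001", "total": 42.5}
prediction = {"invoice_number": "INV-001", "total": 40.0}
print(text_similarity("Total: 42.50", "Total: 42.S0"))  # ~0.92 (one substitution)
print(json_accuracy(truth, prediction))                 # 0.5 (one of two fields match)
```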