Going forward, the benchmark will be updated when PRs with new performance numbers are submitted. A PR should include a description of the changes to a solution script or of the version update, along with new entries in the `time.csv` and `logs.csv` files. The entries will be verified on a different c6id.metal instance, and if the variance between the submitted and re-run numbers is limited, the PR will be merged and the results updated. DuckDB is currently the fastest library for both join and group by queries at almost every data size.
Key takeaways:
- The H2O.ai db-benchmark has been updated with new results, and the AWS EC2 instance used for benchmarking has been changed to a c6id.metal to improve repeatability and fairness across libraries.
- DuckDB is the fastest library for both join and group by queries at almost every data size.
- The benchmark was re-run on a c6id.metal machine to avoid issues with network storage and noisy neighbors that were present in the previous setup.
- Moving forward, the benchmark will be updated when PRs with new performance numbers are provided, and these entries will be verified using a different c6id.metal instance.
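
The verification step could be sketched roughly as follows. This is an illustration only, not the benchmark's actual tooling: the function name `find_outliers`, the `(solution, question, data)` key layout, and the 10% tolerance are all assumptions, since the text only says the variance should be "limited".

```python
def find_outliers(submitted, rerun, rel_tol=0.10):
    """Return the keys whose re-run timing differs too much from the submitted one.

    Both arguments map a (solution, question, data) key to a time in seconds.
    rel_tol is a hypothetical threshold; the benchmark does not publish one.
    """
    outliers = []
    for key, submitted_time in submitted.items():
        rerun_time = rerun.get(key)
        # A missing re-run entry or a non-positive timing cannot be verified.
        if rerun_time is None or submitted_time <= 0:
            outliers.append(key)
            continue
        relative_diff = abs(rerun_time - submitted_time) / submitted_time
        if relative_diff > rel_tol:
            outliers.append(key)
    return outliers


# Made-up timings for illustration; these are not real benchmark results.
submitted = {
    ("duckdb", "q1", "G1_1e7"): 1.00,
    ("duckdb", "q2", "G1_1e7"): 2.00,
}
rerun = {
    ("duckdb", "q1", "G1_1e7"): 1.05,
    ("duckdb", "q2", "G1_1e7"): 2.60,
}
flagged = find_outliers(submitted, rerun)
```

Under this sketch, a PR would be merged only when `flagged` comes back empty. A relative rather than absolute tolerance is used here so that short and long queries are judged on the same scale.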