
Updates to the H2O.ai db-benchmark!

Nov 07, 2023 - duckdb.org
DuckDB Labs has updated the H2O.ai db-benchmark with new results and moved the benchmark to a c6id.metal AWS EC2 instance to improve repeatability and fairness across libraries. The original plan was to update the results with every DuckDB release, but the previous setup turned out to be unfair to all solutions because of issues with network storage and noisy neighbors. The c6id.metal machine avoids both problems: it is a bare-metal instance with local storage, so the physical hardware is not shared with any other AWS users or instances.

Going forward, the benchmark results will be updated when PRs with new performance numbers are submitted. A PR should describe the changes to a solution script or the version update, and include new entries in the `time.csv` and `logs.csv` files. The entries will be verified on a different c6id.metal instance; if the variance is limited, the PR will be merged and the results updated. DuckDB is currently the fastest library for both join and group by queries at almost every data size.
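To make the verification step concrete, here is a minimal Python sketch of how submitted timings could be compared against a re-run. It is not the maintainers' actual procedure. The column names (`solution`, `question`, `data`, `time_sec`), the 10% tolerance, and the function names are all assumptions for illustration; the article only says the variance must be "limited".

```python
import csv
import io

# Hypothetical threshold: the article does not state a number.
TOLERANCE = 0.10


def load_times(csv_text):
    """Parse CSV text into {(solution, question, data): seconds}.

    The column names are assumed for this sketch and may not match
    the real time.csv layout.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        (row["solution"], row["question"], row["data"]): float(row["time_sec"])
        for row in reader
    }


def find_outliers(submitted, rerun, tolerance=TOLERANCE):
    """Return keys that are missing from the re-run or whose re-run time
    differs from the submitted time by more than `tolerance` (relative)."""
    outliers = []
    for key, t_submitted in submitted.items():
        t_rerun = rerun.get(key)
        if t_rerun is None or abs(t_rerun - t_submitted) / t_submitted > tolerance:
            outliers.append(key)
    return outliers


# Made-up timings for a made-up solution, purely to show the call shape.
submitted_csv = (
    "solution,question,data,time_sec\n"
    "example_solution,groupby q1,small,0.50\n"
    "example_solution,join q1,small,1.20\n"
)
rerun_csv = (
    "solution,question,data,time_sec\n"
    "example_solution,groupby q1,small,0.52\n"
    "example_solution,join q1,small,1.50\n"
)

outliers = find_outliers(load_times(submitted_csv), load_times(rerun_csv))
print(outliers)
```

With these made-up numbers, only the join entry is flagged, because its re-run time differs from the submitted one by 25%. An empty list would correspond to the "limited variance" case in which the PR can be merged.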

Key takeaways:

  • The H2O.ai db-benchmark has been updated with new results and the AWS EC2 instance used for benchmarking has been changed to a c6id.metal for improved repeatability and fairness across libraries.
  • DuckDB is the fastest library for both join and group by queries at almost every data size.
  • The benchmark was re-run on a c6id.metal machine to avoid issues with network storage and noisy neighbors that were present in the previous setup.
  • Moving forward, the benchmark will be updated when PRs with new performance numbers are provided, and these entries will be verified using a different c6id.metal instance.