Ask HN: Benchmarks for models other than LLMs

Apr 24, 2024 - news.ycombinator.com
The post notes that benchmarks are widely used to evaluate the abilities of LLMs and asks whether similar benchmarks exist for propensity modelling, churn prediction, or other types of models. The author wants to know whether there are established best practices for comparing the performance of different models, especially when those models are built on different underlying datasets.

The author also asks what standards exist for comparing model performance beyond benchmark data alone. The goal is a more comprehensive and fair method of comparison, one that accounts for the unique characteristics of each model's underlying dataset.

Key takeaways:

  • The author has observed impressive benchmarks used to rank LLMs' abilities.
  • The author is curious if similar benchmarks exist for propensity modelling, churn prediction, or other types of models.
  • The author is interested in best practices for comparing model performance beyond just benchmark data.
  • The author acknowledges that different models may have different underlying datasets, which could affect comparisons.