Feature Story
Ask HN: Benchmarks for models other than LLMs
Apr 24, 2024 · news.ycombinator.com
The author asks what standards exist for comparing model performance beyond published benchmark numbers. The discussion centers on finding a more comprehensive and fair method of comparison, one that accounts for the unique characteristics of each model's underlying dataset.
Key takeaways
- The author has observed impressive benchmarks used to rank LLMs' abilities.
- The author is curious if similar benchmarks exist for propensity modelling, churn prediction, or other types of models.
- The author is interested in best practices for comparing model performance that go beyond benchmark data alone.
- The author acknowledges that different models may have different underlying datasets, which could affect comparisons.