
Ask HN: Do LLMs get "better" with more processing power and/or time per request?

Feb 25, 2024 - news.ycombinator.com
The discussion emphasizes that more processing power does not necessarily improve a model's performance. Given the same model architecture and dataset, a model can be trained on CPUs with the same results, albeit more slowly. A model's effectiveness is determined by how well the dataset aligns with the architecture and by how many epochs it is trained for before reaching a reasonable prediction accuracy, such as 90%.
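To make the CPU-versus-GPU point concrete, here is a minimal device-agnostic training sketch in PyTorch (the framework choice, the tiny model, and the synthetic data are illustrative assumptions, not the author's code). The hardware only changes how fast the loop runs; the learning procedure itself is identical:

```python
import torch
import torch.nn as nn

# The device only changes speed, not the learning procedure:
# the same loop runs on CPU or GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

torch.manual_seed(0)  # fix the seed so runs are comparable

# A small illustrative classifier (the architecture is an assumption here)
model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10)).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

# Stand-in data; in practice this would be a DataLoader over a real dataset
x = torch.randn(64, 784, device=device)
y = torch.randint(0, 10, (64,), device=device)

for epoch in range(5):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
    print(f"epoch {epoch}: loss {loss.item():.4f}")
```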

The author mentions that for image classification models, approximately 100 epochs over 10,000 items seem to yield the best results for certain datasets. However, training has a sweet spot: too few epochs leave the model underfit, while continuing past the optimum leads to overfitting, and beyond that point no amount of additional training or processing power can enhance the model's performance.
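That point of diminishing returns is exactly what early stopping is meant to detect. Below is a minimal, self-contained sketch (PyTorch again; the tiny model and synthetic train/validation data are assumptions for illustration) that watches validation loss and halts once extra epochs stop helping:

```python
import copy
import torch
import torch.nn as nn

torch.manual_seed(0)

# Tiny synthetic setup purely for illustration (not from the thread):
# a small classifier, a training batch, and a held-out validation batch.
model = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()
x_train, y_train = torch.randn(256, 20), torch.randint(0, 2, (256,))
x_val, y_val = torch.randn(64, 20), torch.randint(0, 2, (64,))

best_val = float("inf")
best_state = copy.deepcopy(model.state_dict())
patience, stale = 5, 0

for epoch in range(100):
    optimizer.zero_grad()
    loss_fn(model(x_train), y_train).backward()
    optimizer.step()

    with torch.no_grad():
        val_loss = loss_fn(model(x_val), y_val).item()

    if val_loss < best_val:
        best_val, stale = val_loss, 0
        best_state = copy.deepcopy(model.state_dict())
    else:
        # Validation loss has stopped improving: training past this
        # point drifts toward overfitting, not a better model.
        stale += 1
        if stale >= patience:
            print(f"early stop at epoch {epoch}")
            break

model.load_state_dict(best_state)  # restore the best checkpoint
```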

Key takeaways:

  • More processing power does not necessarily improve a model; the same architecture and dataset trained on a CPU yields the same results, only at a slower pace.
  • The quality of a model is determined by how well the dataset fits the model architecture and how many epochs it is given to reach a reasonable prediction accuracy.
  • For image classification models, around 100 epochs over 10,000 items seems to be the optimal budget for certain datasets (the sketch after this list shows what that implies in optimizer steps).
  • Training has a sweet spot: too few epochs leave the model underfit, too many lead to overfitting, and past that point no additional training or processing power can improve it.
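As a rough sanity check on the "100 epochs for 10,000 items" figure, the arithmetic below converts that budget into optimizer steps (the batch size of 32 is an assumed value, not something the thread specifies):

```python
import math

dataset_size = 10_000   # items, per the thread's example
epochs = 100            # the author's rough sweet spot
batch_size = 32         # assumed; not stated in the thread

steps_per_epoch = math.ceil(dataset_size / batch_size)   # 313
total_steps = epochs * steps_per_epoch                   # 31,300 optimizer updates
print(f"{steps_per_epoch} steps/epoch, {total_steps} total optimizer steps")
```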