The study suggests that training models in "low precision" could make them more robust. However, models trained or quantized below 7- or 8-bit precision may show a noticeable drop in quality. The researchers expect more effort to go into meticulous data curation and filtering, so that only the highest-quality data is fed into smaller models. They also predict that new architectures designed to make low-precision training stable will become important in the future.
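To make the intuition behind that 7- or 8-bit threshold concrete, here is a minimal sketch (not taken from the study) of symmetric uniform quantization in NumPy. The `quantize` helper and the toy weight distribution are illustrative assumptions; the point is simply that the round-trip error between original and quantized weights grows sharply as the bit width shrinks.

```python
# Minimal sketch of symmetric, per-tensor uniform quantization (illustrative only).
import numpy as np

def quantize(weights: np.ndarray, bits: int) -> np.ndarray:
    """Quantize weights to `bits` and dequantize back."""
    levels = 2 ** (bits - 1) - 1             # e.g. 127 representable levels for 8-bit
    scale = np.abs(weights).max() / levels   # map the largest magnitude to the top level
    q = np.clip(np.round(weights / scale), -levels, levels)
    return q * scale                         # dequantized approximation of the originals

rng = np.random.default_rng(0)
w = rng.normal(0, 0.02, size=100_000)        # toy stand-in for model weights

for bits in (16, 8, 7, 4, 2):
    err = np.mean((w - quantize(w, bits)) ** 2)
    print(f"{bits:>2}-bit  mean squared round-trip error: {err:.2e}")
```

Running this shows the error staying small through 8 bits and then climbing quickly, which mirrors (in a very rough way) the quality cliff the researchers describe at lower precisions.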
Key takeaways:
- Quantization, a technique used to make AI models more efficient, may have more limitations than previously assumed, particularly when applied to models trained for a long time on large amounts of data.
- In aggregate, AI model inference is often more expensive than training, making inference cost a significant concern in the AI industry.
- Training models in "low precision" could make them more robust, but models below 7- or 8-bit precision may see a noticeable drop in quality.
- There's no free lunch when it comes to reducing inference costs, and the future may see more effort put into meticulous data curation and filtering, as well as new architectures that aim to make low-precision training stable.