
RWKV: Reinventing RNNs for the Transformer Era — with Eugene Cheah of UIlicious

Sep 03, 2023 - latent.space
The article discusses the emergence of Receptance Weighted Key Value (RWKV) models as a potential challenger to the dominance of Transformers in the field of large language models (LLMs). RWKV models, inspired by Apple's 2021 paper on Attention Free Transformers, scale better than Transformer-based open-source models in both training and inference. The RWKV project is a distributed, international, and mostly uncredentialed community, reminiscent of early-2020s EleutherAI; because it is driven by the needs of that community, the project is extremely polyglot.
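
To make the recurrence concrete, the following is a minimal NumPy sketch of the WKV computation that stands in for attention in RWKV, based on the published RWKV-4 formulation; the numerical-stability tricks of the real implementation are omitted, and all names and shapes here are illustrative, not the project's actual API:

    import numpy as np

    def wkv_recurrent(k, v, w, u):
        # k, v: (T, C) key and value sequences for T tokens, C channels
        # w:    (C,) learned per-channel decay (>= 0)
        # u:    (C,) learned bonus weight applied to the current token
        T, C = k.shape
        num = np.zeros(C)          # running decayed sum of e^k * v
        den = np.zeros(C)          # running decayed sum of e^k
        out = np.empty((T, C))
        for t in range(T):
            cur = np.exp(u + k[t])                 # current token's weight
            out[t] = (num + cur * v[t]) / (den + cur)
            num = np.exp(-w) * num + np.exp(k[t]) * v[t]   # decay history,
            den = np.exp(-w) * den + np.exp(k[t])          # fold in token t
        return out

Each step updates only an O(C) running state, so per-token inference cost and memory stay constant no matter how long the context grows, unlike attention's ever-growing key-value cache.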

The article also features an interview with RWKV committee member Eugene Cheah covering the models' design, advantages, and challenges. RWKV models are not without weaknesses: they are sensitive to prompt formatting and perform poorly at lookback tasks, since the fixed-size recurrent state must compress all prior context. Still, they are seen as a credible challenge to Transformers, especially given their scalability and competitive performance on standard reasoning benchmarks.

Key takeaways:

  • The podcast discusses the international, uncredentialed community pursuing the "room temperature superconductor" of Large Language Models (LLMs): the scalability of Transformers without the quadratic cost of attention.
  • The most significant challenger to emerge this year has been RWKV (Receptance Weighted Key Value) models, which revive the RNN for GPT-class LLMs, inspired by Apple's 2021 paper on Attention Free Transformers.
  • RWKV models tend to scale in all directions (in both training and inference) much better than Transformer-based open-source models, while remaining competitive on standard reasoning benchmarks; the back-of-envelope sketch after this list illustrates the inference-memory gap.
  • The RWKV project is a distributed, international, mostly uncredentialed community reminiscent of early-2020s EleutherAI: primarily a pseudonymous, GPU-poor volunteer community organized on Discord.
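
A rough comparison, assuming hypothetical GPT-class dimensions (32 layers, hidden size 4096, fp16) and an approximate RWKV per-layer state size, makes the scaling gap concrete:

    # Hypothetical GPT-class dimensions; the RWKV state size below is
    # approximate and varies by model version.
    layers, hidden, bytes_fp16 = 32, 4096, 2

    def transformer_kv_cache_bytes(tokens):
        # one key and one value vector cached per past token, per layer
        return 2 * layers * hidden * bytes_fp16 * tokens

    def rwkv_state_bytes(vectors_per_layer=5):
        # fixed recurrent state per layer, independent of context length
        return layers * hidden * bytes_fp16 * vectors_per_layer

    print(transformer_kv_cache_bytes(8192) / 2**30)  # ~4.0 GiB at 8k context
    print(rwkv_state_bytes() / 2**20)                # ~1.25 MiB at any context

A Transformer's inference memory grows linearly with context length (and its attention compute quadratically with sequence length), while the RNN-style state stays fixed; that is what "scaling in all directions" points to.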