The post further delves into the use of AI in art restoration, revealing a hidden image behind Vermeer's 'Girl Reading a Letter at an Open Window.' It also discusses a novel approach to training language models called Selective Language Modeling (SLM), which focuses on training only useful tokens, leading to faster and more efficient training. The blog ends with a list of books the author has read and plans to read.
Key takeaways:
- The paper introduces Selective Language Modeling (SLM), a training approach that computes the loss only on tokens deemed useful, rather than on every token in the training data.
- Each token is scored with the help of a reference model: only tokens with high excess loss (the current model's loss minus the reference model's loss) are kept for training, which effectively filters out unhelpful or noisy data.
- Results show that this method not only speeds up training by reducing the number of tokens processed (up to 10x faster) but also improves model performance significantly on tasks such as mathematical problem solving.
- The approach suggests a potential shift in how training data is handled, focusing on quality and relevance rather than quantity, which could lead to more efficient use of computational resources and better performing models.
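The selection step described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function name, the keep ratio, and the toy per-token losses are all hypothetical.

```python
import numpy as np

def select_tokens(train_losses, ref_losses, keep_ratio=0.6):
    """SLM sketch: keep only the tokens whose excess loss
    (current-model loss minus reference-model loss) is highest;
    the rest are masked out of the training objective."""
    excess = np.asarray(train_losses) - np.asarray(ref_losses)
    k = max(1, int(len(excess) * keep_ratio))
    keep = np.argsort(excess)[-k:]          # indices of the k largest excess losses
    mask = np.zeros(len(excess), dtype=bool)
    mask[keep] = True
    return mask

# Toy per-token losses for a 5-token sequence (made-up numbers).
train = [2.0, 0.5, 3.0, 1.0, 2.5]
ref   = [1.0, 0.6, 1.0, 0.9, 2.4]  # reference model already fits tokens 2 and 5
mask = select_tokens(train, ref, keep_ratio=0.4)
# Average the loss only over the selected tokens.
masked_loss = np.asarray(train)[mask].mean()
```

In practice the per-token losses would come from a forward pass with an unreduced cross-entropy loss, and the mask would zero out the discarded tokens before backpropagation; the filtering logic itself is just this top-k comparison.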