
Inversion: fast, reliable structured LLMs

Mar 19, 2024 - news.bensbites.co
The article announces Inversion, a family of structured language models designed to address the speed, reliability, and reasoning shortcomings of traditional AI systems. The first generation of Inversion models is reported to be up to 100 times faster, with 10 times lower latency and 10,000 times less overhead than the best alternatives, while guaranteeing 100% reliably structured output. The models are built to handle structured tasks such as extraction and function calling, and they offer deep support for typed JSON output.
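
The article does not show Inversion's interface, so the sketch below is only an illustration of what "typed JSON output" buys you: the caller fixes a schema up front, and every response is guaranteed to parse and type-check against it. The CONTACT_SCHEMA shape and the conforms helper are assumptions made for this example, not Inversion's API.

```python
# Illustrative only: what a "typed JSON output" guarantee means in practice.
import json

# Hypothetical extraction schema: pull a contact out of free-form text.
CONTACT_SCHEMA = {
    "name": str,
    "email": str,
    "age": int,
}

def conforms(payload: str, schema: dict) -> bool:
    """Check that payload is JSON with exactly the typed fields in schema."""
    try:
        data = json.loads(payload)
    except json.JSONDecodeError:
        return False
    return (
        isinstance(data, dict)
        and data.keys() == schema.keys()
        and all(isinstance(data[k], t) for k, t in schema.items())
    )

# A structured model promises this check never fails on its output:
response = '{"name": "Ada Lovelace", "email": "ada@example.com", "age": 36}'
assert conforms(response, CONTACT_SCHEMA)
```

With an ordinary LLM, the final assert can fail on malformed or mistyped output; the article's claim of 100% reliable structure is precisely that a check like this never fails.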

The company behind Inversion is expanding access to these first-generation models and has begun developing the next generation, which targets an inference speed of 100,000 characters per second. The article also hints at further improvements: millisecond processing of large input prompts, per-user adaptation, generative UI composition, and better multilingual support. The company says it aims to keep AI accessible and beneficial for all, and invites readers to join it on that journey.

Key takeaways:

  • The article announces Inversion, a family of structured language models designed to address speed, reliability, and reasoning issues in traditional AI systems. These models are up to 100x faster, have 10x lower latency, and output 100% reliable structure with 10,000x less overhead than the best alternatives.
  • Inversion models are designed to do more with less, using less compute, time, and data to produce higher-quality, more reliable, better-reasoned outputs. They are particularly effective in structured tasks such as extraction and function calling.
  • The article explains the process of building Inversion, which involved creating a new kind of model able to handle real-world workloads at scale. This meant processing schemas/grammars almost instantly, cutting time-to-first-token to near zero, and accelerating inference to over 10,000 char/s (a toy sketch of schema-constrained decoding follows this list).
  • The team is expanding access to the first generation of Inversion models and has begun building the next generation, targeting inference on the order of 100,000 char/s. It is also working on a fundamentally new class of models for that generation, expecting several orders of magnitude of improvement across the board.
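
The article does not describe Inversion's internals, but the behavior it claims matches the general family of grammar-constrained decoding techniques: at every step, any token that would violate the schema is masked out, so malformed output is unreachable. The toy sketch below shows that idea against a fixed {"age": <digits>} template; the single-character vocabulary, the allowed_next rules, and the random stand-in for model scores are all illustrative assumptions, not Inversion's method.

```python
# Toy illustration of grammar-constrained decoding over a fixed template.
import random

# Toy single-character "token" vocabulary.
VOCAB = list('{"}: age0123456789xyz')

TEMPLATE = '{"age": '

def allowed_next(prefix: str) -> set:
    """Characters that keep prefix extendable to a valid {"age": <digits>} object."""
    if len(prefix) < len(TEMPLATE):
        return {TEMPLATE[len(prefix)]}      # structural characters are forced
    body = prefix[len(TEMPLATE):]
    if body.endswith('}'):
        return set()                        # object closed: generation is done
    options = set('0123456789')
    if body:                                # need at least one digit before closing
        options.add('}')
    return options

def generate() -> str:
    out = ''
    while True:
        options = allowed_next(out)
        if not options:
            return out
        # Stand-in for model logits: random scores over the whole vocabulary.
        scores = {tok: random.random() for tok in VOCAB}
        # Constrained decoding: pick the best-scoring token the grammar allows.
        out += max(options, key=lambda tok: scores.get(tok, 0.0))

print(generate())  # e.g. {"age": 42} -- always well-formed, never malformed
```

Read through that lens, the engineering claims above (near-instant schema processing, near-zero time-to-first-token) plausibly amount to making this masking and template fast-forwarding essentially free at inference time.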