Inception emerges from stealth with a new type of AI model | TechCrunch

Feb 26, 2025 - techcrunch.com
Inception, a Palo Alto-based company founded by Stanford professor Stefano Ermon, has developed a novel AI model called a diffusion-based large language model (DLM). Unlike traditional large language models (LLMs), which generate text sequentially one token at a time, Inception's DLM uses diffusion techniques to generate and refine large blocks of text in parallel, resulting in significantly faster performance and lower computing costs. Ermon, who has long researched applying diffusion models to text, claims that Inception's models can run up to 10 times faster and cost 10 times less than traditional LLMs. The company offers its models through an API as well as on-premises and edge-device deployments, and has secured several customers, including Fortune 100 companies, by addressing their need for lower AI latency and higher speed.
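
To make the parallelism claim concrete, the sketch below contrasts the two decoding styles. It is a toy illustration under loudly stated assumptions: the tokens are random placeholders rather than real model outputs, and the reveal schedule is invented for this example, not Inception's actual algorithm. The point it demonstrates is only that an autoregressive decoder's cost grows with output length, while a diffusion-style decoder's cost is tied to a fixed number of parallel refinement steps.

    import random

    VOCAB = ["the", "model", "runs", "fast", "on", "gpus", "today"]
    MASK = "<mask>"

    def autoregressive_generate(length):
        # Sequential decoding: one "model call" per token, so cost
        # grows linearly with the length of the output.
        tokens = [random.choice(VOCAB) for _ in range(length)]
        return tokens, length  # (output, number of model calls)

    def diffusion_generate(length, steps=4):
        # Diffusion-style decoding: start from a fully masked block and
        # refine positions in parallel over a fixed number of steps,
        # so cost is tied to `steps`, not to output length.
        tokens = [MASK] * length
        for step in range(steps):
            masked = [i for i, t in enumerate(tokens) if t == MASK]
            # Each step "denoises" a share of the remaining masks; a real
            # model would rescore all positions in one parallel forward pass.
            for i in masked[: max(1, len(masked) // (steps - step))]:
                tokens[i] = random.choice(VOCAB)
        return tokens, steps  # (output, number of model calls)

    _, ar_calls = autoregressive_generate(32)
    _, df_calls = diffusion_generate(32, steps=4)
    print(f"autoregressive: {ar_calls} model calls for 32 tokens")
    print(f"diffusion-style: {df_calls} model calls for 32 tokens")

Under this schedule every position is filled by the final step; in a real DLM, the quality would come from replacing the random choices with a trained denoising model.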

Inception was co-founded by Ermon along with UCLA professor Aditya Grover and Cornell professor Volodymyr Kuleshov. Although Ermon did not disclose funding details, TechCrunch reports that the Mayfield Fund has invested in the company. Inception's DLMs are said to outperform existing models, with their "small" coding model rivaling OpenAI's GPT-4o mini and their "mini" model surpassing small open-source models like Meta's Llama 3.1 8B. The company claims its models can achieve more than 1,000 tokens per second, a significant speed improvement if validated.

Key takeaways:

  • Inception, founded by Stanford professor Stefano Ermon, has developed a diffusion-based large language model (DLM) that claims to offer faster performance and reduced computing costs compared to traditional LLMs.
  • The company’s DLMs can generate and modify large blocks of text in parallel, leveraging GPUs more efficiently and potentially changing the way language models are built.
  • Inception has secured several customers, including Fortune 100 companies, by addressing their need for reduced AI latency and increased speed.
  • The company offers its models through an API as well as on-premises and edge-device deployments, and claims they can run up to 10 times faster and cost 10 times less than traditional LLMs.