Twelve Labs is building models that can understand videos at a deep level | TechCrunch

Oct 24, 2023 - techcrunch.com
Twelve Labs, a San Francisco-based startup, is training AI models to solve complex video-language alignment problems, aiming to create an infrastructure for multimodal video understanding. The models map natural language to video content, enabling developers to create apps that can search through videos, classify scenes, extract topics, and more. The technology can be used for ad insertion, content moderation, media analytics, and automatic generation of highlight reels, among other applications.

The company is unveiling Pegasus-1, a new multimodal model for whole-video analysis. Since its private beta launch in May, Twelve Labs has grown to 17,000 developers and is working with companies across various industries. The company recently closed a $10 million strategic funding round from Nvidia, Intel, and Samsung Next, bringing its total raised to $27 million.

Key takeaways:

  • Twelve Labs, a San Francisco-based startup, is training AI models to solve complex video-language alignment problems, enabling developers to create apps that can search through videos, classify scenes, and extract topics from them.
  • The company's technology can be used for ad insertion, content moderation, media analytics, and to automatically generate highlight reels or blog post headlines and tags from videos.
  • Twelve Labs is unveiling Pegasus-1, a new multimodal model that understands a range of prompts related to whole-video analysis, aiming to provide human-level video comprehension without manual analysis.
  • The company recently closed a $10 million strategic funding round from Nvidia, Intel, and Samsung Next, bringing its total raised to $27 million.