Google’s Lumiere brings AI video closer to real than unreal

Google has developed a new AI video generation model called Lumiere, which uses a diffusion model known as Space-Time-U-Net (STUNet) to create videos. Unlike other models that stitch together generated key frames, Lumiere creates a base frame and then uses STUNet to approximate where objects within the frame will move, creating more frames that flow into each other for seamless motion. This method allows Lumiere to generate 80 frames, compared to 25 frames from Stable Video Diffusion, and creates videos that are more realistic than those from competitors like Runway and Meta's Emu.

However, Lumiere is not yet available for testing. Beyond text-to-video generation, Lumiere will support image-to-video generation, stylized generation, cinemagraphs, and inpainting. Because the technology could be misused to create fake or harmful content, Google says it is crucial to develop and apply tools for detecting biases and malicious use cases to ensure safe and fair use, although the paper's authors did not explain how this can be achieved.

Key takeaways:

Google's new AI model Lumiere uses a diffusion model called Space-Time-U-Net (STUNet) to generate videos, creating seamless motion by approximating where objects within a frame will move.
Lumiere generates 80 frames, compared to 25 from Stable Video Diffusion, and its output is more realistic than that of competitors such as Runway and Meta's Emu.
Google's Lumiere is not yet available for testing, but it shows the company's potential to develop an AI video platform that is comparable or superior to other AI video generators.
Despite the promising technology, Google acknowledges the risk of misuse for creating fake or harmful content and emphasizes the need for tools to detect biases and malicious use cases.
