Introducing Stable Video 3D: Quality Novel View Synthesis and 3D Generation from Single Images

Stable Video 3D (SV3D) is a new model that takes a single image of an object and generates novel multi-view renderings and 3D meshes of that object. It outperforms previous models such as Stable Zero123 and other open-source alternatives, offering improved quality and multi-view consistency. The model comes in two variants: SV3D_u, which generates orbital videos from a single image input without camera conditioning, and SV3D_p, which accepts both single images and orbital views and can create 3D video along specified camera paths. The model is available for commercial use with a Stability AI Membership; for non-commercial use, the model weights can be downloaded from Hugging Face.

SV3D adapts the Stable Video Diffusion image-to-video diffusion model with camera path conditioning, which brings major benefits in generalization and view consistency of the generated outputs. It also proposes improved 3D optimization that makes use of the model's ability to generate arbitrary orbits around an object. To reliably output quality 3D meshes from single image inputs, the model implements disentangled illumination optimization and a new masked score distillation sampling (SDS) loss function. In novel view synthesis (NVS), SV3D delivers coherent views from any given angle with proficient generalization, improved pose controllability, and consistent object appearance across multiple views.
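To make "camera path conditioning" and "arbitrary orbits" more concrete, here is a minimal sketch of how an orbit around an object could be described as one (azimuth, elevation) pair per video frame. This is not SV3D's actual API: the function name `orbit_camera_path`, its parameters, and the default of 21 frames are assumptions chosen for illustration only.

```python
import math


def orbit_camera_path(num_frames=21, elevation_deg=10.0, elevation_swing_deg=0.0):
    """Return a list of (azimuth_deg, elevation_deg) pairs, one per frame.

    Hypothetical helper, not part of SV3D. The azimuth sweeps one full turn
    around the object. With elevation_swing_deg == 0 the camera stays at a
    fixed height (a static orbit); otherwise the elevation rises and falls
    sinusoidally over the turn (a dynamic orbit).
    """
    if num_frames < 1:
        raise ValueError("num_frames must be at least 1")

    path = []
    for i in range(num_frames):
        # Evenly spaced azimuths in [0, 360); the last frame stops short of
        # 360 so the orbit loops without repeating the first view.
        azimuth = 360.0 * i / num_frames
        elevation = elevation_deg + elevation_swing_deg * math.sin(math.radians(azimuth))
        path.append((azimuth, elevation))
    return path


if __name__ == "__main__":
    for azimuth, elevation in orbit_camera_path(num_frames=4, elevation_swing_deg=5.0):
        print(f"azimuth={azimuth:6.1f}  elevation={elevation:5.1f}")
```

Under these assumptions, a camera-conditioned variant such as SV3D_p would receive a path like this alongside the input image, while SV3D_u would generate its orbit without any such conditioning.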

Key takeaways:

Stable Video 3D (SV3D) is a new model that takes a single object image and generates novel multi-views and 3D meshes of the object, improving quality and multi-view consistency compared to previous models like Stable Zero123.
The model comes in two variants: SV3D_u, which generates orbital videos from single image inputs, and SV3D_p, which can handle both single images and orbital views, allowing for the creation of 3D video along specified camera paths.
Stable Video 3D uses video diffusion models, which provide major benefits in generalization and view-consistency of generated outputs, and includes improved 3D optimization and a new masked score distillation sampling loss function.
Stable Video 3D introduces significant advancements in 3D generation, particularly in novel view synthesis (NVS), delivering coherent views from any given angle with proficient generalization, enhancing pose-controllability, and ensuring consistent object appearance across multiple views.

Introducing Stable Video 3D: Quality Novel View Synthesis and 3D Generation from Single Images — Stability AI
