Stable Video 3D (SV3D) adapts the Stable Video Diffusion image-to-video model with camera path conditioning, which improves the generalization and view consistency of its generated outputs. Because the model can generate arbitrary orbits around an object, it also supports improved 3D optimization. That optimization uses disentangled illumination and a new masked score distillation sampling loss function to reliably produce high-quality 3D meshes from single image inputs. SV3D also advances novel view synthesis (NVS): it delivers coherent views from any given angle, generalizes well, improves pose controllability, and keeps the object's appearance consistent across views.
Key takeaways:
- Stable Video 3D (SV3D) is a new model that takes a single object image and generates novel multi-views and 3D meshes of the object, improving quality and multi-view consistency compared to previous models like Stable Zero123.
- The model comes in two variants: SV3D_u, which generates orbital videos from single image inputs, and SV3D_p, which can handle both single images and orbital views, allowing for the creation of 3D video along specified camera paths.
- SV3D builds on a video diffusion model, which provides major benefits in generalization and view consistency of generated outputs, and it adds improved 3D optimization with a new masked score distillation sampling loss function.
- SV3D advances 3D generation, particularly novel view synthesis (NVS): it delivers coherent views from any given angle, generalizes well, improves pose controllability, and keeps the object's appearance consistent across multiple views.
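
To make "camera path conditioning" and "orbits around an object" more concrete, the following is a minimal sketch of how an orbital camera path could be described as a sequence of per-frame azimuth and elevation angles. It is not SV3D's actual code or API: the function name `orbit_poses`, its parameters, the default frame count, and the sinusoidal elevation wobble are all illustrative assumptions.

```python
import math


def orbit_poses(n_frames=21, elevation_deg=10.0, radius=2.0, wobble_deg=0.0):
    """Hypothetical helper: camera poses for one full orbit around the origin.

    All names and defaults are illustrative assumptions, not SV3D's API.
    Returns a list of (azimuth_deg, elevation_deg, (x, y, z)) tuples.
    """
    poses = []
    for i in range(n_frames):
        # Evenly spaced azimuths covering 360 degrees (end point excluded).
        azimuth = 360.0 * i / n_frames
        # wobble_deg = 0 gives a fixed-elevation orbit; a non-zero value
        # varies the elevation along the path (an assumed sinusoid).
        elevation = elevation_deg + wobble_deg * math.sin(math.radians(azimuth))
        az, el = math.radians(azimuth), math.radians(elevation)
        # Spherical to Cartesian conversion, camera looking at the origin.
        position = (
            radius * math.cos(el) * math.cos(az),
            radius * math.cos(el) * math.sin(az),
            radius * math.sin(el),
        )
        poses.append((azimuth, elevation, position))
    return poses


if __name__ == "__main__":
    for azimuth, elevation, position in orbit_poses(n_frames=4):
        print(f"az={azimuth:6.1f} el={elevation:5.1f} pos={position}")
```

In this sketch, a fixed-elevation orbit corresponds to the simplest case of generating an orbital video from one image, while a path with varying elevation illustrates the kind of specified camera path that a pose-conditioned variant could follow.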