The author also shares their personal experience of creating models on an Nvidia RTX 3060 card: each model often took hours or even a day to create, and could then produce dozens of decent pictures. The quality of the output depended largely on the quality and variety of the input photos, with different angles and types of shots needed for each subject. The author also noted that the output often mirrored the most common features in the input photos, such as glasses.
Key takeaways:
- There are several methods for adding new concepts to Stable Diffusion, including hypernetworks, textual inversion, DreamBooth, and a more recent technique, LoRA.
- DreamBooth is considered better than hypernetworks and textual inversion for images of people.
- Creating each model on an Nvidia RTX 3060 card can take hours or even a day, but the resulting model can yield dozens of decent pictures of its subject.
- The quality of the output depends heavily on the quality and variety of the input photos: different angles, different types of shots (such as body shots), and whether or not the subject is wearing glasses.
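
Because the output tends to mirror the most common features of the input photos, it can help to check how balanced a photo set is before training. The snippet below is only a rough sketch of that idea and is not taken from the author's workflow: the file names, the hand-written tags, the helper names `feature_shares` and `dominant_features`, and the 0.75 threshold are all hypothetical.

```python
from collections import Counter


def feature_shares(photo_tags):
    """Return, for each tag, the fraction of photos that carry it."""
    total = len(photo_tags)
    if total == 0:
        return {}
    # set(tags) ensures a tag repeated on one photo is counted once.
    counts = Counter(tag for tags in photo_tags.values() for tag in set(tags))
    return {tag: n / total for tag, n in counts.items()}


def dominant_features(photo_tags, threshold=0.75):
    """List tags present in at least `threshold` of the photos (assumed cutoff)."""
    shares = feature_shares(photo_tags)
    return sorted(tag for tag, share in shares.items() if share >= threshold)


# Hypothetical, manually assigned tags for a small training set.
photos = {
    "img_01.jpg": ["glasses", "front", "close-up"],
    "img_02.jpg": ["glasses", "profile", "close-up"],
    "img_03.jpg": ["glasses", "front", "body"],
    "img_04.jpg": ["glasses", "three-quarter", "close-up"],
    "img_05.jpg": ["front", "body"],
}

# Tags that dominate the set are the ones most likely to show up in every output.
print(dominant_features(photos))
```

In this made-up set, glasses appear in four of five photos, so the tag is flagged as dominant; adding more photos without glasses, or more body shots, would even out the set.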