
GitHub - ehennenfent/live_illustrate: Live-ish illustration for TTRPG campaigns

Dec 21, 2023 - github.com
The live_illustrate project combines Whisper, GPT-3.5, DALL-E, Flask, and HTMX to illustrate a tabletop RPG (TTRPG) session as it is played. Whisper transcribes the live audio, GPT-3.5 extracts a description of the current setting from the transcript, and DALL-E draws that setting. Flask and HTMX serve a page that displays a new image every few minutes. The project works both better and worse than one might expect: the generated images are often an amusingly flawed rendition of the ongoing scenario.
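The transcribe, describe, and draw stages can be sketched as a small loop. This is only an illustration of the data flow, not the project's actual code: the function `illustrate_session`, its parameters, and the `chunks_per_image` batching rule are all hypothetical, and the three model calls are passed in as plain callables so that no real API is assumed.

```python
from typing import Callable, Iterable, List


def illustrate_session(
    audio_chunks: Iterable[bytes],
    transcribe: Callable[[bytes], str],      # stand-in for a speech-to-text call
    describe_scene: Callable[[str], str],    # stand-in for an LLM summarization call
    draw: Callable[[str], str],              # stand-in for an image-generation call
    chunks_per_image: int = 3,               # hypothetical batching rule
) -> List[str]:
    """Run transcribe -> describe -> draw, returning one image reference per batch."""
    images: List[str] = []
    pending: List[str] = []
    for chunk in audio_chunks:
        pending.append(transcribe(chunk))
        if len(pending) == chunks_per_image:
            transcript = " ".join(pending)
            # Describe the setting first, then hand that description to the artist.
            images.append(draw(describe_scene(transcript)))
            pending.clear()
    return images


if __name__ == "__main__":
    # Trivial stand-ins so the sketch runs without any external service.
    demo = illustrate_session(
        [b"the party", b"enters a", b"damp cave"],
        transcribe=lambda chunk: chunk.decode(),
        describe_scene=lambda text: f"scene: {text}",
        draw=lambda prompt: f"image of <{prompt}>",
    )
    print(demo)
```

Passing the stages in as callables keeps the loop testable offline; a real setup would presumably replace them with calls to the transcription, chat, and image services.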

The project requires an OpenAI API key and costs about $1/hour to run. The cost can be reduced by generating smaller images or by lengthening the interval between them. Once installed, the 'illustrate' command line tool starts recording from the default microphone, creates a 'data\' directory for the generated images and transcripts, and starts a web server on 'localhost:8080' to display the images. Command line options control how often images are generated, how much of the transcript is sent to GPT-3.5, and what fraction of the previous context is retained each time an image is generated.
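One way to picture the last two options is a rolling transcript buffer that caps how many words are sent to the model and keeps only a fraction of that window once an image has been produced. The sketch below is an assumption about how such a buffer could behave, not the tool's implementation: the class `TranscriptBuffer` and the names `max_words` and `keep_fraction` are hypothetical and do not correspond to the tool's real option names.

```python
from typing import List


class TranscriptBuffer:
    """Hypothetical rolling buffer of transcript words."""

    def __init__(self, max_words: int = 200, keep_fraction: float = 0.5) -> None:
        if max_words <= 0:
            raise ValueError("max_words must be positive")
        if not 0.0 <= keep_fraction <= 1.0:
            raise ValueError("keep_fraction must be between 0 and 1")
        self.max_words = max_words
        self.keep_fraction = keep_fraction
        self._words: List[str] = []

    def add(self, text: str) -> None:
        """Append newly transcribed text."""
        self._words.extend(text.split())

    def context(self) -> str:
        """Return at most the most recent max_words words."""
        return " ".join(self._words[-self.max_words:])

    def after_image(self) -> None:
        """Keep only the newest keep_fraction of the current window."""
        window = self._words[-self.max_words:]
        keep = int(len(window) * self.keep_fraction)
        # Slice from an explicit index so that keep == 0 yields an empty list.
        self._words = window[len(window) - keep:]


if __name__ == "__main__":
    buf = TranscriptBuffer(max_words=4, keep_fraction=0.5)
    buf.add("the party walks into a damp cave")
    print(buf.context())
    buf.after_image()
    print(buf.context())
```

Retaining part of the old context lets consecutive images share a setting, while discarding the rest keeps the prompt short and lets the scene move on.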

Key takeaways:

  • The live_illustrate project uses Whisper to transcribe live audio of a tabletop RPG session, GPT-3.5 to extract a description of the current setting from the transcript, DALL-E to draw the setting, and Flask & HTMX to display a new image every few minutes.
  • The project gives players a running visualization of their session, although the generated images do not always match the described setting.
  • The project requires an OpenAI API key and costs about $1/hour to run; the cost can be reduced by adjusting image size and generation frequency.
  • Once installed, the 'illustrate' command line tool starts recording from the default microphone, generates images and transcripts, and displays the images on a local web server.
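The effect of generation frequency on cost comes down to simple arithmetic, sketched below. The function `hourly_image_cost` is hypothetical, and the price used in the demo is a placeholder rather than a real OpenAI rate; the sketch also counts only image generation and ignores any transcription or text-model charges.

```python
def hourly_image_cost(interval_seconds: float, price_per_image: float) -> float:
    """Estimate image-generation cost per hour for a given interval between images."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    images_per_hour = 3600.0 / interval_seconds
    return images_per_hour * price_per_image


if __name__ == "__main__":
    placeholder_price = 0.02  # placeholder value, not an actual price
    for interval in (60, 120, 240):
        cost = hourly_image_cost(interval, placeholder_price)
        print(f"every {interval}s: about ${cost:.2f}/hour")
```

Doubling the interval halves the image cost, and a smaller image size would enter the same formula as a lower per-image price.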