The author wants to know what parameter-size models can be loaded at various VRAM sizes (8 GB, 12 GB, 16 GB, and 24 GB) for inference, fine-tuning, and training. They also ask how system RAM affects this, considering sizes of 16 GB, 32 GB, 64 GB, and beyond.
Key takeaways:
- Experimenting with LLMs locally is challenging because there is little clear documentation on what existing hardware can actually handle.
- The question is what a decent gaming PC with a CUDA-capable Nvidia GPU can realistically do with LLMs.
- The article asks what parameter-size models can be loaded at each VRAM size for inference, fine-tuning, and training (a rough sizing sketch follows this list).
- The VRAM sizes in question are 8 GB, 12 GB, 16 GB, and 24 GB, and the system RAM sizes are 16 GB, 32 GB, 64 GB, and beyond.
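As a rough starting point, the sketch below estimates how much GPU memory a model of a given parameter count needs under common rules of thumb: roughly 2 bytes per parameter for fp16 inference (about 0.5 bytes per parameter when 4-bit quantized) and roughly 16 bytes per parameter for full training with Adam. These figures and helper names are illustrative assumptions, not numbers from the article; real usage also depends on context length, batch size, activations, and the framework used.

```python
# Back-of-envelope GPU memory estimator for LLMs of a given parameter count.
# Assumptions (illustrative, not from the article): fp16/bf16 weights at
# 2 bytes/param, ~20% overhead for KV cache and activations at inference,
# and ~16 bytes/param for full training with Adam in mixed precision
# (weights + gradients + optimizer states). Actual usage will vary.

GIB = 1024 ** 3


def inference_gib(params_billion: float, bytes_per_param: float = 2.0) -> float:
    """Approximate VRAM to load the weights plus a modest runtime overhead."""
    return params_billion * 1e9 * bytes_per_param * 1.2 / GIB


def full_training_gib(params_billion: float) -> float:
    """Approximate VRAM for full fine-tuning/training with Adam."""
    return params_billion * 1e9 * 16 / GIB


if __name__ == "__main__":
    for b in (3, 7, 13, 30):
        print(
            f"{b:>3}B params: "
            f"~{inference_gib(b):5.1f} GiB inference (fp16), "
            f"~{inference_gib(b, 0.5):5.1f} GiB inference (4-bit), "
            f"~{full_training_gib(b):6.1f} GiB full training"
        )
```

Under these assumptions, a 7B model needs roughly 15 GiB of VRAM for fp16 inference but only about 4 GiB when 4-bit quantized, which is why quantization matters so much for the 8 GB and 12 GB cards mentioned above, while full training of anything beyond small models quickly exceeds even 24 GB.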