The latest update automatically offloads as much of the running model as the GPU supports, for maximum performance without crashes. It also fixes an issue where characters would be erased when running 'ollama run'. A new community project by @TwanLuttik has been added; this was also their first contribution in the update.
Key takeaways:
- Ollama for Linux is now available with GPU acceleration enabled out-of-the-box for Nvidia GPUs.
- Ollama can run on cloud servers with multiple GPUs attached and on WSL 2 with GPU support.
- Ollama maximizes the number of model layers loaded onto the GPU to increase performance without crashing, and supports everything from CPU-only machines and small hobby gaming GPUs up to powerful workstation cards like the H100.
- Updates in the new version include automatic GPU offloading for maximum performance, a fix for an issue where characters would be erased when running 'ollama run', and a new community project by @TwanLuttik.
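As a quick sketch of what getting started on Linux looks like, the commands below install Ollama via the official install script and pull a model; the exact model name (`llama2` here) is just an example, and the install URL is the one published on ollama.ai at the time of this release:

```shell
# Install Ollama on Linux (the install script detects Nvidia GPUs
# and enables GPU acceleration out of the box)
curl https://ollama.ai/install.sh | sh

# Download and chat with a model; Ollama automatically offloads as
# many layers to the GPU as fit, falling back to CPU for the rest
ollama run llama2
```

The same commands work inside WSL 2 with GPU support and on multi-GPU cloud servers, since offloading is handled automatically at startup.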