Release v0.1.33 · ollama/ollama

Apr 28, 2024 - github.com
The release notes cover updates and changes to several models, including Llama 3, Phi 3 Mini, Dolphin Llama 3, and Qwen 110B. They also introduce experimental concurrency features to Ollama, which allow handling multiple requests and loading multiple models simultaneously. These features can be enabled by setting environment variables before running `ollama serve`.
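The summary above does not name the variables; the minimal Python sketch below assumes the `OLLAMA_NUM_PARALLEL` and `OLLAMA_MAX_LOADED_MODELS` settings described in the upstream release notes, and simply launches the server with them set.

```python
import os
import subprocess

# Experimental concurrency settings (variable names assumed from the
# upstream v0.1.33 release notes; both are opt-in at this version).
env = os.environ.copy()
env["OLLAMA_NUM_PARALLEL"] = "4"        # handle up to 4 requests per model at once
env["OLLAMA_MAX_LOADED_MODELS"] = "2"   # keep up to 2 models loaded simultaneously

# Start the Ollama server with the concurrency features enabled.
subprocess.run(["ollama", "serve"], env=env)
```

The same effect can be had by exporting the variables in the shell before starting the server; the script form is shown only to keep the example self-contained.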

The release also includes several bug fixes, addressing issues that caused the API to hang, out-of-memory errors on Apple Silicon Macs, and errors when running Mixtral architecture models. It additionally acknowledges new contributors making their first contributions to the project. The full changelog covers the changes from v0.1.32 to v0.1.33-rc5.

Key takeaways:

  • New models have been introduced, including Llama 3 from Meta, Phi 3 Mini from Microsoft, Dolphin Llama 3 from Eric Hartford, and Qwen 110B.
  • Several issues have been fixed, including termination issues, out-of-memory errors on Apple Silicon Macs, and errors when running Mixtral architecture models.
  • New concurrency features are being introduced to Ollama, allowing multiple requests to be handled and multiple models to be loaded simultaneously (see the sketch after this list).
  • Several new contributors have made their first contributions to the project.
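
To illustrate what the new concurrency support enables from the client side, here is a hedged Python sketch that fires requests at two models in parallel against a local Ollama server. The model names (`llama3`, `phi3`), the default `http://localhost:11434/api/generate` endpoint, and the `response` field are assumptions based on the standard Ollama REST API rather than details stated in the summary above.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

OLLAMA_URL = "http://localhost:11434/api/generate"  # default Ollama REST endpoint

def generate(model: str, prompt: str) -> str:
    """Send a single non-streaming generate request to the local Ollama server."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Fire requests at two different models in parallel; with the concurrency
# features enabled, both models can stay loaded and be served at once.
jobs = [
    ("llama3", "Summarize the v0.1.33 release in one sentence."),
    ("phi3", "What does experimental concurrency mean here?"),
]

with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
    results = pool.map(lambda job: generate(*job), jobs)
    for (model, _), answer in zip(jobs, results):
        print(f"{model}: {answer[:80]}")
```

Without the concurrency settings, the server would queue these requests and swap models in and out one at a time; with them enabled, the requests can be served side by side.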
