Exo Labs believes that running AI models locally, on hardware the user or enterprise owns and controls, will grow in popularity for cost, privacy, security, and behavioral reasons, and it is building enterprise-grade software to support that shift.
Key takeaways:
- Exo Labs, a startup founded in 2024, has used Apple's new M4 chip to run powerful open-source large language models (LLMs), including Meta's Llama 3.1 405B, Nvidia's Nemotron 70B, and Qwen 2.5 Coder 32B.
- Running AI models locally rather than through web-based services can offer cost, privacy, security, and behavioral benefits; Exo Labs is building out its enterprise-grade software offerings to enable this.
- Exo Labs' software distributes AI workloads across multiple devices, making large models accessible to privacy- and cost-conscious consumers and enterprises (see the sketch after this list).
- Exo Labs is preparing to launch a free benchmarking website to provide detailed comparisons of hardware setups, including single-device and multi-device configurations, to help users identify the best solutions for running LLMs based on their needs and budget.
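
To make the distribution idea concrete, here is a minimal, hypothetical Python sketch of memory-weighted layer partitioning, one general way software like this can split a single model across heterogeneous devices. The `Device` and `Shard` types, the `partition_layers` helper, and the device names are invented for illustration; this is not Exo Labs' actual API.

```python
# Hypothetical sketch: assign contiguous layer ranges to devices in
# proportion to each device's usable memory. Not exo's real API.
from dataclasses import dataclass

@dataclass
class Device:
    name: str
    memory_gb: float  # memory usable for model weights

@dataclass
class Shard:
    device: str
    start_layer: int  # inclusive
    end_layer: int    # exclusive

def partition_layers(devices: list[Device], num_layers: int) -> list[Shard]:
    """Split num_layers contiguous layers proportionally to device memory."""
    total_mem = sum(d.memory_gb for d in devices)
    shards, start = [], 0
    for i, d in enumerate(devices):
        if i == len(devices) - 1:
            end = num_layers  # last device absorbs any rounding remainder
        else:
            end = start + round(num_layers * d.memory_gb / total_mem)
        shards.append(Shard(d.name, start, end))
        start = end
    return shards

if __name__ == "__main__":
    # Invented example cluster of Apple Silicon machines.
    cluster = [
        Device("mac-mini-m4", 24.0),
        Device("macbook-pro-m4", 48.0),
        Device("mac-mini-m4-2", 24.0),
    ]
    # Llama 3.1 405B has 126 transformer layers.
    for shard in partition_layers(cluster, 126):
        print(f"{shard.device}: layers {shard.start_layer}-{shard.end_layer - 1}")
```

One reason contiguous, memory-weighted shards are attractive for this kind of setup: each device only has to hand off activations at its shard boundary, keeping traffic over ordinary home or office network links to a minimum.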