
GitHub - google/maxtext: A simple, performant and scalable Jax LLM!

Apr 24, 2024 - github.com
MaxText is an open-source, high-performance, and highly scalable Large Language Model (LLM) codebase developed by Google. It is written in pure Python/Jax and targets Google Cloud TPUs and GPUs for both training and inference. MaxText aims to be simple and "optimization-free", achieving high model FLOPs utilization (MFU) and scaling from a single host to large clusters. It supports models such as Llama2, Mistral, and Gemma, and is intended as a starting point for ambitious LLM projects in both research and production.

The article also provides detailed instructions for getting started with MaxText, including guides for running decode and finetuning. It presents runtime performance results for different models and compares MaxText to alternatives such as minGPT/nanoGPT, Nvidia/Megatron-LM, and Pax. It also covers features and diagnostics such as stack-trace collection for debugging, ahead-of-time compilation for fast startup and restart times, and automatic upload of logs to Vertex AI Tensorboard.
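Ahead-of-time compilation is a general JAX capability the article refers to; as a minimal sketch of the lower/compile workflow (illustrative only, with a toy stand-in function rather than MaxText's actual train_step):

```python
import jax
import jax.numpy as jnp

# A toy stand-in for a training step; MaxText's real train_step is far larger.
def train_step(params, batch):
    # Toy "loss": mean squared activation of a linear layer.
    return jnp.mean((batch @ params) ** 2)

params = jnp.ones((4, 4))
batch = jnp.ones((2, 4))

# Lower and compile ahead of time; compilation problems (including
# out-of-memory on real hardware) surface here, before any training run.
lowered = jax.jit(train_step).lower(params, batch)
compiled = lowered.compile()

# The compiled executable can then be invoked directly with matching inputs.
loss = compiled(params, batch)
```

With all-ones inputs, each linear activation is 4, so the toy loss is 16.0; the point is that `lower`/`compile` separates compilation from execution, which is what makes compile-then-save-for-fast-restart workflows possible.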

Key takeaways:

  • MaxText is a high-performance, highly scalable, open-source LLM codebase written in pure Python/Jax, targeting Google Cloud TPUs and GPUs for training and inference.
  • MaxText supports various models such as Llama2, Mistral, and Gemma, provides high-performance, well-converging training in int8, and scales training to ~51K chips.
  • The tool provides Ahead-of-Time Compilation (AOT), which lets you compile the main train_step for target hardware without using the target hardware, surfacing any out-of-memory conditions and saving the compilation for fast startup and restart times on the target hardware.
  • MaxText also supports automatic upload of logs collected in a directory to a Tensorboard instance in Vertex AI, aiding in efficient debugging and troubleshooting.
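MaxText's int8 training relies on dedicated quantization machinery rather than hand-rolled code, but as a hedged illustration of the underlying idea only, a symmetric int8 quantize/dequantize round trip in plain JAX might look like:

```python
import jax.numpy as jnp

def quantize_int8(x):
    # Symmetric per-tensor quantization: map the max magnitude to 127.
    scale = jnp.max(jnp.abs(x)) / 127.0
    q = jnp.clip(jnp.round(x / scale), -127, 127).astype(jnp.int8)
    return q, scale

def dequantize(q, scale):
    # Recover an approximation of the original float values.
    return q.astype(jnp.float32) * scale

x = jnp.array([0.5, -1.0, 0.25, 1.0])
q, scale = quantize_int8(x)
x_hat = dequantize(q, scale)  # approximately recovers x
```

This sketches only the numeric round trip; real int8 training additionally has to decide which operations run quantized and how gradients flow through the rounding step.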
