
Llama.MIA — fork of Llama.cpp with interpretability features

Dec 31, 2023 - grgv.xyz
The author discusses using llama.cpp to learn about transformers and to experiment with LLM visualizations and mechanistic interpretability. Initially the code was not thread-safe and relied on hardcoded values and global variables, but it has since been refactored, with most of the code moved into hooks/callbacks. The new version is called Llama.MIA, which stands for "mechanistic interpretability application". Currently only the CPU backend is supported, and it has been tested only with Llama2.

The author provides detailed instructions on how to set up and use Llama.MIA, including how to clone the code, build the application, install Python dependencies, and run inference. The author also explains how to use various features of Llama.MIA, such as attention map visualization, computation graph printout, logit lens, attention head zero-ablation, and saving and loading tensors. These features let users inspect a transformer's hidden internal state, verify which components are responsible for particular behaviors, and analyze connections between a transformer's components.
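To make the "logit lens" feature concrete: the technique takes an intermediate hidden state and projects it directly through the model's unembedding matrix, revealing which tokens each layer is already "leaning toward". The following is a minimal NumPy sketch of that idea, not Llama.MIA's actual implementation; the vocabulary, weights, and activation values are made up for illustration.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def logit_lens(hidden_state, w_unembed, vocab):
    """Project an intermediate hidden state through the unembedding
    matrix and report the top token. (Real implementations typically
    apply the model's final normalization first.)"""
    logits = hidden_state @ w_unembed      # (d_model,) @ (d_model, vocab) -> (vocab,)
    probs = softmax(logits)
    top = int(np.argmax(probs))
    return vocab[top], float(probs[top])

# Toy example: hypothetical 4-token vocabulary, d_model = 3.
vocab = ["the", "cat", "sat", "mat"]
w_unembed = np.eye(3, 4)                   # hypothetical unembedding weights
hidden = np.array([0.1, 2.0, 0.3])         # a made-up mid-layer activation
token, p = logit_lens(hidden, w_unembed, vocab)
print(token, round(p, 3))
```

Applied at every layer, this shows how the model's prediction sharpens as computation proceeds, which is exactly the kind of inspection the post describes.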

Key takeaways:

  • The author has been using llama.cpp for learning about transformers and experimenting with LLM visualizations and mechanistic interpretability.
  • The code has been refactored, with most functionality moved into hooks/callbacks, and a new version called Llama.MIA has been introduced, which stands for “mechanistic interpretability application”.
  • The post provides detailed instructions on how to set up and use Llama.MIA, including how to visualize attention maps, print computation graphs, use logit lens, and perform attention head zero-ablation.
  • It also provides information on how to save and load tensors, which is useful for analyzing connections between components of a transformer.
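The attention-head zero-ablation mentioned above also has a simple core: since per-head outputs are concatenated along the feature axis, ablating head i means zeroing its contiguous slice and observing how the model's behavior changes. Here is a minimal NumPy sketch of that operation, under the assumption of a standard concatenated-head layout; it is not Llama.MIA's code, and the tensor values are invented.

```python
import numpy as np

def ablate_head(attn_out, head, n_heads):
    """Zero-ablate one attention head. Per-head outputs are assumed to
    be concatenated along the last axis, so head i owns a contiguous
    slice of width d_model // n_heads."""
    out = attn_out.copy()
    d_head = attn_out.shape[-1] // n_heads
    out[..., head * d_head:(head + 1) * d_head] = 0.0
    return out

# Toy tensor: 2 tokens, d_model = 8, 4 heads; ablate head 1.
attn_out = np.arange(16, dtype=float).reshape(2, 8)
ablated = ablate_head(attn_out, head=1, n_heads=4)
print(ablated[0])  # columns 2-3 are zeroed for every token
```

Comparing the model's output with and without the ablation is how one verifies that a given head is responsible for a behavior.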
