The author also discusses a refactoring in the commit, where the keys for the RMSNorm module are renamed when the config file identifies the architecture as "YiForCausalLM". The tokenizer model is new but similar to OpenLlama's, and the vocabulary is twice as large. Despite these changes, the author concludes that the architecture appears to be the same as Llama's, but acknowledges that additional changes may have been overlooked.
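The following is a minimal sketch of the kind of conditional key remapping described above. The specific tensor-name suffixes and the helper function are illustrative assumptions, not the actual code from the commit; the only detail taken from the source is that the rename is applied when the config reports "YiForCausalLM".

```python
import json
from pathlib import Path

# Assumed Yi -> Llama key suffixes for the RMSNorm tensors (illustrative only).
YI_TO_LLAMA_KEYS = {
    ".ln1.weight": ".input_layernorm.weight",
    ".ln2.weight": ".post_attention_layernorm.weight",
}


def remap_state_dict_keys(state_dict: dict, config_path: str) -> dict:
    """Rename RMSNorm keys only when the config identifies a Yi checkpoint."""
    config = json.loads(Path(config_path).read_text())
    if "YiForCausalLM" not in config.get("architectures", []):
        return state_dict  # not a Yi model; leave keys untouched

    remapped = {}
    for key, tensor in state_dict.items():
        for yi_suffix, llama_suffix in YI_TO_LLAMA_KEYS.items():
            if key.endswith(yi_suffix):
                key = key[: -len(yi_suffix)] + llama_suffix
                break
        remapped[key] = tensor
    return remapped
```

With a remapping like this in place, the checkpoint can be handled by tooling that expects the standard Llama tensor names.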
Key takeaways:
- The Yi model uses the same architecture as the Llama model, with the exception of two renamed tensors.
- There is a request for the Yi team to rename these tensors to match the Llama architecture, to facilitate wider adoption and compatibility with existing tools.
- There are concerns about licensing and about respecting the intent of Yi's license if the model is officially released in Llama format.
- The tokenizer model in Yi is new and has a larger vocabulary, but it loads without issue and appears to be similar to Llama's (see the sketch after this list).
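A quick way to check the vocabulary-size claim is to load both tokenizer.model files with sentencepiece and compare their sizes. The file paths below are placeholders, and the expected sizes are assumptions based on the "twice as large" remark.

```python
import sentencepiece as spm

# Load both SentencePiece models and compare vocabulary sizes.
# Paths are placeholders; point them at the respective tokenizer.model files.
yi_sp = spm.SentencePieceProcessor(model_file="yi/tokenizer.model")
llama_sp = spm.SentencePieceProcessor(model_file="llama/tokenizer.model")

print("Yi vocab size:   ", yi_sp.get_piece_size())    # roughly double Llama's
print("Llama vocab size:", llama_sp.get_piece_size())  # typically 32000
```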