The author also discusses a refactoring in the commit, where the keys for the RMSNorm module are renamed when the config file identifies the architecture as "YiForCausalLM". The tokenizer model is new but similar to OpenLlama's, and the vocabulary is twice as large. Despite these changes, the author concludes that the architecture appears to be the same as Llama's, but acknowledges that additional changes may have been overlooked.
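The following is a minimal sketch of the kind of conditional key remapping described above. The specific tensor-name suffixes and the helper function are illustrative assumptions, not the actual code from the commit; the only detail taken from the source is that the rename is applied when the config reports "YiForCausalLM".

```python
import json
from pathlib import Path

# Assumed Yi -> Llama key suffixes for the RMSNorm tensors (illustrative only).
YI_TO_LLAMA_KEYS = {
    ".ln1.weight": ".input_layernorm.weight",
    ".ln2.weight": ".post_attention_layernorm.weight",
}


def remap_state_dict_keys(state_dict: dict, config_path: str) -> dict:
    """Rename RMSNorm keys only when the config identifies a Yi checkpoint."""
    config = json.loads(Path(config_path).read_text())
    if "YiForCausalLM" not in config.get("architectures", []):
        return state_dict  # not a Yi model; leave keys untouched

    remapped = {}
    for key, tensor in state_dict.items():
        for yi_suffix, llama_suffix in YI_TO_LLAMA_KEYS.items():
            if key.endswith(yi_suffix):
                key = key[: -len(yi_suffix)] + llama_suffix
                break
        remapped[key] = tensor
    return remapped
```

With a remapping like this in place, the checkpoint can be handled by tooling that expects the standard Llama tensor names.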
Key takeaways:
- The Yi model uses the same architecture as the Llama model, with the exception of two renamed tensors.
- There is a request for the Yi team to rename these tensors to match the Llama architecture, to facilitate wider adoption and compatibility with existing tools.
- There are concerns about licensing and about respecting the intent of Yi's license if the model is officially released in Llama format.
- The tokenizer model in Yi is new and has a larger vocabulary, but it loads without issue and appears to be similar to Llama's (see the sketch after this list).
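A quick way to check the vocabulary-size claim is to load both tokenizer.model files with sentencepiece and compare their sizes. The file paths below are placeholders, and the expected sizes are assumptions based on the "twice as large" remark.

```python
import sentencepiece as spm

# Load both SentencePiece models and compare vocabulary sizes.
# Paths are placeholders; point them at the respective tokenizer.model files.
yi_sp = spm.SentencePieceProcessor(model_file="yi/tokenizer.model")
llama_sp = spm.SentencePieceProcessor(model_file="llama/tokenizer.model")

print("Yi vocab size:   ", yi_sp.get_piece_size())    # roughly double Llama's
print("Llama vocab size:", llama_sp.get_piece_size())  # typically 32000
```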