The researchers compared MoRA and LoRA models across a range of tasks. On memorization tasks, MoRA significantly outperformed LoRA and came close to the performance of a fully fine-tuned model while using fewer parameters and training steps. On instruction tuning and mathematical reasoning, MoRA performed almost on par with LoRA, and on continual pretraining in the biomedical and financial domains it again outperformed LoRA. The researchers have released an open-source implementation of MoRA, which could be a valuable tool for enterprise applications.
Key takeaways:
- Researchers from Microsoft and Beihang University have introduced MoRA, a new technique for fine-tuning large language models (LLMs) that is more cost-effective than full fine-tuning and addresses limitations of other parameter-efficient methods such as LoRA.
- MoRA uses a single square matrix instead of LoRA's pair of low-rank matrices, which lets it learn new knowledge more effectively than a LoRA adapter with the same number of trainable parameters (see the sketch after this list).
- In tests, MoRA significantly outperformed LoRA on memorization tasks and performed almost on par with LoRA on instruction tuning and mathematical reasoning tasks.
- The researchers have released an open-source implementation of MoRA, which could be a valuable tool for enterprise applications that want to add new knowledge to base models.
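To make the square-matrix versus low-rank distinction concrete, below is a minimal PyTorch sketch, not the authors' released implementation, contrasting a LoRA-style adapter that factors the weight update into two low-rank matrices with a MoRA-style adapter that packs the same trainable-parameter budget into one square matrix. The chunk-sum compression and tiling decompression used here are illustrative stand-ins for the non-parameterized operators described in the paper, and the class names and exact sizing are assumptions.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F


class LoRALayer(nn.Module):
    """LoRA-style update: two low-rank matrices A (r x d_in) and B (d_out x r)."""

    def __init__(self, d_in: int, d_out: int, r: int):
        super().__init__()
        self.A = nn.Parameter(torch.randn(r, d_in) * 0.01)
        self.B = nn.Parameter(torch.zeros(d_out, r))  # zero init: no update at the start

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # B @ A has rank at most r, which limits how much new knowledge it can store.
        return x @ self.A.T @ self.B.T


class MoRALayer(nn.Module):
    """MoRA-style update: one square matrix sized to roughly the same parameter
    budget as LoRA, giving a much higher attainable rank."""

    def __init__(self, d_in: int, d_out: int, r: int):
        super().__init__()
        # LoRA trains (d_in + d_out) * r weights; a square matrix with the same
        # budget has side length r_hat = floor(sqrt((d_in + d_out) * r)).
        self.r_hat = int(math.sqrt((d_in + d_out) * r))
        self.M = nn.Parameter(torch.zeros(self.r_hat, self.r_hat))
        self.d_in, self.d_out = d_in, d_out

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Compress d_in -> r_hat by splitting the input into r_hat-sized chunks
        # and summing them (an illustrative non-parameterized compression).
        n_chunks = math.ceil(self.d_in / self.r_hat)
        x = F.pad(x, (0, n_chunks * self.r_hat - self.d_in))
        x = x.reshape(*x.shape[:-1], n_chunks, self.r_hat).sum(dim=-2)
        # Apply the square matrix: the update can have rank up to r_hat.
        h = x @ self.M.T
        # Decompress r_hat -> d_out by tiling and truncating.
        reps = math.ceil(self.d_out / self.r_hat)
        return h.repeat(*([1] * (h.dim() - 1)), reps)[..., : self.d_out]


if __name__ == "__main__":
    d, r = 4096, 8
    lora, mora = LoRALayer(d, d, r), MoRALayer(d, d, r)
    x = torch.randn(2, 16, d)
    print(sum(p.numel() for p in lora.parameters()))  # 65536
    print(sum(p.numel() for p in mora.parameters()))  # 65536 (r_hat = 256)
    print(lora(x).shape, mora(x).shape)               # both torch.Size([2, 16, 4096])
```

In this sketch both adapters train roughly 65,000 weights for a 4,096-dimensional layer, but the LoRA update is capped at rank 8 while the MoRA square matrix can reach rank 256, which is the intuition behind its stronger performance on memorization-heavy tasks.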