The LeoLM models serve as a proof of concept for language acquisition in pretrained models and form the first openly available German Foundation Language Model suite that meets today's standards. The work makes several key contributions: the release of a suite of German Foundation Language Models under a permissive license, the transfer of a thorough evaluation approach for base and chat models into German, and the demonstration that large-scale continued pretraining is possible without significant forgetting or loss of previous capabilities. The article concludes with acknowledgments to those who contributed to the project.
Key takeaways:
- LeoLM, a suite of German Foundation Language Models, has been introduced. These models are trained on a large-scale, high-quality German text corpus and are built on Llama-2.
- The models were trained with a Stage 2 pretraining methodology: LeoLM is initialized from Llama-2 weights and training then continues on a large German text corpus (see the training sketch after this list).
- LeoLM models have been evaluated on a set of English benchmarks translated into German. The results show that the models outperform the base Llama-2 models on German tasks, while their scores on English tasks drop only slightly (see the evaluation sketch after this list).
- The LeoLM model suite is a proof of concept of language acquisition for pretrained models and is the first openly available German Foundation Language Model suite that meets today's standards.
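To make the Stage 2 methodology concrete, the following is a minimal sketch of continued causal-LM pretraining using the Hugging Face `transformers` and `datasets` libraries. The corpus file, hyperparameters, and output directory are illustrative assumptions; the actual LeoLM training ran as a large-scale distributed job rather than this single-process `Trainer` loop.

```python
# Minimal sketch of Stage 2 (continued) pretraining. The dataset path and
# hyperparameters below are placeholders, not LeoLM's actual configuration.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# Initialize from pretrained Llama-2 weights instead of training from scratch.
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
tokenizer.pad_token = tokenizer.eos_token

# Placeholder for a large, high-quality German text corpus.
corpus = load_dataset("text", data_files={"train": "german_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=2048)

tokenized = corpus["train"].map(tokenize, batched=True, remove_columns=["text"])

# Standard causal-LM objective; training simply continues on German data.
trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="leolm-stage2",          # hypothetical output path
        per_device_train_batch_size=1,
        gradient_accumulation_steps=64,
        learning_rate=2e-5,                 # illustrative; lower than Stage 1
        num_train_epochs=1,
        bf16=True,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

The key design point is that no architecture or tokenizer change is required: the same next-token objective is kept, and only the data distribution shifts toward German.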
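The translated-benchmark evaluation can likewise be illustrated with a small log-likelihood scoring routine, similar in spirit to how common evaluation harnesses score multiple-choice tasks: each answer option is scored under the model and the most likely one is selected. The checkpoint name and the example item below are assumptions for illustration only.

```python
# Minimal sketch of multiple-choice scoring on a German-translated item.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LeoLM/leo-hessianai-7b"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.eval()

def choice_logprob(prompt: str, choice: str) -> float:
    """Sum of log-probabilities the model assigns to `choice` given `prompt`."""
    prompt_len = tokenizer(prompt, return_tensors="pt").input_ids.shape[1]
    full_ids = tokenizer(prompt + choice, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    # Position t predicts token t+1, so shift targets by one.
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    targets = full_ids[0, 1:]
    start = prompt_len - 1  # first continuation token in the shifted targets
    return log_probs[start:].gather(1, targets[start:, None]).sum().item()

# Hypothetical translated benchmark item (multiple choice).
frage = "Frage: Was ist die Hauptstadt von Deutschland?\nAntwort:"
optionen = [" Berlin", " München", " Hamburg"]
scores = [choice_logprob(frage, o) for o in optionen]
print(optionen[scores.index(max(scores))])  # expected: " Berlin"
```

Accuracy on a full benchmark is then simply the fraction of items where the highest-scoring option matches the gold answer.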