The Granite models are expected to save programmers time and effort by automating tasks such as creating tests and finding and fixing bugs. IBM also sees business benefits in these models because their licensing and training methods are clearly documented and the training data has been cleaned and filtered for inappropriate language. With this release, IBM aims to lower the barrier to entry for developers who want to use LLMs and to provide open-source tools for improving software development work.
Key takeaways:
- IBM has open-sourced its large language models (LLMs), known as the Granite Code Base models, which are trained on 3 to 4 trillion tokens of code data and code-related natural language datasets.
- The Granite models are released under the Apache 2.0 license for both research and commercial use, a step that many other major LLM providers have not taken.
- These models, trained on code from 116 programming languages, are designed specifically for programming and can be used for a range of developer tasks.
- IBM anticipates that developers will use the Granite LLMs to automate tasks such as generating unit tests, writing documentation, and running vulnerability tests, saving time and energy.