IBM open-sources its Granite AI models

IBM has open-sourced its large language models (LLMs), known as the Granite Code Base models, which are trained on 3 trillion to 4 trillion tokens of code data and code-related natural-language datasets. The models, released under the Apache 2.0 license for both research and commercial use, are designed for programming and support a range of developer tasks. IBM has used these LLMs in its Watsonx Code Assistant products, and they are now available for use with IBM and Red Hat's InstructLab.

The Granite models are expected to help programmers save time and energy by automating tasks such as creating tests and finding and fixing bugs. IBM also sees business benefits in these models because their licensing and training methods are clearly documented and the training data has been cleaned and filtered to remove inappropriate language. With this move, IBM aims to lower the barrier to entry for developers who want to use LLMs and to provide open-source tools for improving software development work.

Key takeaways

IBM has open-sourced its large language models (LLMs), known as the Granite Code Base models, which are trained on 3 trillion to 4 trillion tokens of code data and code-related natural-language datasets.
The Granite models are released under the Apache 2.0 license for both research and commercial use, a step that many other major LLMs have not taken.
These models, which have been trained on code from 116 programming languages, are designed for specific applications such as programming, and can be used for a range of developer tasks.
IBM anticipates that developers will use the Granite LLMs to automate tasks such as generating unit tests, writing documentation, and running vulnerability tests, saving time and energy.
