The article argues that a public AI option built on genuinely open source models is needed, while recognizing that partially open models deserve a definition of their own. It acknowledges the value of privacy-preserving, federated approaches to training machine learning models, in which data never leaves its owners. The Open Source Initiative (OSI) defends its stance on the grounds that it aims to enable open source AI in fields where data cannot legally be shared, such as medical AI. For such partially open models, the article proposes the term "open weights" rather than "open source."
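As a rough illustration of the federated approach the article mentions, the following is a minimal sketch of federated averaging on a toy linear-regression task. The task, client setup, and all names here are illustrative assumptions, not details from the article; the point is only that clients send model parameters to a coordinating server, never their raw data.

```python
# Minimal federated-averaging (FedAvg-style) sketch: each client trains
# on its own private data and shares only updated parameters. The toy
# task (y = 3x + 1) and all names are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def make_client_data(n=50):
    """Synthetic private dataset a client never shares."""
    x = rng.uniform(-1, 1, size=(n, 1))
    y = 3 * x + 1 + rng.normal(scale=0.1, size=(n, 1))
    return x, y

def local_update(w, b, data, lr=0.1, epochs=5):
    """One client's gradient steps on its own data; only the updated
    parameters (not the data) are returned to the server."""
    x, y = data
    w = w.copy()
    for _ in range(epochs):
        err = x @ w + b - y
        w -= lr * (x.T @ err) / len(x)
        b -= lr * err.mean()
    return w, b

clients = [make_client_data() for _ in range(5)]
w, b = np.zeros((1, 1)), 0.0

for _ in range(20):
    updates = [local_update(w, b, data) for data in clients]
    # Server aggregates by averaging parameters; raw data never moves.
    w = np.mean([u[0] for u in updates], axis=0)
    b = float(np.mean([u[1] for u in updates]))

print(f"learned w={w.item():.2f}, b={b:.2f}")  # approaches w=3, b=1
```

Even in this toy form, the privacy property the article alludes to is visible: the server sees only averaged parameters, which is why such methods are attractive where data sharing is legally restricted.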
Key takeaways:
- The Open Source Initiative's definition of open source AI is criticized for permitting models whose training data, training mechanisms, and development process remain secret.
- Many AI models are open source in name only, creating confusion, and the OSI is accused of bending to industry players who want the open source label while preserving corporate secrecy.
- The author argues for a public AI option, with genuine open source as a necessary component, while acknowledging that partially open models still need a definition of their own.
- The OSI defends allowing some training data to be withheld from open source AI, citing legal restrictions on data sharing in fields like medical AI and the need to protect sensitive personal information and Indigenous knowledge.