However, the author expresses disappointment over the proprietary nature of these models and the lack of transparency about their training data. The article highlights growing concerns over the use of unlicensed copyrighted data to train these models and the ethical implications of that practice. The author advocates for training transparency and for developing models on public domain or licensed content.
Key takeaways:
- Four new models from different vendors have been released in the past month that benchmark near or above GPT-4: Google Gemini 1.5, Mistral Large, Claude 3 Opus, and Inflection-2.5.
- None of these new models is openly licensed or has publicly available weights, and none is transparent about its training data.
- There is growing concern over the use of unlicensed copyrighted data to train these models and over the lack of transparency in their training process.
- Despite this lack of transparency, these models are widely used, and the author emphasizes that understanding how a model was trained is important when judging its suitability for a given task.