The new model was built from scratch over three months, and its 8K context length opens up applications in legal document analysis, medical research, literary analysis, financial forecasting, and conversational AI. It is available in two versions: a base model for tasks requiring higher accuracy, and a small model for lightweight applications. Jina AI's future plans include publishing an academic paper detailing the technical aspects of `jina-embeddings-v2`, developing an OpenAI-like embeddings API platform, and launching German-English models.
- Jina AI has launched its second-generation text embedding model, `jina-embeddings-v2`, which supports an 8K (8,192-token) context length, matching OpenAI's proprietary model in capability and performance.
- The `jina-embeddings-v2` model outperforms its OpenAI counterpart in several areas, including Classification Average, Reranking Average, Retrieval Average, and Summarization Average.
- The new model's 8K context length enables it to be used in various industry applications such as legal document analysis, medical research, literary analysis, financial forecasting, and conversational AI.
- Jina AI plans to publish an academic paper detailing the technical intricacies and benchmarks of `jina-embeddings-v2`, develop an OpenAI-like embeddings API platform, and launch German-English models.
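Whichever version is used, downstream tasks such as retrieval reduce to comparing the vectors the model produces, typically via cosine similarity. A minimal sketch of that comparison, using toy 3-dimensional vectors in place of real model outputs (the actual embeddings are much higher-dimensional):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy vectors standing in for embeddings of a document and two queries.
doc = np.array([0.2, 0.8, 0.1])
query_related = np.array([0.25, 0.75, 0.05])
query_unrelated = np.array([0.9, 0.05, 0.4])

# The query whose embedding points in a similar direction scores higher,
# which is how embedding-based search ranks candidate documents.
print(cosine_similarity(doc, query_related))
print(cosine_similarity(doc, query_unrelated))
```

With a long-context model like `jina-embeddings-v2`, the same comparison applies, but each vector can summarize up to 8,192 tokens of text rather than a short passage.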