Cloudflare plans to deploy GPUs across its data centers worldwide by the end of 2024, positioning it as the most widely distributed cloud AI inference platform. It also plans to launch its next generation of compute servers with GPUs in Q2 2024. AI Gateway now supports Anthropic, Azure, AWS Bedrock, Google Vertex, and Perplexity, with persistent logs, custom metadata, and secrets management planned for Q2 2024. Vectorize, which lets developers persist embeddings and query for the closest matches (sketched below), is slated for general availability in June 2024.
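To make the "persist embeddings and query for the closest match" workflow concrete, here is a minimal sketch that queries a Vectorize index over Cloudflare's REST API from Python. The account ID, API token, index name, and vector values are placeholders, and the exact endpoint path and payload fields are assumptions based on Cloudflare's API conventions; check the Vectorize documentation for the current shape.

```python
import os
import requests

ACCOUNT_ID = os.environ["CF_ACCOUNT_ID"]   # placeholder: your Cloudflare account ID
API_TOKEN = os.environ["CF_API_TOKEN"]     # placeholder: token with Vectorize access
INDEX_NAME = "product-embeddings"          # hypothetical index created beforehand

# Assumed endpoint shape for querying a Vectorize index.
url = (
    f"https://api.cloudflare.com/client/v4/accounts/"
    f"{ACCOUNT_ID}/vectorize/indexes/{INDEX_NAME}/query"
)

# Send an embedding vector; Vectorize returns the nearest stored vectors.
payload = {
    "vector": [0.12, 0.45, 0.91],  # in practice, the embedding of the user's query
    "topK": 3,                     # number of closest matches to return
}

resp = requests.post(
    url,
    json=payload,
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    timeout=30,
)
resp.raise_for_status()
for match in resp.json().get("result", {}).get("matches", []):
    print(match.get("id"), match.get("score"))
```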
Key takeaways:
- Cloudflare's Workers AI inference platform is now generally available, with improved performance and reliability and lower costs on popular models (see the inference sketch after this list).
- Cloudflare is continuing its partnership with Hugging Face, adding four more models to the Workers AI catalog and making it easier to run Hugging Face models on Workers AI.
- Workers now supports Python, letting developers write Cloudflare Workers in the world's second most popular programming language (a minimal example follows this list).
- Cloudflare's AI Gateway now supports more providers, including Anthropic, Google Vertex, and Perplexity, with persistent logs, custom metadata, and secrets management planned for Q2 2024 (a gateway routing sketch follows below).
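For the Workers AI takeaway, here is a minimal sketch that runs one of the hosted models over the REST API from Python. The model name is just an example from the catalog, and the account ID and token are placeholders; the endpoint follows Cloudflare's documented `accounts/{account_id}/ai/run/{model}` pattern, but verify it against the current Workers AI docs.

```python
import os
import requests

ACCOUNT_ID = os.environ["CF_ACCOUNT_ID"]  # placeholder
API_TOKEN = os.environ["CF_API_TOKEN"]     # placeholder: token with Workers AI access
MODEL = "@cf/meta/llama-2-7b-chat-int8"    # example model from the Workers AI catalog

url = f"https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai/run/{MODEL}"

# Run a single chat-style inference request against the hosted model.
resp = requests.post(
    url,
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    json={"messages": [{"role": "user", "content": "Summarize what Workers AI does."}]},
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["result"]["response"])  # generated text
```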
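And here is what a Python Worker looks like, as a minimal sketch based on the shape Cloudflare documented for the open beta (an `on_fetch` handler plus the `python_workers` compatibility flag in `wrangler.toml`); treat the details as assumptions and confirm against the current Python Workers docs.

```python
# src/entry.py — a minimal Python Worker, deployed with wrangler and the
# "python_workers" compatibility flag enabled in wrangler.toml.
from js import Response


async def on_fetch(request, env):
    # The fetch handler receives the incoming Request and the environment
    # bindings, and must return a Response.
    return Response.new("Hello from a Python Worker!")
```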
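Using AI Gateway with one of these providers typically amounts to swapping the provider's base URL for a gateway URL, so requests are proxied through Cloudflare and show up in the gateway's logs and analytics. Below is a hedged sketch for Anthropic; the gateway URL pattern, gateway name, and model name are assumptions, so check the AI Gateway docs for the exact form for each provider.

```python
import os
import requests

ACCOUNT_ID = os.environ["CF_ACCOUNT_ID"]            # placeholder
GATEWAY = "my-gateway"                               # hypothetical gateway created in the dashboard
ANTHROPIC_API_KEY = os.environ["ANTHROPIC_API_KEY"]  # your provider key, unchanged

# Assumed AI Gateway URL pattern: Cloudflare forwards the call to Anthropic
# and records logs/metrics for the gateway.
base_url = f"https://gateway.ai.cloudflare.com/v1/{ACCOUNT_ID}/{GATEWAY}/anthropic"

resp = requests.post(
    f"{base_url}/v1/messages",
    headers={
        "x-api-key": ANTHROPIC_API_KEY,
        "anthropic-version": "2023-06-01",
    },
    json={
        "model": "claude-3-haiku-20240307",  # example model name
        "max_tokens": 256,
        "messages": [{"role": "user", "content": "Hello through AI Gateway"}],
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["content"][0]["text"])
```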