OpenAI's Sora video generator, despite its initial appeal, still has a way to go in terms of realism and its grasp of physics. However, it could be useful for producing "just in time" or "just good enough" videos, such as short-run ads for social media. Meanwhile, Google's Gemini 1.5 Pro, a new version of its large language model, offers a one-million-token context window, and developers have used it for a variety of tasks, from analyzing video content to reading through company reports and analyzing computer code.
Key takeaways:
- Google has announced a set of new large language models called 'Gemma', which are smaller than the previous Gemini models and can run on a laptop or desktop workstation, or in Google Cloud.
- OpenAI's Sora video generator, which uses a hybrid architecture, still has some issues with the realism of its generated content, particularly in the movement of objects and the generation of human hands.
- Google's new version of its Gemini LLM, Gemini 1.5 Pro, offers a one-million-token context window, and developers have been using it to perform tasks such as analyzing company reports and answering detailed questions about video content.
- Other AI news includes the New York Times' plans to debut a new generative AI ad tool, Chinese startup Moonshot AI raising over $1 billion, and OpenAI completing a deal that values the company at $80 billion.