The article also highlights the work of Ruslan Salakhutdinov, a professor at Carnegie Mellon University and former director of AI research at Apple, who believes that AI will soon be able to perform useful tasks for users. His team has developed a virtual testing ground called VisualWebArena to train and test AI agents. Despite some failures, these AI agents have shown promising results in performing complex tasks, indicating a significant potential for AI to make digital life easier in the near future.
Key takeaways:
- An experimental AI voice helper called vimGPT, built by developer Ishan Shah, has shown impressive capabilities in navigating the web and interacting with online forms, outperforming current virtual assistants like Siri, Alexa, and Google Assistant.
- vimGPT is built on GPT-4V, a multimodal version of OpenAI’s language model, which allows it to analyze requests and determine actions more reliably than text-only software.
- Ruslan Salakhutdinov, a professor at Carnegie Mellon University and former director of AI research at Apple, believes that the next evolution of virtual assistants will be AI agents that can perform useful tasks on the web. His team has developed a virtual testing ground called VisualWebArena for honing the skills of such AI helpers.
- Despite their impressive capabilities, AI agents still face challenges: in the CMU team's experiments, agents achieved complex objectives only 16% of the time, compared with an 88% success rate for humans.