
WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning

Nov 05, 2024 - arxiv.org
The paper introduces WebRL, a self-evolving online curriculum reinforcement learning framework, designed to train high-performance web agents using open Large Language Models (LLMs). WebRL addresses three main challenges in building LLM web agents: scarcity of training tasks, sparse feedback signals, and policy distribution drift in online learning. The framework incorporates a self-evolving curriculum, a robust outcome-supervised reward model (ORM), and adaptive reinforcement learning strategies to ensure consistent improvements.

WebRL has been applied to transform open Llama-3.1 and GLM-4 models into proficient web agents, significantly improving their success rates on WebArena-Lite. The open models trained with WebRL outperformed GPT-4-Turbo and GPT-4o, as well as previous state-of-the-art web agents trained on open LLMs. The study demonstrates WebRL's effectiveness in bridging the gap between open and proprietary LLM-based web agents, suggesting a path towards more accessible and powerful autonomous web interaction systems.

Key takeaways:

  • The paper introduces WebRL, a self-evolving online curriculum reinforcement learning framework for training high-performance web agents using open Large Language Models (LLMs).
  • WebRL addresses key challenges in building LLM web agents, such as scarcity of training tasks, sparse feedback signals, and policy distribution drift in online learning.
  • WebRL was applied to open Llama-3.1 and GLM-4 models, significantly improving their success rates on WebArena-Lite and outperforming GPT-4-Turbo, GPT-4o, and previous state-of-the-art web agents trained on open LLMs.
  • The findings demonstrate WebRL's effectiveness in bridging the gap between open and proprietary LLM-based web agents, suggesting potential for more accessible and powerful autonomous web interaction systems.
