Language Agents with Unified Data Formats, Modular Design, and Open-Source LLMs

Apr 02, 2024 - allenai.github.io
The article introduces Lumos, a language agent developed by the Allen Institute for AI, University of California, Los Angeles, and the University of Washington. Lumos uses a unified data format and a modular framework, which includes planning, grounding, and execution modules. It is trained with approximately 40,000 diverse high-quality subgoal/action annotations and is designed to support a range of interactive tasks. Lumos has demonstrated competitive performance, outperforming GPT-series agents on web/complex QA tasks and larger open agents on math tasks.

The Lumos architecture consists of a planning module that decomposes complex tasks into high-level subgoals, a grounding module that converts those subgoals into executable actions, and an execution module that carries the actions out against external tools and environments. The article also discusses two formulations, Lumos-Iterative (which generates one subgoal at a time, conditioning on execution feedback) and Lumos-Onetime (which generates all subgoals in a single pass), and the use of large language models (LLMs) to convert ground-truth intermediate reasoning steps from existing benchmarks into high-quality training annotations. Lumos has shown superior performance compared to baseline formulations and has demonstrated its generalizability on unseen tasks.
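To make the three-module pipeline concrete, here is a minimal sketch of the plan, ground, and execute loop in the Lumos-Iterative style: the planner emits one subgoal at a time, the grounder rewrites it as a tool call, and the executor runs that call and feeds the observation back to the planner. The function signatures and the "ToolName(arg)" action convention are illustrative assumptions, not the authors' actual interfaces.

```python
# Sketch of a modular language agent loop (Lumos-Iterative style).
# All names and conventions here are assumptions for illustration.
from typing import Callable, Optional

Planner = Callable[[str, list[str]], Optional[str]]  # (task, history) -> next subgoal, or None when done
Grounder = Callable[[str], str]                      # subgoal -> action string, e.g. "Search(query)"
Tool = Callable[[str], str]                          # argument -> observation


def run_agent(task: str, plan: Planner, ground: Grounder,
              tools: dict[str, Tool], max_steps: int = 10) -> list[str]:
    """Iteratively plan, ground, and execute until the planner stops."""
    history: list[str] = []
    for _ in range(max_steps):
        subgoal = plan(task, history)            # planning module (a fine-tuned LLM)
        if subgoal is None:                      # planner signals the task is solved
            break
        action = ground(subgoal)                 # grounding module (a fine-tuned LLM)
        name, _, arg = action.partition("(")     # parse "ToolName(arg)"
        observation = tools[name](arg.rstrip(")"))  # execution module: call the external tool
        history.append(f"{subgoal} -> {action} = {observation}")
    return history
```

In the Lumos-Onetime formulation, the planner would instead produce the full subgoal list in a single pass, and the loop would walk that fixed plan without feeding observations back into planning.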

Key takeaways:

  • Lumos is a language agent that unifies a suite of complex interactive tasks and achieves competitive performance with GPT-4/3.5-based and larger open-source agents. It consists of planning, grounding, and execution modules.
  • Lumos is trained with approximately 40K diverse, high-quality subgoal/action annotations, converted with GPT-4 from ground-truth reasoning steps in existing benchmarks.
  • Lumos outperforms GPT-4/3.5-based agents on complex QA and web tasks, and larger language agents on math tasks. It also surpasses larger open LLM agents and domain-specific agents by a large margin on an unseen task, WebShop.
  • The Lumos training annotations are among the largest resources for language-agent fine-tuning, covering web, complex QA, and math task types. They yield better performance than annotations produced by the Self-Instruct method and have passed rigorous execution sanity checks; a sketch of one possible record shape follows this list.
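The unified data format lends itself to a concrete illustration. Below is one possible shape for a single subgoal/action annotation record; the field names, example task, and action strings are assumptions made for the sketch, not the released dataset's actual schema.

```python
# Hypothetical unified annotation record, converted (per the article, with
# GPT-4) from a benchmark's ground-truth reasoning steps. Field names and
# action strings are illustrative assumptions only.
annotation = {
    "task": "When was the director of Inception born?",
    "steps": [
        {"subgoal": "Identify the director of the film Inception.",
         "action": "KnowledgeQuery(Inception)"},
        {"subgoal": "Find that director's date of birth.",
         "action": "QA(retrieved_context, 'When was this person born?')"},
    ],
}
```

Keeping every task in one schema like this is what lets a single planning module and a single grounding module be fine-tuned across web, complex QA, and math tasks at once.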