In a broad sense, intelligent agents are autonomous problem solvers endowed with perception, judgment, and action capabilities, based on data gathered from their surroundings. Recent applications of this idea have shown promise in developing language agents that can carry out a wide range of complex tasks in various contexts using natural language, especially when these agents are built on large language models (LLMs). Because they draw on the human knowledge distilled into LLMs, agents of this type can mimic human thought and language, which lets them use tools flexibly, adapt to new situations, reason linguistically, and form multi-agent systems on the fly.
To properly form the foundation of language agents, LLMs must grasp human interaction, reasoning, and planning, and remain grounded in the relevant environments. Their natural language capabilities let LLMs closely mimic human conversation, thinking, and planning. However, acting in an environment is typically accomplished through general-purpose code or domain-specific APIs, such as those used to drive web browsers, interact with operating system command-line terminals, and control robotic arms.
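As a concrete, hypothetical illustration of this kind of grounding, the sketch below shows a single perceive-act step in which a model-proposed shell command is executed and its output is returned as the agent's textual observation. The `run_shell` helper and the hard-coded `action` string are illustrative assumptions, not part of the paper.

```python
import subprocess

def run_shell(command: str) -> str:
    """Execute a model-proposed shell command and return its output
    as the observation fed back into the agent's context."""
    result = subprocess.run(
        command, shell=True, capture_output=True, text=True, timeout=10
    )
    return result.stdout if result.returncode == 0 else result.stderr

# One step of a hypothetical perceive-act loop: the LLM emits an
# action string, the environment executes it, and the resulting
# text becomes the next observation.
action = "echo hello-agent"   # stand-in for an LLM-generated action
observation = run_shell(action)
```

In a full agent loop, `observation` would be appended to the prompt before the model proposes its next action.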
To fill this gap, a new study by the University of Hong Kong, XLang Lab, Salesforce Research, Sea AI Lab, the University of Washington, and MIT CSAIL presents Lemur and Lemur-Chat, two state-of-the-art, openly available models pre-trained and fine-tuned to harmonize text and code. Through carefully crafted pre-training and instruction fine-tuning stages, the researchers built on Llama-2-70B. To strengthen coding ability while retaining natural language performance, they constructed a code-centric corpus based on The Stack, comprising 90 billion tokens with a 10:1 code-to-text ratio; the resulting model is Lemur. To create the instruction-following model, Lemur-Chat, they then fine-tuned Lemur on around 100K instruction instances spanning both text and code. Extensive evaluation across 8 textual and coding benchmarks shows Lemur and Lemur-Chat to be the most well-rounded open-source models.
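As a toy sketch of what building such a fixed-ratio mixture might look like (the `mix_corpora` helper and its sampling scheme are illustrative assumptions, not the paper's actual data pipeline):

```python
import random

def mix_corpora(code_docs, text_docs, ratio=10, seed=0):
    """Interleave two document lists so that, in expectation, `ratio`
    code documents appear for each text document -- a toy stand-in
    for assembling a fixed-ratio pre-training mixture."""
    rng = random.Random(seed)
    mixed = []
    while code_docs or text_docs:
        take_code = bool(code_docs) and (
            not text_docs or rng.random() < ratio / (ratio + 1)
        )
        mixed.append((code_docs if take_code else text_docs).pop())
    return mixed

# Toy demo: 40 code snippets and 4 text passages, mixed roughly 10:1.
mixture = mix_corpora(["code"] * 40, ["text"] * 4)
```

Real pipelines operate on token budgets rather than document counts, but the idea of sampling the two sources at a fixed ratio is the same.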
In addition, this effort sets out to provide agent benchmarks for evaluating the core competencies of language agents in various settings. The team focuses particularly on tool use and on the ability to ground behavior in both environmental and social feedback. They also investigate the difficulties of real-world, partially observable situations, where an agent must operate on incomplete information and take additional actions to fill in the gaps. Experiments show that Lemur-Chat outperforms other open-source models on 12 of 13 agent benchmarks. This demonstrates how combining natural language and coding abilities lets Lemur-Chat narrow the performance gap between open-source and commercial language agents.
The results of these tests demonstrate the importance of combining linguistic and computational skills in agent settings. Models such as Llama-2-70B-Chat, which excel at natural language processing but struggle with coding, can still use basic tools effectively because the action space is constrained and the cost of invoking such tools is low. In contrast, in sophisticated decision-making scenarios such as web browsing and household navigation, the action space is typically enormous, and models with strong coding ability have an edge in constructing complex executable action sequences. In sum, Lemur's superior performance stems from its strength in both natural language and programming. By shedding light on how to optimize the synergy between natural and programming languages, this study lays the groundwork for sophisticated language agents that function well across a wide range of settings.
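The contrast above can be sketched in code. Below, a single constrained tool call (a toy `calculator`) represents the low-effort tool use that weaker coders can manage, while `composed_action` represents an agent-generated program that chains several steps, the kind of executable action sequence where coding-strong models have an edge. Both functions are illustrative assumptions, not tools from the paper.

```python
# A constrained tool call: one tool, one argument, cheap to invoke.
def calculator(expression: str) -> float:
    """Hypothetical single-purpose tool the agent can call."""
    return eval(expression, {"__builtins__": {}})

simple_result = calculator("2 + 3 * 4")   # one-shot tool use

# A composed executable action: the agent writes a small program
# that filters, transforms, and aggregates in one generated snippet,
# exercising a much larger action space.
def composed_action(items):
    prices = [p for p in items if p > 0]        # filter invalid entries
    discounted = [p * 0.9 for p in prices]      # apply a transformation
    return round(sum(discounted), 2)            # aggregate the result

total = composed_action([10.0, -1.0, 20.0])
```

The first style suffices when the action space is small; the second rewards models that can reliably generate correct multi-step code.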
Check out the Paper and Github. All credit for this research goes to the researchers on this project.
Dhanshree Shenwai is a computer science engineer with experience at FinTech companies spanning the financial, cards and payments, and banking domains, and a keen interest in applications of AI. She is enthusiastic about exploring new technologies and advancements that make everyone's life easier in today's evolving world.