Several Large Language Models (LLMs), such as GPT-4, PaLM, and LLaMA, have shown impressive performance on a variety of reasoning tasks. To further improve their reasoning, researchers have pursued two broad directions: more effective prompting methods and larger model sizes. Existing prompting approaches fall into two categories: (i) single-query methods, which complete the reasoning process in one LLM call and typically rely on prompt engineering; and (ii) multi-query methods, which issue multiple LLM calls to generate different plausible reasoning paths, breaking complex problems down into smaller subproblems. Examples of the latter include Least-to-Most, Tree of Thoughts (ToT), and Graph of Thoughts (GoT).
However, both types of methods have limitations:
First, single-query reasoning systems typically depend on prior assumptions or hand-crafted exemplars of the reasoning process, so designing them manually, task by task, is impractical. Second, multi-query reasoning systems are computationally expensive, because they recursively expand reasoning paths to discover a suitable reasoning structure for each individual task. Third, both kinds of systems are constrained by their fixed reasoning structures and exemplars: they fail to distill general, high-level guidelines, or thoughts, from previously solved tasks, even though such guidelines could improve efficiency and accuracy when solving similar problems.
To address these limitations, a team of researchers from Peking University, UC Berkeley, and Stanford University has developed Buffer of Thoughts (BoT), a novel and flexible framework for thought-augmented reasoning that aims to improve the accuracy, efficiency, and robustness of LLMs across a wide range of tasks. A key component of BoT is the meta-buffer, a lightweight library that stores generalizable, high-level thought-templates distilled from diverse problem-solving processes. For each new problem, a relevant thought-template is retrieved and adaptively instantiated with a task-specific reasoning structure, enabling these templates to be reused across tasks for efficient thought-augmented reasoning.
To keep BoT stable and scalable, a buffer manager dynamically updates the meta-buffer, expanding its capacity as more tasks are completed. This design offers three main benefits:
Enhanced Accuracy: shared thought-templates allow high-level thoughts to be instantiated adaptively for different tasks, eliminating the need to construct reasoning structures from scratch and significantly improving reasoning accuracy.
Streamlined Reasoning: by directly leveraging informative historical reasoning structures, thought-augmented reasoning avoids complex multi-query procedures and makes the reasoning process more efficient.
Improved Robustness: BoT's retrieve-and-instantiate procedure mirrors how humans reuse past solutions, enabling LLMs to solve similar problems consistently and yielding significant gains in accuracy, efficiency, and robustness across a variety of tasks.
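The "streamlined reasoning" benefit comes from collapsing a multi-query search into a single templated call. A minimal sketch of that instantiation step, with wording that is purely illustrative rather than the paper's actual prompt format:

```python
def instantiate(template_steps: list[str], task: str) -> str:
    """Turn a retrieved thought-template into one task-specific prompt,
    so a single LLM call can replace a multi-query tree search.
    (Hypothetical prompt wording, not BoT's exact format.)"""
    plan = "\n".join(f"{i}. {s}" for i, s in enumerate(template_steps, 1))
    return (
        f"Task: {task}\n"
        f"Follow this high-level reasoning plan:\n{plan}\n"
        "Give the final answer after completing every step."
    )

prompt = instantiate(
    ["enumerate candidate operations", "prune invalid branches", "verify the result"],
    "Reach 24 using 4, 6, 8, 2 and +, -, *, /.",
)
```

Because the high-level plan is fixed in advance, the model does not need to expand and compare multiple reasoning paths at inference time.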
The researchers' buffer manager distills thought-templates from the solutions of completed tasks, progressively expanding the meta-buffer's capacity. In comprehensive experiments on ten challenging, reasoning-intensive tasks, BoT outperforms previous state-of-the-art methods by 51% on Checkmate-in-One, 11% on Game of 24, and 20% on Geometric Shapes, while requiring on average only 12% of the cost of multi-query prompting approaches.
The proposed approach substantially improves accuracy while keeping reasoning efficient and robust. However, for problems that demand human-like creativity, BoT may be of limited use, since such problems often lack precise thought-templates. Moreover, if a weaker model is used to initialize the meta-buffer, the distilled thought-templates may be of lower quality, because weaker models have limited reasoning and instruction-following capabilities. Looking ahead, the authors outline two directions: 1. building an open-domain system, such as an agent model, by integrating BoT with external resources; and 2. optimizing the distillation of thought-templates to make them more effective for complex tasks.
Check out the Paper and GitHub. All credit for this research goes to the researchers of this project.