Zilong Wang, a Student Researcher, and Chen-Yu Lee, a Research Scientist on the Cloud AI Team, discuss the challenges of reasoning over tabular data in natural language processing (NLP). They highlight the difficulty language models face in comprehending and using structured tabular data, since these models are trained primarily on plain text. To address this, they introduce the “Chain-of-Table” framework, which improves table understanding by prompting large language models (LLMs) to outline their reasoning step by step, transforming complex tables into simpler intermediate tables suited to focused analysis. This approach has led to significant improvements and state-of-the-art results on several benchmarks.
The Chain-of-Table framework involves dynamically planning operations and updating tables iteratively to reflect the reasoning chain over tabular data. By generating intermediate tables at each step, LLMs can more accurately predict answers to questions based on tabular data. The framework consists of three main stages: planning the next operation, generating arguments for the operation, and executing the operation to create new intermediate tables. This iterative process enhances interpretability and understanding while guiding the LLM to more reliable answers.
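To make the loop concrete, here is a minimal Python sketch of the three-stage iteration. It assumes a pandas DataFrame as the table representation, an `llm` callable that maps a prompt string to a completion, and simplified stand-in implementations of a few atomic operations; the prompt formats, helper names, and operation implementations are illustrative assumptions, not the authors’ actual prompts or code.

```python
import ast
import pandas as pd

# Pool of atomic table operations. The names mirror the kind of operation pool
# described in the post; the implementations here are simplified stand-ins.
def f_select_row(table, row_indices):
    return table.iloc[row_indices]

def f_select_column(table, column_names):
    return table[column_names]

def f_sort_by(table, column, ascending=True):
    return table.sort_values(column, ascending=ascending)

OPERATIONS = {
    "f_select_row": f_select_row,
    "f_select_column": f_select_column,
    "f_sort_by": f_sort_by,
}

def chain_of_table(table: pd.DataFrame, question: str, llm, max_steps: int = 5) -> str:
    """Iteratively plan, parameterize, and execute operations, then answer.

    `llm` is any callable mapping a prompt string to a completion string.
    """
    chain = []  # reasoning chain: (operation, arguments) applied so far
    for _ in range(max_steps):
        # Stage 1: dynamically plan the next operation given the current table.
        op_name = llm(
            f"Table:\n{table.to_string()}\nQuestion: {question}\n"
            f"Chain so far: {chain}\nNext operation (or [END]):"
        ).strip()
        if op_name == "[END]" or op_name not in OPERATIONS:
            break
        # Stage 2: generate arguments for the chosen operation.
        args = ast.literal_eval(llm(
            f"Table:\n{table.to_string()}\nQuestion: {question}\n"
            f"Arguments for {op_name} as a Python list:"
        ))
        # Stage 3: execute the operation to create the next intermediate table.
        table = OPERATIONS[op_name](table, *args)
        chain.append((op_name, args))
    # Final query: answer the question from the last intermediate table.
    return llm(f"Table:\n{table.to_string()}\nQuestion: {question}\nAnswer:")
```

The growing chain of (operation, arguments) pairs, together with the intermediate tables it produces, is what makes the final answer interpretable: each step can be inspected and checked against the original table.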
Experimental results on public benchmarks such as WikiTQ and TabFact show that Chain-of-Table outperforms both generic and program-aided reasoning methods, with better accuracy and robustness on challenging questions and larger tables. The atomic operations contribute significantly to this performance, and the framework’s accuracy degrades gracefully as question complexity and table size grow. Overall, Chain-of-Table shows promise for improving reasoning over tabular data in NLP tasks.