Large Language Models are Interpretable Learners
Abstract
The paper explores the trade-off between expressiveness and interpretability in building human-centric predictive models. It proposes a novel framework called LLM-Symbolic Programs (LSPs) that leverages Large Language Models (LLMs) to bridge this gap. LSPs combine the power of LLMs with a minimal Domain-Specific Language (DSL) to construct interpretable decision rules. The paper also introduces the Interpretable-Learning Benchmark (IL-Bench) to evaluate the effectiveness of LSPs in extracting interpretable and accurate knowledge from diverse datasets across vision and text modalities. The results demonstrate that LSPs outperform traditional neurosymbolic programs and vanilla prompt optimization methods in terms of accuracy, interpretability, and generalization.
Q&A
[01] LLM-Symbolic Programs
1. What is the key insight behind using LLMs to implement interpretable programs? The key insight is that LLMs encompass a variety of powerful conditional probabilistic sub-models, each defined by its prompt. Crafting prompts for an LLM is therefore equivalent to searching over the hypothesis space spanned by these sub-models, yielding an infinite set of neural network-based operations that are inherently interpretable and can serve as fundamental building blocks within Neurosymbolic Programs.
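To make this view concrete, here is a minimal Python sketch (not the authors' code): each prompt selects a different sub-model from the same base LLM, so searching over prompts is searching over hypotheses. The `call_llm` wrapper and the example prompts are hypothetical stand-ins for whatever LLM backend is used.

```python
# Sketch only: an LLM "module" is the base model conditioned on a prompt,
# so every prompt picks out a different interpretable sub-model (hypothesis).
from typing import Callable

def make_llm_module(prompt: str, call_llm: Callable[[str], str]) -> Callable[[str], str]:
    """Return a predictor f(x) that answers by conditioning the LLM on `prompt`."""
    def module(x: str) -> str:
        return call_llm(f"{prompt}\n\nInput: {x}\nAnswer:")
    return module

# Two prompts define two different decision rules over the same inputs;
# learning amounts to searching this prompt-indexed hypothesis space.
# rule_a = make_llm_module("Answer 'bird A' if the beak is curved, else 'bird B'.", call_llm)
# rule_b = make_llm_module("Answer 'bird A' if the wings show white bars, else 'bird B'.", call_llm)
```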
2. How does the Domain-Specific Language (DSL) of LSPs differ from traditional NSPs? Whereas traditional NSPs require manually designing a comprehensive DSL, LSPs exploit the ability of LLMs to represent a wide range of functions through different prompts. As a result, LSPs need only a minimal DSL with three components: the input, conditional branching, and the LLM module.
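For illustration, one possible way to encode this three-part DSL in Python is sketched below; the class and function names are ours, not the paper's, and `call_llm(prompt, x)` again stands in for an assumed LLM backend.

```python
# Illustrative encoding of the minimal DSL: an input node, an LLM module
# (a prompted sub-model), and a switch for conditional branching.
from dataclasses import dataclass
from typing import Dict, Union

@dataclass
class Input:                 # the raw example x (e.g., text or an image caption)
    pass

@dataclass
class LLMModule:             # LLM(prompt, x): a sub-model defined by its prompt
    prompt: str

@dataclass
class Switch:                # branch on the interpretable output of a router module
    router: LLMModule
    branches: Dict[str, "Node"]

Node = Union[Input, LLMModule, Switch]

def evaluate(node: Node, x: str, call_llm) -> str:
    """Execute a program on input x, using call_llm(prompt, x) -> str."""
    if isinstance(node, Input):
        return x
    if isinstance(node, LLMModule):
        return call_llm(node.prompt, x)
    decision = call_llm(node.router.prompt, x)          # Switch node
    child = node.branches.get(decision)
    return evaluate(child, x, call_llm) if child is not None else decision
```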
3. How does the learning algorithm for LSPs work? The learning algorithm follows a divide-and-conquer strategy. It starts with an empty program and the entire training set. At each step, it extends the program with a switch operator paired with an LLM module; the LLM module is trained to fit the data subset routed to that node, and the procedure recurses on the child nodes until the full program is constructed.
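A rough paraphrase of that procedure, reusing the LLMModule and Switch classes from the sketch above: `propose_prompt(subset)` is an assumed helper that asks the LLM to summarize a predictive rule for a data subset, and the stopping criteria here are illustrative choices, not the paper's.

```python
# Divide-and-conquer learner (sketch): grow the program top-down, fitting one
# LLM module per node on the data routed to it, then recursing on the children.
def learn_lsp(data, call_llm, propose_prompt, max_depth=3, min_samples=5):
    """data: list of (x, label) pairs. Returns a program built from the DSL above."""
    labels = {y for _, y in data}
    if max_depth == 0 or len(data) < min_samples or len(labels) <= 1:
        return LLMModule(prompt=propose_prompt(data))   # leaf: a single prompted rule

    # Step 1: fit an LLM module on the current subset and wrap it in a switch.
    router = LLMModule(prompt=propose_prompt(data))

    # Step 2: partition the data by the router's interpretable decisions.
    buckets = {}
    for x, y in data:
        buckets.setdefault(call_llm(router.prompt, x), []).append((x, y))

    # Step 3: recurse on each child node until the full program is constructed.
    branches = {d: learn_lsp(subset, call_llm, propose_prompt, max_depth - 1, min_samples)
                for d, subset in buckets.items()}
    return Switch(router=router, branches=branches)
```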
[02] Interpretable-Learning Benchmark (IL-Bench)
1. Why was the IL-Bench benchmark introduced? The goal of interpretable learning is for the model to acquire knowledge that is transferable to humans. IL-Bench was introduced to evaluate interpretable learning methods on classification tasks that are not zero-shot solvable by LLMs, requiring the model to learn additional knowledge beyond what is covered in pretraining.
2. What types of tasks are included in IL-Bench? IL-Bench includes both synthetic tasks with known predictive rules and real-world tasks such as fine-grained visual classification (FGVC) and novel visual concepts from the Palworld game. These tasks are designed to be challenging for LLMs, requiring the model to learn intricate details and patterns from the data.
3. How does the IL-Bench benchmark help evaluate interpretable learning methods? IL-Bench provides a common testbed for assessing the accuracy, interpretability, and generalization of interpretable learning methods such as LSPs. Its mix of synthetic and real-world tasks allows a comprehensive evaluation of a model's ability to extract and represent interpretable knowledge from data.