Insights into LLM Long-Context Failures: When Transformers Know but Don’t Tell

🙋 Q&A

[01] Introduction

1. What is the main focus of this study?

  • The study explores the long-context reasoning capabilities of Large Language Models (LLMs) by probing their hidden representations.
  • It aims to investigate how LLMs handle long-context integration and whether they can effectively utilize information from the middle or end of long contexts.

2. What are the key findings of the study?

  • The study reveals a "know but don't tell" phenomenon, where LLMs can accurately identify the position of crucial information within the context, but often fail to leverage this knowledge effectively in generating accurate responses.
  • The results indicate a disconnect between LLMs' ability to encode positional information and their ability to utilize that information in their outputs.

[02] Experimental Setup

1. What are the datasets and tasks used in the study?

  • The study uses two tasks from Liu et al. (2023b):
    • Key-Value pairs retrieval (kv-pairs): Identify the value associated with a given key in a context containing 100 key-value pairs (a minimal prompt-construction sketch follows this list).
    • Multi-document question answering (MDQA): Given a question, identify the relevant document and produce an answer, with the context containing 30 documents.
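
For concreteness, the sketch below shows how a kv-pairs prompt of this kind might be assembled. The function name `build_kv_prompt`, the UUID-style keys, and the instruction wording are illustrative assumptions, not the prompt format released with the paper.

```python
import json
import uuid

def build_kv_prompt(num_pairs: int = 100, gold_index: int = 50):
    """Assemble a toy kv-pairs retrieval prompt with the gold pair at a chosen index.

    Illustrative sketch only; the exact format in Liu et al. (2023b) may differ.
    """
    pairs = {str(uuid.uuid4()): str(uuid.uuid4()) for _ in range(num_pairs)}
    gold_key = list(pairs)[gold_index]         # key whose value the model must retrieve
    context = json.dumps(pairs, indent=1)      # the long context: all key-value pairs
    prompt = (
        "Extract the value corresponding to the specified key from the JSON object below.\n\n"
        f"{context}\n\n"
        f'Key: "{gold_key}"\nCorresponding value:'
    )
    return prompt, pairs[gold_key]

# Example: place the gold pair roughly in the middle of the 100-pair context.
prompt, gold_value = build_kv_prompt(gold_index=49)
```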

2. How do the authors probe the LLMs' hidden representations?

  • The authors train a separate linear classifier for each layer of the LLM, using the layer's last-token embedding as input and the index of the gold kv-pair/document within the context as the target label.
  • These probing classifiers measure how accurately the LLM's hidden representations encode the position of the target information (a rough probing sketch follows this list).
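
As a rough illustration of this probing setup, the sketch below fits one logistic-regression probe per layer on cached last-token embeddings. The variable names (`hidden_states`, `gold_positions`) and the use of scikit-learn are assumptions made for the example, not the authors' implementation.

```python
from sklearn.linear_model import LogisticRegression

def train_layer_probes(hidden_states, gold_positions, train_frac=0.8):
    """Fit one linear probe per layer and report its held-out accuracy.

    hidden_states: dict mapping layer index -> array of shape (n_examples, hidden_dim)
                   holding the last-token embedding of each prompt at that layer.
    gold_positions: length-n_examples array with the index of the gold
                    kv-pair/document in each prompt's context.
    """
    n_train = int(train_frac * len(gold_positions))
    accuracies = {}
    for layer, feats in hidden_states.items():
        probe = LogisticRegression(max_iter=1000)
        probe.fit(feats[:n_train], gold_positions[:n_train])    # train on one split
        accuracies[layer] = probe.score(feats[n_train:], gold_positions[n_train:])  # held-out accuracy
    return accuracies
```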

[03] Experiment: Maximum Probing Accuracy

1. What is the key finding from the maximum probing accuracy experiment?

  • The results show that the LLM's hidden representations can accurately identify the location of the target information, even in cases where the LLM fails to generate the correct answer.
  • This suggests a disconnect between the model's ability to locate the information and its ability to effectively utilize that information in its responses (a brief sketch of this comparison follows).
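
Assuming per-layer probe accuracies and per-example generation correctness have already been collected (both variable names are hypothetical), the comparison behind this finding can be expressed as a minimal sketch:

```python
def max_probe_vs_generation(layer_accuracies, generation_correct):
    """Compare the best probe accuracy across layers with generation accuracy.

    layer_accuracies: dict mapping layer index -> probe accuracy on held-out prompts.
    generation_correct: sequence of booleans marking whether the model's generated
                        answer was correct on each prompt.
    """
    max_probe_acc = max(layer_accuracies.values())
    generation_acc = sum(generation_correct) / len(generation_correct)
    return max_probe_acc, generation_acc   # a large gap illustrates "know but don't tell"
```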

[04] Experiment: Probing Across Layers

1. What insights do the authors gain from the probing across layers experiment?

  • The results reveal that LLMs locate target information gradually across their layers, with information positioned in the middle of the context requiring more layers to be accurately identified (one way to measure this is sketched after the list).
  • For the MDQA task, the probing accuracy patterns vary significantly depending on the position of the target information within the input context.
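
One way to operationalize "how many layers are needed", assuming per-layer probe predictions are available for each example (names hypothetical), is to record the earliest layer at which the probe is correct:

```python
def first_correct_layer(per_layer_preds, gold_positions):
    """Return, per example, the earliest layer whose probe predicts the gold position.

    per_layer_preds: dict mapping layer index -> list of predicted positions, one per example.
    gold_positions: list of gold positions, one per example.
    Returns None for examples that no layer gets right.
    """
    layers = sorted(per_layer_preds)
    return [
        next((layer for layer in layers if per_layer_preds[layer][i] == gold), None)
        for i, gold in enumerate(gold_positions)
    ]
```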

[05] Experiment: Relationship Between Locating and Generating

1. What is the key finding from the experiment on the relationship between locating and generating target information?

  • The authors find a statistically significant negative correlation between the layer at which the LLM identifies the target information and its final output accuracy.
  • This suggests that the earlier the model locates the target information within its layers, the more likely it is to generate an accurate final answer (a sketch of such a correlation computation follows).
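
The sketch below shows one way such a correlation could be computed from the quantities above; the paper's exact statistical test may differ, and `pointbiserialr` plus the variable names are assumptions for this example.

```python
from scipy.stats import pointbiserialr

def locate_vs_generate_correlation(first_layers, generation_correct):
    """Correlate the layer at which the probe first locates the gold position
    with whether the final generated answer was correct.

    A negative coefficient matches the finding above: earlier localization,
    more accurate generation.
    """
    pairs = [(layer, int(ok)) for layer, ok in zip(first_layers, generation_correct)
             if layer is not None]                     # drop examples no probe got right
    layers, correct = map(list, zip(*pairs))
    return pointbiserialr(correct, layers)             # (correlation coefficient, p-value)
```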

[06] Conclusion

1. What is the main conclusion of the study?

  • The study demonstrates that LLMs can capture the location of crucial information in their hidden representations, but this knowledge does not always translate into accurate responses, revealing a "know but don't tell" phenomenon.
  • The findings highlight the importance of understanding the disconnect between LLMs' ability to encode positional information and their ability to utilize it, which could inform future improvements to LLMs' long-context processing capabilities.
