Does AI Know What an Apple Is? She Aims to Find Out. | Quanta Magazine
Abstract
The article discusses the work of Ellie Pavlick, a computer scientist studying language models at Brown University and Google DeepMind. It explores her research on finding evidence of understanding within large language models (LLMs) and the philosophical questions around the concept of "meaning" in language.
Q&A
[01] Introduction
1. What does "understanding" or "meaning" mean, empirically? What, specifically, does Pavlick look for? Pavlick believes that meaning involves concepts, and she wants to find evidence of those concepts inside the neural networks of language models. She looks for internal structures or representations that play a causal role in a model's behavior, such as a "retrieve-capital-city" vector that lets the model consistently produce the correct capital city for a given country.
2. What examples has Pavlick found of this internal structure? Pavlick has found a small vector within a language model that appears to encode the connection between a country and its capital city. Adding this vector to a country's representation retrieves the correct capital, suggesting the model has factored the information into a systematic, reusable representation (a toy sketch of this offset-vector idea follows this list).
3. How does grounding relate to these representations? Grounding refers to the idea that meaning is anchored in non-linguistic inputs such as sensory perception and social interaction. Pavlick argues that even abstract concepts like "democracy" could be grounded in internal conceptual representations rather than in direct connections to the physical world. As an example, she points to a language model learning the relationships between colors despite having no direct sensory experience of them (the second sketch below shows one way such a claim could be tested).
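To make the "retrieve-capital-city" idea concrete, here is a minimal Python sketch in the spirit of classic word-vector offset arithmetic. The embeddings below are synthetic and constructed so that the relation holds; they stand in for a real model's representations. The estimate-then-retrieve procedure is the point, not Pavlick's actual method, which probes causal structure inside a trained LLM.

```python
# Toy sketch of a "retrieve-capital-city" direction in embedding space.
# The embeddings are HYPOTHETICAL synthetic vectors (not from a real model):
# each capital's vector is its country's vector plus a shared offset plus
# noise, so the example is self-contained and runnable.
import numpy as np

rng = np.random.default_rng(0)
dim = 16
countries = ["France", "Japan", "Poland", "Kenya"]
capitals = ["Paris", "Tokyo", "Warsaw", "Nairobi"]

country_vecs = {c: rng.normal(size=dim) for c in countries}
true_offset = rng.normal(size=dim)  # the shared "capital-of" direction
capital_vecs = {
    cap: country_vecs[c] + true_offset + 0.05 * rng.normal(size=dim)
    for c, cap in zip(countries, capitals)
}

# Estimate the offset as the average of (capital - country) over three pairs.
train_pairs = list(zip(countries, capitals))[:3]
retrieve_capital = np.mean(
    [capital_vecs[cap] - country_vecs[c] for c, cap in train_pairs], axis=0
)

def nearest_capital(vec):
    # Return the capital whose vector is closest to the query vector.
    return min(capital_vecs, key=lambda cap: np.linalg.norm(capital_vecs[cap] - vec))

# Apply the estimated vector to a held-out country and retrieve its capital.
query = country_vecs["Kenya"] + retrieve_capital
print(nearest_capital(query))  # expected: Nairobi
```

The design choice to average several (capital - country) offsets mirrors how a single reusable direction, rather than a memorized lookup per country, would count as a systematic representation.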
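The color example likewise suggests an empirical test: compare the geometry of a model's text-derived color representations against perceptual color space, in the spirit of representational similarity analysis. In the sketch below, the text embeddings are random placeholders standing in for real model embeddings, coarse RGB coordinates serve as the perceptual reference, and the Spearman correlation between the two sets of pairwise distances is the test statistic.

```python
# Sketch of testing whether text-only representations mirror perceptual
# color structure. "text_vecs" are HYPOTHETICAL placeholder vectors; in a
# real study they would be a language model's embeddings for color words.
from itertools import combinations

import numpy as np
from scipy.stats import spearmanr

colors = ["red", "orange", "yellow", "green", "blue"]
rgb = {  # coarse perceptual coordinates (0-255 RGB)
    "red": (255, 0, 0), "orange": (255, 165, 0), "yellow": (255, 255, 0),
    "green": (0, 128, 0), "blue": (0, 0, 255),
}
rng = np.random.default_rng(1)
text_vecs = {c: rng.normal(size=32) for c in colors}  # placeholder embeddings

def pairwise_distances(space):
    # Euclidean distance for every unordered pair of colors, in a fixed order.
    return [np.linalg.norm(np.subtract(space[a], space[b]))
            for a, b in combinations(colors, 2)]

# If color structure were learnable from text alone, the two distance
# rankings should correlate; random placeholders should give ~zero.
rho, _ = spearmanr(pairwise_distances(rgb), pairwise_distances(text_vecs))
print(f"Spearman correlation, perceptual vs. text-space distances: {rho:.2f}")
```

With real model embeddings in place of the placeholders, a strong positive correlation would support the claim that relational color structure can be learned from text without sensory grounding.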
[02] Making Science Out of Philosophy
1. How can these philosophical-sounding questions about meaning and understanding be scientific? Pavlick argues that language models provide a concrete, empirical platform for exploring these questions, rather than relying on thought experiments. She sees it as an exciting opportunity to deeply understand these new systems, rather than dismissing them as "stochastic parrots."
2. How does Pavlick approach measuring success in this field, given the debate around basic methods and terminology? Pavlick believes the focus should be on providing precise, human-understandable descriptions of the behaviors we care about, rather than getting caught up in terminological debates like "Does it have meaning?" She wants to develop scientifically sound methods for analyzing the internal representations of language models.
3. What kind of research does Pavlick think is needed to make progress on these deep questions about intelligence? Pavlick believes the next 10 years of research needs to focus on the "not very sexy" methodological work of developing rigorous ways to identify and analyze the internal representations of language models. She sees this as essential groundwork for eventually answering the deeper questions about the nature of intelligence.