with Episodic Memory for LLM Agents
Abstract
The article introduces AriGraph, a novel memory architecture for LLM agents that integrates semantic and episodic memories in the form of a knowledge graph. This allows the agent to construct a structured world model by learning from interactions with the environment. The proposed Ariadne agent, equipped with AriGraph, demonstrates superior performance compared to alternative memory approaches like full history, summarization, and retrieval-augmented generation (RAG) in various text-based game tasks involving navigation, object manipulation, and cooking.
Q&A
[01] AriGraph World Model
1. What is the structure of the AriGraph world model? The AriGraph world model G = (Vs, Es, Ve, Ee) consists of:
- Vs: a set of semantic vertices representing objects extracted from observations
- Es: a set of semantic edges representing relationships between objects
- Ve: a set of episodic vertices, each containing an observation received from the environment
- Ee: a set of episodic edges connecting the semantic edges extracted from each observation to the corresponding episodic vertex
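The four sets above map naturally onto a small container type. The following is a minimal sketch, not the authors' implementation: the class name `AriGraphSketch`, its field names, and the example triplet are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical representation: a triplet is (object1, relation, object2).
Triplet = tuple[str, str, str]


@dataclass
class AriGraphSketch:
    # Vs: objects extracted from observations
    semantic_vertices: set[str] = field(default_factory=set)
    # Es: relationships between objects, stored as triplets
    semantic_edges: set[Triplet] = field(default_factory=set)
    # Ve: one entry per observation received from the environment
    episodic_vertices: list[str] = field(default_factory=list)
    # Ee: episodic_edges[i] links episodic vertex i to the triplets
    # extracted from that observation
    episodic_edges: list[set[Triplet]] = field(default_factory=list)


# Illustrative data only; the observation text is made up.
graph = AriGraphSketch()
graph.semantic_vertices.update({"key", "drawer"})
graph.semantic_edges.add(("key", "is in", "drawer"))
graph.episodic_vertices.append("You open the drawer and see a key.")
graph.episodic_edges.append({("key", "is in", "drawer")})
```

Keeping the episodic edges as a list parallel to the episodic vertices is one simple way to express that each observation is tied to all of the triplets it produced.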
2. How does the agent learn and update the AriGraph world model?
- The agent extracts new semantic triplets (object1, relation, object2) from each observation and adds them to the semantic memory.
- The agent detects and removes outdated semantic edges by comparing them with the new triplets.
- The agent adds a new episodic vertex containing the current observation and connects it to all the semantic edges extracted from that observation.
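The three update steps can be sketched as a single function. This is a hedged illustration under stated assumptions: triplet extraction is taken as already done, the graph is a plain dictionary with hypothetical keys, and `is_outdated` is a stand-in heuristic (same subject and relation, different object), not the comparison the agent actually performs.

```python
Triplet = tuple[str, str, str]  # (object1, relation, object2)


def is_outdated(old: Triplet, new: Triplet) -> bool:
    # Stand-in heuristic (assumption): a new triplet with the same subject
    # and relation but a different object supersedes the old one.
    return old[0] == new[0] and old[1] == new[1] and old[2] != new[2]


def update_graph(graph: dict, observation: str, new_triplets: list[Triplet]) -> None:
    # Step 2: drop semantic edges contradicted by the new triplets.
    graph["semantic_edges"] = {
        old
        for old in graph["semantic_edges"]
        if not any(is_outdated(old, new) for new in new_triplets)
    }
    # Step 1: add the new triplets and their objects to semantic memory.
    graph["semantic_edges"].update(new_triplets)
    for subject, _, obj in new_triplets:
        graph["semantic_vertices"].update((subject, obj))
    # Step 3: store the observation and link it to its triplets.
    graph["episodic_vertices"].append(observation)
    graph["episodic_edges"].append(set(new_triplets))


# Hypothetical usage with made-up observations.
memory = {
    "semantic_vertices": set(),
    "semantic_edges": set(),
    "episodic_vertices": [],
    "episodic_edges": [],
}
update_graph(memory, "You see a key in the drawer.", [("key", "is in", "drawer")])
update_graph(memory, "You take the key.", [("key", "is in", "inventory")])
```

After the second call the semantic memory holds only the current location of the key, while the episodic memory still records both observations.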
3. How does the agent retrieve relevant information from the AriGraph?
- Semantic search: The agent uses a pre-trained Contriever model to find the most relevant semantic triplets based on the current query.
- Episodic search: The agent scores each episodic vertex by the number of retrieved semantic triplets connected to it and returns the k most relevant vertices.
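The two-stage retrieval can be sketched as follows. Two loud assumptions apply: `semantic_search` uses simple word overlap as a stand-in for Contriever embedding similarity, and `episodic_search` uses a plain count of matching triplets as the relevance score, which may differ from the exact scoring the agent uses. All function and parameter names are hypothetical.

```python
Triplet = tuple[str, str, str]  # (object1, relation, object2)


def semantic_search(query: str, triplets: list[Triplet], top_n: int = 3) -> list[Triplet]:
    # Stand-in scorer (assumption): word overlap between query and triplet,
    # used here in place of a dense retriever such as Contriever.
    query_words = set(query.lower().split())

    def score(triplet: Triplet) -> int:
        return len(query_words & set(" ".join(triplet).lower().split()))

    ranked = sorted(triplets, key=score, reverse=True)
    return [t for t in ranked[:top_n] if score(t) > 0]


def episodic_search(
    retrieved: list[Triplet],
    episodes: list[tuple[str, set[Triplet]]],
    k: int = 2,
) -> list[str]:
    # Each episode is (observation, triplets linked to it by an episodic edge).
    hits = set(retrieved)
    scored = [(len(hits & linked), obs) for obs, linked in episodes]
    scored = [(count, obs) for count, obs in scored if count > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [obs for _, obs in scored[:k]]


# Hypothetical usage with made-up memory contents.
t_key = ("key", "is in", "drawer")
t_drawer = ("drawer", "is in", "kitchen")
t_apple = ("apple", "is on", "table")
episodes = [
    ("You open the drawer and see a key.", {t_key, t_drawer}),
    ("An apple is on the table.", {t_apple}),
]
relevant = semantic_search("where is the key", [t_key, t_drawer, t_apple], top_n=1)
recalled = episodic_search(relevant, episodes, k=1)
```

The design point is that episodic recall is driven by the semantic hits: an observation is recalled because it is linked to triplets that matched the query, not because its raw text was searched.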
[02] Ariadne Cognitive Architecture
1. What are the key components of the Ariadne agent architecture? The Ariadne agent architecture consists of:
- Working memory: Stores the current observation, the recent history of observations and actions, and the relevant semantic and episodic memories retrieved from the AriGraph.
- Planning module: Uses the content of the working memory to generate or update a plan as a series of sub-goals.
- Decision-making module: Selects the most suitable action aligned with the current plan's objectives, adhering to the ReAct framework.
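One step of this architecture can be sketched as a function that runs planning first and decision-making second. This is an illustration only: `WorkingMemory`, `agent_step`, and the two stub functions are hypothetical names, and the stubs stand in for the LLM calls that would perform planning and ReAct-style action selection.

```python
from dataclasses import dataclass, field
from typing import Callable

Plan = list[str]  # a plan is an ordered list of sub-goals


@dataclass
class WorkingMemory:
    observation: str
    # Recent (observation, action) pairs
    history: list[tuple[str, str]] = field(default_factory=list)
    # Memories retrieved from the graph for the current step
    semantic: list[tuple[str, str, str]] = field(default_factory=list)
    episodic: list[str] = field(default_factory=list)


def agent_step(
    memory: WorkingMemory,
    plan: Plan,
    planner: Callable[[WorkingMemory, Plan], Plan],
    decider: Callable[[WorkingMemory, Plan], str],
) -> tuple[Plan, str]:
    # Planning: generate or update the sub-goals from working memory.
    new_plan = planner(memory, plan)
    # Decision-making: pick an action aligned with the current plan.
    action = decider(memory, new_plan)
    memory.history.append((memory.observation, action))
    return new_plan, action


# Stubs standing in for LLM calls (assumptions, not the real modules).
def stub_planner(memory: WorkingMemory, plan: Plan) -> Plan:
    return plan or ["find the key"]


def stub_decider(memory: WorkingMemory, plan: Plan) -> str:
    return "open drawer" if plan else "look"


wm = WorkingMemory(observation="You are in a kitchen.")
plan, action = agent_step(wm, [], stub_planner, stub_decider)
```

Passing the planner and decider in as callables keeps the two modules separate, so each can be prompted and tested on its own.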
2. How does the Ariadne agent utilize its graph-based memory for exploration?
- The agent extracts information about exits and spatial connections between locations from the semantic graph.
- It uses this information to identify unexplored exits and plan optimal routes to target locations, extending its action space with "go to location" commands.
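The exploration logic can be sketched with a breadth-first search over spatial triplets. The triplet conventions here are assumptions made for illustration: `(a, direction, b)` means that moving in `direction` from `a` leads to `b`, and `(a, "has exit", direction)` records an exit whose destination may still be unknown. Breadth-first search is one simple way to get a shortest route in an unweighted map; the source does not say which algorithm the agent uses.

```python
from collections import deque

Triplet = tuple[str, str, str]
DIRECTIONS = {"north", "south", "east", "west"}


def build_map(triplets: list[Triplet]):
    connections: dict[str, dict[str, str]] = {}  # location -> {direction: destination}
    exits: dict[str, set[str]] = {}              # location -> known exit directions
    for subject, relation, obj in triplets:
        if relation in DIRECTIONS:
            connections.setdefault(subject, {})[relation] = obj
            exits.setdefault(subject, set()).add(relation)
        elif relation == "has exit":
            exits.setdefault(subject, set()).add(obj)
    return connections, exits


def unexplored_exits(connections, exits) -> list[tuple[str, str]]:
    # An exit is unexplored if no destination is known for it yet.
    return sorted(
        (loc, d)
        for loc, dirs in exits.items()
        for d in dirs
        if d not in connections.get(loc, {})
    )


def route(connections, start: str, goal: str):
    # Breadth-first search: returns the shortest list of moves, or None.
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        location, path = queue.popleft()
        if location == goal:
            return path
        for direction, nxt in connections.get(location, {}).items():
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [direction]))
    return None


# Hypothetical map extracted from the semantic graph.
spatial = [
    ("kitchen", "east", "hall"),
    ("hall", "west", "kitchen"),
    ("hall", "north", "bedroom"),
    ("bedroom", "south", "hall"),
    ("kitchen", "has exit", "north"),
]
connections, exits = build_map(spatial)
frontier = unexplored_exits(connections, exits)
moves = route(connections, "kitchen", "bedroom")
```

A "go to location" command can then be expanded into the list of moves returned by `route`, which is how a single high-level action can replace several low-level navigation steps.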
3. How does the Ariadne agent's architecture differ from alternative memory approaches for LLM agents?
- The separation of planning and decision-making allows the LLM to focus on distinct cognitive processes.
- The structured knowledge representation in the form of a graph, combined with episodic memory, enables more efficient retrieval and reasoning compared to unstructured memory approaches like full history or summarization.
- The graph-based exploration capabilities help the agent navigate complex environments more effectively than LLMs relying solely on textual information.
[03] Experimental Results
1. What are the key tasks used to evaluate the Ariadne agent? The agent was evaluated on three types of text-based game tasks:
- Treasure Hunt: Unlock a golden locker by finding a chain of keys and clues.
- Cleaning: Tidy up a house by identifying and returning misplaced items to their proper locations.
- Cooking: Prepare a meal by gathering the required ingredients and following the correct preparation steps.
2. How did the Ariadne agent perform compared to the baseline approaches?
- In all three tasks, the Ariadne agent with the AriGraph world model significantly outperformed the baseline agents using full history, summarization, and RAG-based memory.
- The Ariadne agent was able to successfully complete the Treasure Hunt and Cooking tasks, while the baseline agents struggled or failed to do so.
- In the Cleaning task, the Ariadne agent performed better than the baselines, though episodic memory sometimes led to confusion due to the agent's recollection of the original object locations.
3. How did the Ariadne agent compare to human performance on these tasks?
- On the Treasure Hunt and Cooking tasks, the Ariadne agent's performance was comparable to or better than the top human players.
- In the Cleaning task, the human players outperformed the Ariadne agent, likely due to the agent's confusion caused by episodic memory.
- Overall, the Ariadne agent with the AriGraph world model demonstrated superior performance compared to the baseline LLM agents and was competitive with human players on the more complex tasks.