Assisting in Writing Wikipedia-like Articles From Scratch with Large Language Models
Abstract
The article studies how to apply large language models to write grounded and organized long-form articles from scratch, with comparable breadth and depth to Wikipedia pages. This underexplored problem poses new challenges at the pre-writing stage, including how to research the topic and prepare an outline prior to writing. The authors propose STORM, a writing system for the Synthesis of Topic Outlines through Retrieval and Multi-perspective Question Asking. STORM models the pre-writing stage by (1) discovering diverse perspectives in researching the given topic, (2) simulating conversations where writers carrying different perspectives pose questions to a topic expert grounded on trusted Internet sources, and (3) curating the collected information to create an outline.
Q&A
[01] Discovering Diverse Perspectives
1. How does STORM discover diverse perspectives on the given topic?
- STORM surveys existing Wikipedia articles on similar topics and extracts their tables of contents to identify different viewpoints that can contribute to a comprehensive article.
- STORM adds "basic fact writer focusing on broadly covering the basic facts about the topic" as one of the perspectives to ensure the coverage of fundamental information.
2. How does STORM use the discovered perspectives to guide the question asking process?
- STORM prompts the language model to generate questions from the viewpoint of each identified perspective, which leads to more varied and in-depth questions compared to a generic approach.
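The perspective discovery and perspective-guided question asking described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the prompt wording, the `max_perspectives` cap, and the `llm` callable (any function mapping a prompt string to a response string) are all assumptions made for the example. Only the "basic fact writer" perspective text comes from the summary above.

```python
from typing import Callable, List

# Perspective that is always included, quoted from the summary above.
BASIC_FACT_WRITER = (
    "basic fact writer focusing on broadly covering the basic facts about the topic"
)


def discover_perspectives(
    topic: str,
    related_tocs: List[List[str]],
    llm: Callable[[str], str],
    max_perspectives: int = 5,  # hypothetical cap, not taken from the source
) -> List[str]:
    """Ask the model for perspectives, given tables of contents of related articles."""
    toc_text = "\n\n".join("\n".join(toc) for toc in related_tocs)
    prompt = (
        f"Topic: {topic}\n"
        f"Tables of contents of related articles:\n{toc_text}\n"
        "List distinct perspectives a writer could take, one per line."
    )
    # Parse one perspective per line, dropping list markers and blank lines.
    found = [line.strip("-* \t") for line in llm(prompt).splitlines()]
    perspectives = [BASIC_FACT_WRITER]
    for p in found:
        if p and p not in perspectives:
            perspectives.append(p)
    return perspectives[:max_perspectives]


def question_prompt(topic: str, perspective: str) -> str:
    """Build a prompt that asks for a question from one perspective."""
    return (
        f"You are researching '{topic}' for a Wikipedia-like article.\n"
        f"Your perspective: {perspective}\n"
        "Ask one question that this perspective would care about."
    )
```

Putting the basic fact writer first means it survives the cap even when the model returns many perspectives.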
[02] Simulating Conversations
1. What is the purpose of simulating conversations in STORM?
- Simulated conversations make the research iterative: the theory of questions and question asking highlights that answers to existing questions often give rise to new questions, so each answer can prompt follow-up questions.
- STORM simulates a conversation between a Wikipedia writer and a topic expert, where the writer generates questions based on the topic, the assigned perspective, and the conversation history, and the expert provides answers grounded on trusted Internet sources.
2. How does STORM ensure the trustworthiness of the information provided by the simulated expert?
- STORM first prompts the language model to break the writer's question down into a set of search queries, then applies a rule-based filter to the search results to exclude sources considered untrustworthy under Wikipedia guidelines.
- The language model synthesizes the trustworthy sources to generate the answer, and these sources are also added to the reference set for the final article generation.
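The conversation loop and source filtering described in this section can be sketched as below. It is an illustration under stated assumptions rather than the system's actual code: `ask`, `decompose`, `search`, and `answer` are hypothetical callables standing in for language model and search engine calls, `max_turns` is an arbitrary limit, and the blocklist is a placeholder for whatever rules the real filter applies.

```python
from typing import Callable, Dict, List, Tuple
from urllib.parse import urlparse

# Hypothetical blocklist; the real rule-based filter is not specified here.
UNTRUSTED_DOMAINS = {"example-blog.com", "example-forum.net"}

Turn = Tuple[str, str]    # (question, answer)
Source = Tuple[str, str]  # (url, snippet)


def is_trusted(url: str) -> bool:
    """Rule-based check: reject URLs whose host is on the blocklist."""
    host = urlparse(url).netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host not in UNTRUSTED_DOMAINS


def simulate_conversation(
    topic: str,
    perspective: str,
    ask: Callable[[str, str, List[Turn]], str],
    decompose: Callable[[str], List[str]],
    search: Callable[[str], List[Source]],
    answer: Callable[[str, List[Source]], str],
    max_turns: int = 3,  # assumed limit for the sketch
) -> Tuple[List[Turn], Dict[str, str]]:
    """Alternate writer questions and grounded expert answers."""
    history: List[Turn] = []
    references: Dict[str, str] = {}
    for _ in range(max_turns):
        # The writer sees the topic, its perspective, and the history so far.
        question = ask(topic, perspective, history)
        if not question:
            break  # the writer has nothing more to ask
        sources: List[Source] = []
        for query in decompose(question):
            for url, snippet in search(query):
                if is_trusted(url):
                    sources.append((url, snippet))
                    references[url] = snippet  # kept for the final article
        history.append((question, answer(question, sources)))
    return history, references
```

Passing the history back into `ask` is what lets an earlier answer shape the next question.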
[03] Creating the Article Outline
1. How does STORM leverage the internal knowledge of language models to create the article outline?
- STORM first prompts the language model to generate a draft outline given only the topic, which provides a general but organized framework.
- STORM then refines the outline by prompting the language model with the topic, the draft outline, and the simulated conversations.
2. How does the outline creation stage in STORM differ from the baseline approaches?
- Directly generating an outline from the topic alone (Direct Gen) can miss topic-specific aspects, while retrieval-augmented approaches (RAG, oRAG) may present unorganized information, making it challenging for the language model to construct a coherent outline.
- STORM's multi-stage approach of discovering perspectives and simulating conversations leads to outlines with higher heading soft recall and entity recall compared to the baselines.
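To make "heading soft recall" more concrete, here is a sketch of one way such a metric can be computed, assuming a soft-cardinality formulation in which near-duplicate headings count as less than one item each. The summary above does not define the metric, so both the formulation and the toy token-overlap similarity (standing in for embedding-based similarity) are assumptions of this example.

```python
from typing import Callable, List


def token_jaccard(a: str, b: str) -> float:
    """Toy similarity in [0, 1]: Jaccard overlap of lowercase tokens."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def soft_cardinality(
    items: List[str], sim: Callable[[str, str], float] = token_jaccard
) -> float:
    """Soft count of a list: each item contributes 1 / (sum of its similarities)."""
    # sim(x, x) == 1, so every denominator is at least 1.
    return sum(1.0 / sum(sim(x, y) for y in items) for x in items)


def soft_recall(
    reference: List[str],
    predicted: List[str],
    sim: Callable[[str, str], float] = token_jaccard,
) -> float:
    """Soft recall = c(R and P) / c(R), with c(R and P) = c(R) + c(P) - c(R or P)."""
    c_ref = soft_cardinality(reference, sim)
    if c_ref == 0:
        return 0.0  # empty reference outline
    c_pred = soft_cardinality(predicted, sim)
    c_union = soft_cardinality(reference + predicted, sim)
    return (c_ref + c_pred - c_union) / c_ref
```

With an exact-match similarity this reduces to ordinary recall. A graded similarity gives partial credit to headings that are close to a reference heading but not identical.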
[04] Writing the Full-Length Article
1. How does STORM leverage the outline and references collected during the pre-writing stage to generate the full-length article?
- STORM uses the section titles and headings of the outline to retrieve relevant documents from the reference set based on semantic similarity.
- With the relevant information at hand, the language model is prompted to generate each section with citations.
- STORM also prompts the language model to synthesize a summary of the entire article to form the lead section, aligning with Wikipedia's stylistic norms.
2. What steps does STORM take to improve the coherence and quality of the generated article?
- STORM prompts the language model to delete repeated information across the generated sections to improve coherence.
- To check how well the article is grounded, the authors evaluate its citation quality with a separate language model and find that 84.83% of the sentences are supported by their citations.
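The per-section retrieval step described in this section can be sketched as below. The sketch assumes a bag-of-words cosine similarity as a simple stand-in for semantic similarity over embeddings. The function names, the `k` parameter, and the prompt wording are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def bow(text: str) -> Counter:
    """Bag-of-words vector: lowercase token counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_for_section(heading: str, references: Dict[str, str], k: int = 2) -> List[str]:
    """Return the URLs of the k references most similar to the heading."""
    query = bow(heading)
    ranked = sorted(
        references.items(),
        key=lambda item: cosine(query, bow(item[1])),
        reverse=True,
    )
    return [url for url, _ in ranked[:k]]


def section_prompt(topic: str, heading: str, references: Dict[str, str], k: int = 2) -> str:
    """Build a prompt that asks for one section with numbered citations."""
    urls = retrieve_for_section(heading, references, k)
    numbered = "\n".join(f"[{i}] {references[u]}" for i, u in enumerate(urls, 1))
    return (
        f"Write the '{heading}' section of an article about {topic}.\n"
        f"Use only these sources and cite them as [n]:\n{numbered}"
    )
```

Retrieving per section keeps each prompt focused on the references relevant to that heading instead of passing in the whole reference set.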