The /llms.txt file – llms-txt
🌈 Abstract
The article discusses a proposal to add a /llms.txt file to websites to provide information for large language models (LLMs) in a concise and structured format. The key points are:
- Websites now serve information not only to humans but also to LLMs, for example to enhance development environments.
- The /llms.txt file is a Markdown file that provides background information and links to more detailed Markdown files, in a format that is both human- and LLM-readable.
- The goal is to identify the most important information to provide to AI helpers in an appropriate form, as converting HTML pages into LLM-friendly plain text is difficult.
- The /llms.txt file follows a specific format with sections for the project name, a summary, optional details, and lists of linked Markdown files.
- This approach is designed to coexist with existing web standards like sitemaps and robots.txt, providing curated information for LLMs.
🙋 Q&A
[01] Proposal and Format
1. What is the purpose of the /llms.txt file? The /llms.txt file is proposed to provide information for large language models (LLMs) in a concise and structured format, making it easier for LLMs to access and understand the key details about a website or project.
2. How is the /llms.txt file structured? The /llms.txt file follows a specific format:
- It starts with an H1 header for the project/site name (required)
- Includes a blockquote with a short summary
- May have additional details in paragraphs or lists
- Contains zero or more sections with H2 headers, each containing a list of Markdown file links with optional descriptions
- There is also an "Optional" section for secondary information that can be skipped if needed
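The structure above can be sketched as a small parser. This is a minimal illustration, not the proposal's reference implementation: the sample file, the `example.com` URLs, and the names `SAMPLE`, `LINK_RE`, and `parse_llms_txt` are all assumptions made for this example.

```python
import re

# Hypothetical /llms.txt content, invented for this sketch.
SAMPLE = """\
# Example Project

> A short summary of what the project does.

Some optional details about the project.

## Docs

- [Quick start](https://example.com/quickstart.md): How to get going
- [API reference](https://example.com/api.md)

## Optional

- [Changelog](https://example.com/changelog.md): Release history
"""

# Matches "- [title](url)" with an optional ": description" suffix.
LINK_RE = re.compile(
    r"^-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)]+)\)(?::\s*(?P<desc>.*))?$"
)


def parse_llms_txt(text):
    """Split an llms.txt document into title, summary, details, and sections."""
    result = {"title": None, "summary": None, "details": [], "sections": {}}
    current = None  # name of the H2 section being read, if any
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("## "):
            current = line[3:].strip()
            result["sections"][current] = []
        elif line.startswith("# ") and result["title"] is None:
            result["title"] = line[2:].strip()
        elif line.startswith(">") and current is None and result["summary"] is None:
            result["summary"] = line[1:].strip()
        elif current is None:
            # Free-form text before the first H2 counts as optional details.
            result["details"].append(line)
        else:
            match = LINK_RE.match(line)
            if match:
                result["sections"][current].append(match.groupdict())
    if result["title"] is None:
        raise ValueError("llms.txt needs an H1 title")
    return result


parsed = parse_llms_txt(SAMPLE)
print(parsed["title"])
print(list(parsed["sections"]))
```

Each link is stored as a dictionary with `title`, `url`, and `desc` keys, where `desc` is `None` when no description follows the link.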
3. How does the /llms.txt file relate to existing web standards? The /llms.txt file is designed to complement existing standards like robots.txt and sitemaps:
- robots.txt tells automated tools what they may access, while /llms.txt provides context for the content they are allowed to read
- Sitemaps list all pages, while /llms.txt offers a curated overview for LLMs
- /llms.txt can also reference structured data markup to help LLMs understand how to interpret the information
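One way to picture how the two files complement each other is a client that consults robots.txt before following any link it found in /llms.txt. The sketch below uses Python's standard `urllib.robotparser`. The robots.txt rules, the URLs, the agent name `example-llm-agent`, and the helper `allowed` are all hypothetical, and the proposal itself does not prescribe this check.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules, invented for this sketch.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(url, agent="example-llm-agent"):
    """Return True if robots.txt permits this agent to fetch the URL."""
    return parser.can_fetch(agent, url)


# Links as they might appear in a site's /llms.txt (assumed values).
links = [
    "https://example.com/docs/guide.md",
    "https://example.com/private/notes.md",
]
fetchable = [url for url in links if allowed(url)]
print(fetchable)
```

Here robots.txt decides which links may be fetched at all, while /llms.txt decides which of those are worth reading.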
[02] Existing Standards
1. How does the /llms.txt file differ from a sitemap (sitemap.xml)? The key differences are:
- Sitemaps list all indexable human-readable pages, while /llms.txt focuses on providing information specifically for LLMs
- Sitemaps often don't list LLM-readable versions of pages, and they don't include links to external sites, even though both may help an LLM understand the content
- The pages a sitemap covers may, in aggregate, be too large to fit in an LLM context window, unlike the curated /llms.txt content
2. What is the purpose of the "Optional" section in the /llms.txt file? The "Optional" section is for secondary information that can be skipped if a shorter context is needed. This keeps the core of the /llms.txt file limited to the most important details, while still pointing to additional resources for LLMs that have room for a more comprehensive context.
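Skipping the "Optional" section can be sketched as a simple text filter. This is an illustrative sketch only: the sample file and the function name `drop_optional` are assumptions, and real tools may handle the section differently.

```python
def drop_optional(text):
    """Return llms.txt text without its "## Optional" section."""
    kept = []
    skipping = False
    for line in text.splitlines():
        if line.startswith("## "):
            # A new H2 starts; skip it only if it is the Optional section.
            skipping = line[3:].strip() == "Optional"
        if not skipping:
            kept.append(line)
    return "\n".join(kept).rstrip() + "\n"


# Hypothetical /llms.txt content, invented for this sketch.
FULL = """\
# Example Project

> Short summary.

## Docs

- [Guide](https://example.com/guide.md): Main guide

## Optional

- [History](https://example.com/history.md): Background reading
"""

short = drop_optional(FULL)
print(short)
```

A caller with a large context budget would use `FULL` as is, and one with a tight budget would use `short`.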
[03] Next Steps
1. What is the current status of the /llms.txt specification? The /llms.txt specification is currently an open proposal, with a GitHub repository hosting the overview and allowing for community input and discussion. There is also a community Discord channel available for sharing implementation experiences and best practices.
2. What recommendations are given for creating effective /llms.txt files? The article provides the following guidelines for creating effective /llms.txt files:
- Use concise, clear language
- Include brief, informative descriptions when linking to resources
- Avoid ambiguous terms or unexplained jargon
- Test the /llms.txt file by running it through a tool that expands it into an LLM context file, then checking how well language models can answer questions about the content
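The expansion step in the last guideline can be approximated as follows. This is a stand-in sketch, not the tool the article refers to: `expand_to_context`, the `fetch` callback, the separator format, and the stub `PAGES` dictionary are all assumptions, and the pages are served from memory rather than over the network.

```python
import re

# Finds Markdown links of the form [title](url).
LINK_RE = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")


def expand_to_context(llms_txt, fetch):
    """Append the content of every linked file to the llms.txt text.

    `fetch` is a caller-supplied function mapping a URL to its Markdown text.
    """
    parts = [llms_txt.strip()]
    for title, url in LINK_RE.findall(llms_txt):
        parts.append(f"--- {title} ({url}) ---\n{fetch(url).strip()}")
    return "\n\n".join(parts) + "\n"


# Stub pages standing in for real HTTP responses (assumed content).
PAGES = {
    "https://example.com/guide.md": "# Guide\n\nInstall with `pip install example`.",
}

llms_txt = (
    "# Example Project\n\n"
    "> Short summary.\n\n"
    "## Docs\n\n"
    "- [Guide](https://example.com/guide.md): Main guide\n"
)

context = expand_to_context(llms_txt, PAGES.__getitem__)
print(context)
```

The resulting `context` string would then be given to a language model together with test questions about the site. That step depends on the model and client in use, so it is left out here.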