LLMS.txt
Citation status
Last checked 2026-06-22
What is LLMS.txt?
A community proposal by Jeremy Howard of Answer.AI, introduced in September 2024 [1]. The file lives at the site root (for example, https://example.com/llms.txt) and contains a site title, a one-paragraph description, and a Markdown list of canonical resources organized by topic. The intent is to give LLMs a curated, low-friction entry point to a site, rather than expecting them to crawl an entire sitemap and infer relevance.
Minimum viable llms.txt
# GEO Glossary
> The 2026 reference dictionary for Generative Engine Optimization (GEO / AEO / AIO / LLMO).
> Every term ships with DefinedTerm JSON-LD and a per-engine citation-status tracker.
## Core terms
- [Generative Engine Optimization](https://aisearchglossary.com/terms/generative-engine-optimization): definition, history, status in 2026
- [DefinedTerm schema](https://aisearchglossary.com/terms/defined-term-schema): JSON-LD pattern for definition-style content
- [Attribution rate](https://aisearchglossary.com/terms/attribution-rate): per-engine citation KPI
## Reference
- [About the project](https://aisearchglossary.com/about)
- [Citation status methodology](https://aisearchglossary.com/about#methodology)
Serve it at /llms.txt with Content-Type: text/markdown; charset=utf-8.
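As a minimal sketch of the serving step, the handler below uses only Python's standard library to return the file with the Content-Type named above. The names `LLMS_PATH`, `build_response`, `LlmsTxtHandler`, and `serve` are hypothetical, and the sketch assumes the file sits next to the script; a real deployment would normally set the header in the host or CDN configuration instead.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from pathlib import Path

# Assumption: llms.txt sits in the working directory of the server process.
LLMS_PATH = Path("llms.txt")


def build_response(path: str, body_file: Path = LLMS_PATH) -> tuple[int, dict[str, str], bytes]:
    """Return (status, headers, body) for a GET request path."""
    if path != "/llms.txt" or not body_file.is_file():
        body = b"Not found\n"
        return 404, {
            "Content-Type": "text/plain; charset=utf-8",
            "Content-Length": str(len(body)),
        }, body
    body = body_file.read_bytes()
    return 200, {
        # The Content-Type recommended in the text above.
        "Content-Type": "text/markdown; charset=utf-8",
        "Content-Length": str(len(body)),
    }, body


class LlmsTxtHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        # Ignore any query string when matching the path.
        status, headers, body = build_response(self.path.split("?", 1)[0])
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Start a local server; intended for checking headers, not production."""
    HTTPServer(("127.0.0.1", port), LlmsTxtHandler).serve_forever()
```

Calling `serve()` and then requesting `/llms.txt` with any HTTP client that prints response headers is one way to confirm the Content-Type locally.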
Status in 2026
Community-proposal stage, not on a standards track. The IETF has launched an "AI Preferences Working Group" for related standards work (AI preference signaling, training-data opt-out), but llms.txt itself is not part of that effort.
No major AI engine has officially announced crawler support, and Google has been explicit that it does not use llms.txt: John Mueller (Google Search Advocate) publicly stated in 2025 that "no AI system currently uses llms.txt," comparing it to the deprecated meta keywords tag, and Gary Illyes (Google) confirmed at Search Central Live in July 2025 that Google does not support llms.txt and is not planning to.
Anthropic publishes its own Claude developer docs in llms.txt format at platform.claude.com/llms.txt (legacy docs.claude.com and docs.anthropic.com both 301-redirect there). Practitioners report seeing AI engines occasionally fetch llms.txt during ChatGPT and Claude agentic browsing sessions, but these are community observations, and direct evidence of llms.txt-driven citation behavior remains anecdotal.
Adoption snapshot (2026): broadest in technical documentation and reference resources. Sites publishing llms.txt include Anthropic, Cloudflare, Stripe, Zapier, Cursor, Pinecone, and the large set of Mintlify-hosted developer docs sites. Mintlify added auto-generation of llms.txt on November 20, 2024, which expanded adoption across its hosted-docs customer base overnight. Most of the "thousands of dev tools' docs" reported by industry sources as having llms.txt are Mintlify-hosted and got the file via that platform rollout rather than as a deliberate per-site decision. The format is more common on developer-facing documentation than on consumer or marketing sites.
How llms.txt relates to robots.txt and sitemap.xml
The three files are commonly grouped but have very different purposes and maturity levels:
| File | Primary purpose | Maturity |
|---|---|---|
| robots.txt | Controls which crawlers may fetch which URLs (REP, RFC 9309) | Standardized; widely supported |
| sitemap.xml | Helps search engines discover URLs (sitemaps.org protocol) | Standardized; widely supported |
| llms.txt | Provides a curated, human-edited site overview for LLM/agent consumption | Community proposal; no major engine has confirmed support |
The maturity column matters: shipping llms.txt is a low-cost bet with no confirmed upside yet, while shipping robots.txt and sitemap.xml is a well-understood baseline expectation.
How to apply
Shipping llms.txt is a cheap bet on an upside that is not yet vendor-confirmed. No engine has officially endorsed it as a fetch target, and Google has publicly said it does not use it. Practitioners report seeing ChatGPT and Claude agentic browsers fetch it during some sessions, but these are community observations, not vendor commitments. Three steps to ship it anyway:
- Follow Jeremy Howard's spec at llmstxt.org: an H1 with the site/project name, a blockquote summary, optional detail prose, then optional H2-delimited sections listing canonical URLs as Markdown links. Most sites need 30–80 lines total. Keep it under one screen of plain text.
- Optionally ship llms-full.txt for documentation-heavy sites: this is the larger sibling that includes the full content of your key pages, not just links. llms-full.txt was originally developed by Mintlify in collaboration with Anthropic and later folded into the official llms.txt proposal; Alex Albert (Head of Claude Relations at Anthropic) announced Anthropic's adoption on X in November 2024 alongside Mintlify's platform-wide rollout. Anthropic's llms-full.txt at platform.claude.com/llms-full.txt (legacy docs.claude.com and docs.anthropic.com both 301-redirect to platform.claude.com after the 2025-26 docs consolidation) runs ~481K tokens; for most sites, 50K–200K is plenty. Generate it server-side from your content directory at build time.
- Serve as text/markdown if your host allows: the spec doesn't mandate a Content-Type, but text/markdown; charset=utf-8 is the cleanest signal. Most static hosts default to text/plain, which still parses fine.
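The first two steps can be sketched as a small build-time script. This is an illustration under stated assumptions, not a reference implementation: `render_llms_txt` and `build_llms_full` are hypothetical helper names, the section data is passed in by hand, and the content directory is assumed to hold plain Markdown files that can be concatenated as-is.

```python
from pathlib import Path


def render_llms_txt(
    title: str,
    summary: str,
    sections: dict[str, list[tuple[str, str, str]]],
) -> str:
    """Render an H1, a blockquote summary, then H2 sections of Markdown links.

    Each link is a (label, url, note) tuple; an empty note omits the
    trailing ": note" part.
    """
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for label, url, note in links:
            suffix = f": {note}" if note else ""
            lines.append(f"- [{label}]({url}){suffix}")
        lines.append("")
    return "\n".join(lines).rstrip("\n") + "\n"


def build_llms_full(content_dir: Path) -> str:
    """Concatenate every *.md file under content_dir, in sorted path order."""
    parts = [
        page.read_text(encoding="utf-8").strip()
        for page in sorted(content_dir.rglob("*.md"))
    ]
    return "\n\n".join(parts) + "\n"
```

A build step could write the two return values to `llms.txt` and `llms-full.txt` in the site's output directory. Sorting by path keeps the output stable between builds, which makes diffs easier to review.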
What to skip: investing in elaborate llms.txt formatting until adoption is broader. The "ship something minimal" payoff is far higher than "ship something perfect."
What llms.txt does not do
Reader-facing failure modes worth being explicit about:
- It does not replace robots.txt. llms.txt has no access-control semantics. Use robots.txt (and the non-standard noai directive in X-Robots-Tag or robots meta tags, where supported) for crawl preferences; use llms.txt only as a curated reading list.
- It does not force or prevent AI bot crawling. Engines that ignore llms.txt continue to behave however they were going to behave. Shipping llms.txt does not add a new permission boundary.
- It does not guarantee ChatGPT / Perplexity / Gemini / Claude will cite the linked pages. There is no published vendor signal that links from llms.txt receive preferential treatment in citation selection.
- It does not get your content into LLM training data. Pretraining corpora (Common Crawl, books, code) are independent of llms.txt; the file does not control inclusion or exclusion.
- It should not contain private, paywalled, or unpublished content. The file is publicly fetchable. Treat it as a public reading list, not a privileged channel.
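Because llms.txt carries no access-control semantics of its own, one low-effort sanity check is to confirm that it never advertises a URL your robots.txt disallows. The sketch below uses Python's standard `urllib.robotparser`; the helper names `extract_links` and `blocked_links` are hypothetical, and the link regex assumes the simple `- [label](url)` list form used in the example file above.

```python
import re
from urllib.robotparser import RobotFileParser

# Matches list items of the form "- [label](url)" or "* [label](url)".
LINK_RE = re.compile(r"^[-*]\s+\[([^\]]+)\]\(([^)\s]+)\)")


def extract_links(llms_txt: str) -> list[str]:
    """Return the URLs of all Markdown list links in an llms.txt body."""
    urls = []
    for line in llms_txt.splitlines():
        match = LINK_RE.match(line.strip())
        if match:
            urls.append(match.group(2))
    return urls


def blocked_links(llms_txt: str, robots_txt: str, user_agent: str = "*") -> list[str]:
    """Return llms.txt URLs that robots.txt disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [
        url
        for url in extract_links(llms_txt)
        if not parser.can_fetch(user_agent, url)
    ]
```

A non-empty result does not mean any crawler will misbehave; it only flags that the two files send mixed signals, which is worth resolving before publishing.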
How it relates to other concepts
- Companion to robots.txt (which controls bot access) and sitemap.xml (which lists every URL for indexing). See the three-file table under "How llms.txt relates to robots.txt and sitemap.xml" for the maturity comparison; these three files are commonly grouped but only two are standardized.
- Spec-layer companion to LLM Optimization: this entry covers the file's spec, format, and adoption patterns; the LLMO entry covers the broader strategic context of where llms.txt fits in an LLM-side optimization program. The two pages are designed as a deliberate pair: spec view here, strategy view there.
- Site-architecture implementation hook for GEO, though one whose downstream effect is not vendor-confirmed.
- Often paired with llms-full.txt, the longer-form alternative that includes the actual content of key pages, not just links. The llms-full.txt format originated from a Mintlify–Anthropic collaboration (November 2024) and was subsequently incorporated into the official llms.txt proposal.
- Useful entry point for AI Search Optimization programs that want a single canonical machine-readable site overview.
Footnotes
1. The llms.txt proposal site, authored by Jeremy Howard (Answer.AI). llmstxt.org.
FAQ
- Is LLMS.txt an official standard?
- No. It is a community proposal, not a binding standard yet. No major AI engine has officially confirmed support. Adopting it has no downside; engines that don't support it ignore the file, and the cost to author is minimal.
- What goes in LLMS.txt?
- Site name, a one-paragraph description, then Markdown sections with links to canonical resources organized by topic. Keep it concise. The file is meant to be readable in a single screen by both humans and AI models.
- Does LLMS.txt replace sitemap.xml?
- No. Sitemap.xml is for search-engine indexing and lists every URL on your site. LLMS.txt is curated AI guidance: a hand-selected, topically-organized subset. Sites concerned with AI-search visibility should ship both.
Sources & further reading
- LLMS.txt proposal (Jeremy Howard, Answer.AI, September 2024). 2024-09-03
- Anthropic Claude developer docs in llms.txt format (canonical example)
- Anthropic Claude developer docs in llms-full.txt format (the comprehensive companion)
- Mintlify: Simplifying docs for AI with /llms.txt (announcing auto-generated llms.txt support, November 20, 2024). 2024-11-20