
LLMS.txt

LLMS.txt is a proposed text file at the root of a website, written in Markdown, that provides a curated, AI-readable summary of the site's most important resources for large language models. It is a community-proposed companion file, not a replacement for robots.txt or sitemap.xml, which serve different purposes (crawl control and URL discovery) and are standardized and widely supported in ways llms.txt is not.

Citation status

ChatGPT · Perplexity · Claude · Copilot · Gemini

Last checked 2026-06-22

What is LLMS.txt?

LLMS.txt is a community proposal by Jeremy Howard of Answer.AI, introduced in September 2024 [1]. The file lives at the site root (for example, https://example.com/llms.txt) and contains a site title, a one-paragraph description, and then a Markdown list of canonical resources organized by topic. The intent is to give LLMs a curated, low-friction entry point to a site, rather than expecting them to crawl an entire sitemap and infer relevance.

Minimum viable llms.txt

# GEO Glossary

> The 2026 reference dictionary for Generative Engine Optimization (GEO / AEO / AIO / LLMO).
> Every term ships with DefinedTerm JSON-LD and a per-engine citation-status tracker.

## Core terms

- [Generative Engine Optimization](https://aisearchglossary.com/terms/generative-engine-optimization): definition, history, status in 2026
- [DefinedTerm schema](https://aisearchglossary.com/terms/defined-term-schema): JSON-LD pattern for definition-style content
- [Attribution rate](https://aisearchglossary.com/terms/attribution-rate): per-engine citation KPI

## Reference

- [About the project](https://aisearchglossary.com/about)
- [Citation status methodology](https://aisearchglossary.com/about#methodology)

Serve it at /llms.txt with Content-Type: text/markdown; charset=utf-8.
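
The layout of the minimal file above (H1 title, blockquote summary, H2 sections of Markdown links) can be checked mechanically. The following is a minimal sketch, not an official validator: the function names `parse_llms_txt` and `lint`, the link regex, and the trimmed `SAMPLE` are assumptions made for illustration.

```python
import re

# Matches "- [Title](https://url): optional note" (an assumed, simplified pattern).
LINK_RE = re.compile(r"^- \[([^\]]+)\]\(([^)\s]+)\)(?::\s*(.*))?$")

SAMPLE = """\
# GEO Glossary

> The 2026 reference dictionary for Generative Engine Optimization.

## Core terms

- [Attribution rate](https://aisearchglossary.com/terms/attribution-rate): per-engine citation KPI

## Reference

- [About the project](https://aisearchglossary.com/about)
"""


def parse_llms_txt(text):
    """Split an llms.txt document into title, summary, and link sections."""
    title = None
    summary_lines = []
    sections = {}
    current = None  # name of the H2 section being read, if any
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# ") and title is None:
            title = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif line.startswith(">") and current is None:
            summary_lines.append(line[1:].strip())
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                sections[current].append(
                    {"title": match.group(1), "url": match.group(2), "note": match.group(3) or ""}
                )
    return {"title": title, "summary": " ".join(summary_lines), "sections": sections}


def lint(doc):
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    if not doc["title"]:
        problems.append("missing H1 title")
    if not doc["summary"]:
        problems.append("missing blockquote summary")
    for name, links in doc["sections"].items():
        if not links:
            problems.append(f"section {name!r} has no links")
    return problems
```

Calling `lint(parse_llms_txt(SAMPLE))` returns an empty list for the sample; a file without an H1 or blockquote would yield the corresponding messages.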

Status in 2026

Community-proposal stage, not on a standards track. The IETF has launched an "AI Preferences Working Group" for related standards work (AI preference signaling, training-data opt-out), but llms.txt itself is not part of that effort.

No major AI engine has officially announced crawler support, and Google has been explicit that it does not use llms.txt: John Mueller (Google Search Advocate) publicly stated in 2025 that "no AI system currently uses llms.txt," comparing it to the deprecated meta keywords tag, and Gary Illyes (Google) confirmed at Search Central Live in July 2025 that Google does not support llms.txt and is not planning to.

Anthropic publishes its own Claude developer docs in llms.txt format at platform.claude.com/llms.txt (legacy docs.claude.com and docs.anthropic.com both 301-redirect there). Practitioners report seeing AI engines occasionally fetch llms.txt during ChatGPT and Claude agentic browsing sessions, but these are community observations, and direct evidence of llms.txt-driven citation behavior remains anecdotal.
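
Site owners who want to check the fetch observations against their own traffic can count requests for the file in their access logs. The sketch below assumes logs in the common "combined" format; the function name `llms_txt_fetches`, the regex, and the sample lines (including the `ExampleBot` and `OtherAgent` user agents) are made up for illustration and are not real crawler data.

```python
import re
from collections import Counter

# Extracts the request path and user agent from a combined-format log line.
LOG_RE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

TRACKED_PATHS = ("/llms.txt", "/llms-full.txt")

# Synthetic sample lines, for illustration only.
SAMPLE_LOG = [
    '203.0.113.5 - - [01/Jan/2026:00:00:01 +0000] "GET /llms.txt HTTP/1.1" 200 512 "-" "ExampleBot/1.0"',
    '203.0.113.5 - - [01/Jan/2026:00:00:02 +0000] "GET /about HTTP/1.1" 200 2048 "-" "ExampleBot/1.0"',
    '198.51.100.7 - - [01/Jan/2026:00:00:03 +0000] "GET /llms.txt?x=1 HTTP/1.1" 200 512 "-" "ExampleBot/1.0"',
    '198.51.100.9 - - [01/Jan/2026:00:00:04 +0000] "GET /llms-full.txt HTTP/1.1" 200 9000 "-" "OtherAgent/2.0"',
]


def llms_txt_fetches(log_lines):
    """Count requests for llms.txt or llms-full.txt, grouped by user agent."""
    counts = Counter()
    for line in log_lines:
        match = LOG_RE.search(line)
        if not match:
            continue
        path = match.group("path").split("?", 1)[0]  # ignore query strings
        if path in TRACKED_PATHS:
            counts[match.group("agent")] += 1
    return counts
```

User-agent strings can be spoofed, and a fetch says nothing about whether the file influenced a citation, so counts like these are at best the same kind of anecdotal evidence described above.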

Adoption snapshot (2026): broadest in technical documentation and reference resources. Sites publishing llms.txt include Anthropic, Cloudflare, Stripe, Zapier, Cursor, Pinecone, and the large set of Mintlify-hosted developer docs sites. Mintlify added auto-generation of llms.txt on November 20, 2024, which expanded adoption across its hosted-docs customer base overnight. Most of the "thousands of dev tools' docs" reported by industry sources as having llms.txt are Mintlify-hosted and got the file via that platform rollout rather than as a deliberate per-site decision. The format is more common on developer-facing documentation than on consumer or marketing sites.

How llms.txt relates to robots.txt and sitemap.xml

The three files are commonly grouped but have very different purposes and maturity levels:

| File | Primary purpose | Maturity |
| --- | --- | --- |
| robots.txt | Controls which crawlers may fetch which URLs (REP, RFC 9309) | Standardized; widely supported |
| sitemap.xml | Helps search engines discover URLs (sitemaps.org protocol) | Standardized; widely supported |
| llms.txt | Provides a curated, human-edited site overview for LLM/agent consumption | Community proposal; no major engine has confirmed support |

The maturity column matters: shipping llms.txt is a low-cost bet with no confirmed upside, whereas shipping robots.txt and sitemap.xml is a well-understood baseline expectation.

How to apply

LLMS.txt is cheap insurance against an upside that is not yet vendor-confirmed. No engine has officially endorsed it as a fetch target, and Google has publicly said it does not use it. Practitioners report seeing ChatGPT and Claude agentic browsers fetch it during some sessions, but these are community observations, not vendor commitments. Three steps to ship it anyway:

  • Follow Jeremy Howard's spec at llmstxt.org: an H1 with the site/project name, a blockquote summary, optional detail prose, then optional H2-delimited sections listing canonical URLs as Markdown links. Most sites need 30–80 lines total. Keep it under one screen of plain text.
  • Optionally ship llms-full.txt for documentation-heavy sites: this is the larger sibling that includes full content of your key pages, not just links. llms-full.txt was originally developed by Mintlify in collaboration with Anthropic and later folded into the official llms.txt proposal; Alex Albert (Head of Claude Relations at Anthropic) announced Anthropic's adoption on X in November 2024 alongside Mintlify's platform-wide rollout. Anthropic's llms-full.txt at platform.claude.com/llms-full.txt (legacy docs.claude.com and docs.anthropic.com both 301-redirect to platform.claude.com after the 2025-26 docs consolidation) runs ~481K tokens; for most sites, 50K–200K is plenty. Generate it server-side from your content directory at build time.
  • Serve as text/markdown if your host allows: the spec doesn't mandate a Content-Type, but text/markdown; charset=utf-8 is the cleanest signal. Most static hosts default to text/plain, which still parses fine.
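
For hosts where the response headers are under your control, the third step can be sketched with Python's standard-library `http.server`. This is a minimal illustration, not a production server: the `SITE_ROOT` directory, the handler name `LlmsTxtHandler`, and the `serve` helper are assumptions, and a real deployment would normally set the header in the web server or CDN configuration instead.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from pathlib import Path

LLMS_PATHS = {"/llms.txt", "/llms-full.txt"}
SITE_ROOT = Path("public")  # hypothetical directory holding the generated files


def content_type_for(url_path):
    """Pick the Content-Type: Markdown for the llms files, plain text otherwise."""
    if url_path in LLMS_PATHS:
        return "text/markdown; charset=utf-8"
    return "text/plain; charset=utf-8"


class LlmsTxtHandler(BaseHTTPRequestHandler):
    """Serves only /llms.txt and /llms-full.txt from SITE_ROOT."""

    def do_GET(self):
        url_path = self.path.split("?", 1)[0]  # drop any query string
        file_path = SITE_ROOT / url_path.lstrip("/")
        if url_path not in LLMS_PATHS or not file_path.is_file():
            self.send_error(404)
            return
        body = file_path.read_bytes()
        self.send_response(200)
        self.send_header("Content-Type", content_type_for(url_path))
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Start a local server; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), LlmsTxtHandler).serve_forever()
```

Running `serve()` and requesting `http://127.0.0.1:8000/llms.txt` should return the file with the Markdown content type, provided `public/llms.txt` exists.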

What to skip: investing in elaborate llms.txt formatting until adoption is broader. The "ship something minimal" payoff is far higher than "ship something perfect."

What llms.txt does not do

Reader-facing failure modes worth being explicit about:

  • It does not replace robots.txt. llms.txt has no access-control semantics. Use robots.txt (and the non-standard X-Robots-Tag: noai directive, where a crawler honors it) for crawl preferences; use llms.txt only as a curated reading list.
  • It does not force or prevent AI bot crawling. Engines that ignore llms.txt continue to behave however they were going to behave. Shipping llms.txt does not add a new permission boundary.
  • It does not guarantee ChatGPT / Perplexity / Gemini / Claude will cite the linked pages. There is no published vendor signal that links from llms.txt receive preferential treatment in citation selection.
  • It does not get your content into LLM training data. Pretraining corpora (Common Crawl, books, code) are independent of llms.txt; the file does not control inclusion or exclusion.
  • It should not contain private, paywalled, or unpublished content. The file is publicly fetchable. Treat it as a public reading list, not a privileged channel.
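
The first two points can be illustrated with Python's standard-library `urllib.robotparser`: crawl permissions are read from robots.txt rules, and listing a URL in llms.txt changes none of them. The rules in `ROBOTS_TXT`, the `GPTBot` and `OtherBot` user-agent tokens, and the `/private/` path are illustrative assumptions, not recommendations from the proposal.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one AI crawler is kept out of /private/.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /private/

User-agent: *
Allow: /
"""

# Hypothetical llms.txt that (unwisely) links a disallowed URL.
LLMS_TXT_LINKS = [
    "https://example.com/about",
    "https://example.com/private/roadmap",
]


def allowed_urls(robots_txt, user_agent, urls):
    """Return the subset of urls that robots_txt permits user_agent to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if parser.can_fetch(user_agent, url)]
```

Under these rules, `allowed_urls(ROBOTS_TXT, "GPTBot", LLMS_TXT_LINKS)` keeps only the `/about` URL: the robots.txt rule still applies to the link that llms.txt lists, and removing the link from llms.txt would not block anything either.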

How it relates to other concepts

  • Companion to robots.txt (which controls bot access) and sitemap.xml (which lists URLs for discovery and indexing). See the three-file comparison table in "How llms.txt relates to robots.txt and sitemap.xml" for the maturity comparison; these three files are commonly grouped but only two are standardized.
  • Spec-layer companion to LLM Optimization: this entry covers the file's spec, format, and adoption patterns; the LLMO entry covers the broader strategic context of where llms.txt fits in an LLM-side optimization program. The two pages are designed as a deliberate pair: spec view here, strategy view there.
  • Site-architecture implementation hook for GEO, though one whose downstream effect is not vendor-confirmed.
  • Often paired with llms-full.txt, the longer-form alternative that includes the actual content of key pages, not just links. The llms-full.txt format originated from a Mintlify–Anthropic collaboration (November 2024) and was subsequently incorporated into the official llms.txt proposal.
  • Useful entry point for AI Search Optimization programs that want a single canonical machine-readable site overview.
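
Because llms-full.txt carries page content, not just links, it is usually generated, not hand-written. The following is a rough sketch of a build-time generator, assuming the site's key pages already exist as Markdown files in one directory; the helper names (`read_pages`, `build_llms_full`, `rough_token_estimate`) and the four-characters-per-token heuristic are assumptions for illustration, not part of the proposal.

```python
from pathlib import Path


def read_pages(content_dir):
    """Collect (relative path, Markdown text) pairs in a stable, sorted order."""
    root = Path(content_dir)
    return [
        (path.relative_to(root).as_posix(), path.read_text(encoding="utf-8"))
        for path in sorted(root.rglob("*.md"))
    ]


def build_llms_full(site_title, summary, pages):
    """Concatenate a title, a blockquote summary, and the full text of each page."""
    parts = [f"# {site_title}", "", f"> {summary}", ""]
    for _name, body in pages:
        parts += [body.strip(), ""]
    return "\n".join(parts)


def rough_token_estimate(text):
    """Very rough size check: assumes about four characters per token."""
    return len(text) // 4
```

A build script might call `build_llms_full(title, summary, read_pages("content"))`, write the result to the site root as llms-full.txt, and warn when `rough_token_estimate` exceeds whatever budget the site has chosen.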

Footnotes

  1. The llms.txt proposal site, authored by Jeremy Howard (Answer.AI). llmstxt.org.


FAQ

Is LLMS.txt an official standard?
No. It is a community proposal, not a ratified standard, and no major AI engine has officially confirmed support. Adopting it carries little downside: engines that don't support it ignore the file, and the cost to author is minimal.
What goes in LLMS.txt?
Site name, a one-paragraph description, then Markdown sections with links to canonical resources organized by topic. Keep it concise. The file is meant to be readable in a single screen by both humans and AI models.
Does LLMS.txt replace sitemap.xml?
No. Sitemap.xml is for search-engine indexing and lists every URL on your site. LLMS.txt is curated AI guidance: a hand-selected, topically-organized subset. Sites concerned with AI-search visibility should ship both.
