llms.txt and llms-full.txt: A Practical Guide

Search engines have had robots.txt and sitemap.xml for decades. One governs crawler access; the other maps URLs. With the rise of AI assistants, coding agents, and answer engines, a third file has entered the picture: llms.txt.
It is not a ranking trick. A study of roughly 300,000 domains found no measurable improvement in AI citations from publishing llms.txt. Google's Search team has stated it does not use llms.txt for ranking. That said, the same file that does almost nothing for ChatGPT search citations is doing real work in a different layer: the agentic web — where AI agents act on behalf of users, fetch context, choose tools, and complete tasks.
Think of it this way: robots.txt tells crawlers where not to go. llms.txt tells them what to understand.
What llms.txt is
llms.txt is a single text file written in Markdown format, placed at the root of your domain (https://example.com/llms.txt). It contains a structured summary of your site's most important content, written so large language models can parse it easily.
Large language models face a critical limitation: context windows are too small to handle most websites in their entirety. Converting complex HTML pages with navigation, ads, and JavaScript into LLM-friendly plain text is both difficult and imprecise. A well-made llms.txt solves this by giving the model a clean map instead of asking it to strip boilerplate from hundreds of pages.
A good llms.txt includes:
- site or company name;
- one or two sentences explaining what the site is;
- grouped links to key pages;
- a short, factual description for each link;
- optional notes on freshness, language versions, API, or contact.
Example:
# Example Store
> Example Store sells certified outdoor equipment, camping gear, and hiking accessories in the United States.
## Main pages
- [About Example Store](https://example.com/about): Company background, mission, and customer service information.
- [Camping tents](https://example.com/camping/tents): Main category page for tents, shelters, and accessories.
- [Buying guides](https://example.com/guides): Editorial guides for choosing outdoor equipment.
## Support
- [Shipping and returns](https://example.com/shipping-returns): Delivery options, return policy, and warranty details.
- [Contact](https://example.com/contact): Customer support contacts.
The value is not the file itself — it is the curation. 20–50 high-value links is the right target. Dumping the whole sitemap is the most common implementation failure.
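Because the value is in the curation, it can help to keep the curated links as data and render the file from them. The following is a minimal sketch, not a reference implementation: the `Link` class, the `render_llms_txt` function, and the blank-line layout are illustrative assumptions, and the links are taken from the example above.

```python
from dataclasses import dataclass


@dataclass
class Link:
    title: str
    url: str
    description: str


def render_llms_txt(name: str, summary: str, sections: dict[str, list[Link]]) -> str:
    """Render an H1, a blockquote summary, then one H2 section per link group."""
    lines = [f"# {name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for link in links:
            lines.append(f"- [{link.title}]({link.url}): {link.description}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


sections = {
    "Support": [
        Link("Shipping and returns", "https://example.com/shipping-returns",
             "Delivery options, return policy, and warranty details."),
        Link("Contact", "https://example.com/contact", "Customer support contacts."),
    ],
}
print(render_llms_txt(
    "Example Store",
    "Example Store sells certified outdoor equipment in the United States.",
    sections,
))
```

Keeping the list in one reviewed place makes it harder to slide from 20–50 chosen links into a sitemap dump.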
What llms-full.txt is
llms-full.txt takes things further. In most setups, it contains a fuller export of your documentation in one file — giving an AI crawler a single, high-signal ingestion point instead of forcing it to stitch together many separate pages.
This is especially helpful for API-heavy products or teams building docs optimised for AI assistants. It reduces fetch overhead and can improve retrieval quality when an AI system needs broader context.
A minimal structure:
# Example SaaS — full AI context
> This file contains the main public documentation for Example SaaS.
Last updated: 2026-06-24
Canonical site: https://example.com
---
# Product overview
Example SaaS helps finance teams automate invoice approval, vendor onboarding, and payment workflows.
Source: https://example.com/product
---
# Getting started
To start using Example SaaS, create an account, invite your finance team, connect your accounting system, and configure approval rules.
Source: https://example.com/docs/getting-started
Practical rule: if your most important content is documentation (SaaS, dev tool, or API product), ship llms-full.txt as well. If your content is mostly marketing pages, llms.txt alone is enough.
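One way to produce a file with the structure shown above is to concatenate already-clean Markdown pages, each followed by its source URL. This is a sketch under assumptions: the `render_llms_full` function and its arguments are hypothetical, and it presumes you can export each page body as Markdown by some other means.

```python
from datetime import date


def render_llms_full(title, intro, site, pages, updated):
    """Join (heading, markdown_body, source_url) tuples into one document.

    Sections are separated by a Markdown thematic break ("---").
    """
    header = (
        f"# {title}\n\n> {intro}\n\n"
        f"Last updated: {updated.isoformat()}\nCanonical site: {site}"
    )
    sections = [
        f"# {heading}\n\n{body.strip()}\n\nSource: {source}"
        for heading, body, source in pages
    ]
    return "\n\n---\n\n".join([header, *sections]) + "\n"


pages = [
    ("Product overview",
     "Example SaaS helps finance teams automate invoice approval.",
     "https://example.com/product"),
    ("Getting started",
     "Create an account, invite your finance team, and configure approval rules.",
     "https://example.com/docs/getting-started"),
]
print(render_llms_full("Example SaaS: full AI context",
                       "This file contains the main public documentation for Example SaaS.",
                       "https://example.com", pages, date(2026, 6, 24)))
```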
Who is already using these files
When Mintlify rolled out llms.txt support across all docs sites it hosts in late 2024, thousands of sites — including Anthropic and Cursor — gained the file overnight. Fern, GitBook, Vercel Docs, Supabase, Yoast, and Rank Math now ship it as default.
Stripe, Vercel, Cloudflare, Anthropic, Coinbase, Pinecone, Cursor, and most modern API products ship llms.txt because their users are building with AI coding assistants right now. A well-curated file is the difference between Cursor generating working integration code and Cursor hallucinating an endpoint that doesn't exist.
IDE agents fetch llms.txt routinely. Cursor, Windsurf, Claude Code, GitHub Copilot, Cline, Aider — they all look for /llms.txt and /llms-full.txt when pointed at a documentation site.
Technical requirements
Place both files at the root of the domain:
https://example.com/llms.txt
https://example.com/llms-full.txt
- Serve publicly — no login, cookies, JavaScript, or geo-blocking.
- Return 200 OK.
- Use UTF-8 encoding.
- Content-Type: text/plain or text/markdown.
- No redirects if avoidable.
- Do not block the files in robots.txt.
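The serving requirements above can be checked with a short script. This is a rough sketch using only the Python standard library; `check_response` and `check_url` are hypothetical helper names. Note that `urllib.request.urlopen` follows redirects by itself and raises `HTTPError` for 4xx and 5xx responses, so the status check mainly guards against unusual 2xx codes.

```python
import urllib.request


def check_response(requested_url, final_url, status, content_type):
    """Return a list of problems; an empty list means the response looks fine."""
    problems = []
    if status != 200:
        problems.append(f"expected 200 OK, got {status}")
    if final_url != requested_url:
        problems.append(f"redirected to {final_url}")
    # Ignore parameters such as "; charset=utf-8" when comparing media types.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type not in ("text/plain", "text/markdown"):
        problems.append(f"unexpected Content-Type: {media_type or 'missing'}")
    return problems


def check_url(url):
    """Fetch the URL with no cookies or credentials and apply the checks."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        resp.read().decode("utf-8")  # raises UnicodeDecodeError if not valid UTF-8
        return check_response(url, resp.geturl(), resp.status,
                              resp.headers.get("Content-Type", ""))
```

Calling `check_url("https://example.com/llms.txt")` from a machine outside your network gives a reasonable approximation of what an agent sees, although it will not reveal geo-blocking or bot-protection rules that target specific user agents.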
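Audit robots.txt alongside the files. Confirm that it does not disallow the AI crawlers you want fetching them: GPTBot, ClaudeBot, PerplexityBot, OAI-SearchBot. Google-Extended and Applebot-Extended are robots.txt control tokens rather than separate crawlers, so check the rules for those names as well.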
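The robots.txt audit can be approximated with Python's standard `urllib.robotparser`. Treat this as a sketch: the `blocked_agents` helper is hypothetical, and the standard-library parser implements basic prefix matching only, so individual crawlers may interpret wildcard rules differently.

```python
from urllib.robotparser import RobotFileParser

AI_AGENTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "OAI-SearchBot",
             "Google-Extended", "Applebot-Extended"]


def blocked_agents(robots_txt, url, agents=AI_AGENTS):
    """Return the agent names that the given robots.txt text disallows for url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


sample = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /cart
"""
print(blocked_agents(sample, "https://example.com/llms.txt"))
```

With the sample rules, only GPTBot is reported, because every other name falls through to the wildcard group, which blocks /cart but not /llms.txt.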
For multilingual sites, the convention is one file per language root: /en/llms.txt, /es/llms.txt. Some sites also mirror at /.well-known/llms.txt — supported, but the root is canonical.
File size
Keep llms.txt lean. As a rule of thumb, aim for under 10 KB for llms.txt and under 100 KB for llms-full.txt.
Keep llms.txt under 500 words and 50 links. A focused file that fits in a single context window is more useful than a comprehensive one that overflows it.
Quality of descriptions matters more than quantity of links. One well-described buying guide link is more useful than twenty product page URLs with no descriptions.
For llms-full.txt, the same logic applies at scale. A small SaaS can include all its public docs. A large ecommerce site should not include every product page. A news site should not include every article.
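The size guidance above is easy to enforce automatically. The sketch below is one possible check, with assumptions stated plainly: `size_report` is a hypothetical helper, 10 KB is taken as 10,000 bytes, and words are counted by whitespace splitting, so each Markdown link counts as roughly one word.

```python
import re

# Matches Markdown links with an absolute http(s) URL.
LINK_RE = re.compile(r"\[[^\]]+\]\((https?://[^)\s]+)\)")


def size_report(text, max_bytes=10_000, max_words=500, max_links=50):
    """Return (stats, over) where over lists the budgets that were exceeded."""
    stats = {
        "bytes": len(text.encode("utf-8")),
        "words": len(text.split()),
        "links": len(LINK_RE.findall(text)),
    }
    limits = {"bytes": max_bytes, "words": max_words, "links": max_links}
    over = [key for key in stats if stats[key] > limits[key]]
    return stats, over
```

For llms-full.txt the same function can be called with `max_bytes=100_000` and looser word and link limits.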
What to include and what to skip
Include:
- product and service pages;
- documentation and API references;
- pricing explanations;
- shipping, returns, warranty pages;
- buying guides and editorial explainers;
- category pages;
- author and editorial policy pages;
- canonical language versions;
- stable factual pages about the company.
Skip:
- faceted search pages, filter URLs, UTM links;
- internal search results, tag archives, paginated pages with no value;
- cart, checkout, account, session-specific URLs;
- staging environments, internal tools;
- marketing superlatives ("the world's best solution").
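Part of the skip list can be applied mechanically before a human reviews the candidates. The filter below is only a sketch: the path segments, query parameter names, and the `staging.` host prefix are assumptions that need adjusting to your own URL scheme, and `should_skip` is a hypothetical name.

```python
from urllib.parse import parse_qsl, urlsplit

# Assumed URL conventions; replace with the ones your site actually uses.
SKIP_SEGMENTS = {"cart", "checkout", "account", "search", "tag"}
SKIP_PARAMS = {"page", "sort", "filter", "sessionid"}


def should_skip(url):
    """Return True for URLs that look transactional, filtered, tracked, or internal."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("staging."):
        return True
    segments = {seg.lower() for seg in parts.path.split("/") if seg}
    if segments & SKIP_SEGMENTS:
        return True
    params = {key.lower() for key, _ in parse_qsl(parts.query, keep_blank_values=True)}
    return any(p.startswith("utm_") or p in SKIP_PARAMS for p in params)
```

A filter like this cannot catch marketing superlatives or judge whether a page is valuable, so it complements editorial review rather than replacing it.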
Write descriptions for context, not SEO. "This explains our pricing tiers and what each includes" beats "Affordable enterprise SaaS pricing solutions." Agents read this to decide what to fetch next, not to rank you.
Good link description:
- [Returns policy](https://example.com/returns): Return windows, refund rules, exchange process, and exceptions.
Bad link description:
- [Returns](https://example.com/returns): Learn more.
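Weak descriptions like the one above can be flagged with a small linter. This is a heuristic sketch: the filler phrases and the four-word minimum are arbitrary assumptions, and `weak_descriptions` is a hypothetical helper.

```python
import re

ENTRY_RE = re.compile(r"^- \[(?P<title>[^\]]+)\]\((?P<url>[^)]+)\)(?::\s*(?P<desc>.*))?$")
FILLER = {"learn more", "read more", "click here", "more info"}


def weak_descriptions(text, min_words=4):
    """Return titles of link entries whose description is missing, filler, or too short."""
    weak = []
    for line in text.splitlines():
        match = ENTRY_RE.match(line.strip())
        if not match:
            continue
        desc = (match.group("desc") or "").strip()
        if desc.rstrip(".!").lower() in FILLER or len(desc.split()) < min_words:
            weak.append(match.group("title"))
    return weak


sample = """\
- [Returns policy](https://example.com/returns): Return windows, refund rules, exchange process, and exceptions.
- [Returns](https://example.com/returns): Learn more.
"""
print(weak_descriptions(sample))
```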
Freshness — the hard part
An llms.txt with outdated product names, old prices, or discontinued services is worse than no file at all — if an AI system evaluates it, incorrect information will be passed on. Update llms.txt with every major change, and specify the date.
For sites with fast-changing data, separate stable pages from dynamic ones, and be direct about it:
## Freshness policy
Product prices, availability, delivery estimates, event dates, and promotional offers change frequently. AI tools should verify these details on the live page before presenting them as current.
Update cadence:
| Site type | When to update |
|---|---|
| SaaS / docs | After every product or API release |
| Ecommerce | Weekly or after major category/policy changes |
| News | llms.txt: regularly; llms-full.txt: keep it editorial and stable |
| Marketplace | After structural changes only |
| Corporate site | Monthly or after major announcements |
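A scheduled job can warn when the file has gone past its update window. The sketch below assumes the file carries a `Last updated: YYYY-MM-DD` line as in the examples in this guide; the helper names are hypothetical, and the day limits you pass in (for instance 7 for a weekly ecommerce cadence or 30 for a monthly corporate one) are a simplification of the table above.

```python
import re
from datetime import date

UPDATED_RE = re.compile(r"^Last updated:\s*(\d{4}-\d{2}-\d{2})\s*$", re.MULTILINE)


def days_since_update(text, today):
    """Return the age in days of the 'Last updated' date, or None if it is absent."""
    match = UPDATED_RE.search(text)
    if match is None:
        return None
    return (today - date.fromisoformat(match.group(1))).days


def is_stale(text, max_age_days, today=None):
    """A file with no date is treated as stale."""
    age = days_since_update(text, today or date.today())
    return age is None or age > max_age_days
```

Release-driven cadences (SaaS, marketplaces) are better handled by regenerating the file in the release pipeline than by an age threshold.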
SaaS and documentation sites
This is where llms.txt and llms-full.txt work best.
Include: product overview, getting started, installation, configuration, API authentication, endpoints, SDKs, changelog, pricing, limits, security, status page, support.
In llms-full.txt, include full documentation pages in Markdown. If you publish a stable REST or GraphQL reference, llms.txt can point crawlers to canonical endpoints, versioned paths, and Markdown exports. That helps LLMs answer API questions with precise parameter definitions, current examples, and the right version of the truth — and reduces the chance that a model leans on forum threads or old blog posts instead.
Add version metadata:
Product version: 4.2
API version: 2026-05
Last updated: 2026-06-24
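If the version metadata is written as plain `Key: value` lines like those above, a release pipeline can compare it with the version being shipped. This is a sketch with hypothetical helper names; it assumes the keys appear exactly as shown.

```python
def parse_metadata(text, keys=("Product version", "API version", "Last updated")):
    """Collect 'Key: value' lines for the known metadata keys."""
    meta = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in keys:
            meta[key.strip()] = value.strip()
    return meta


def version_mismatches(text, expected):
    """Return {key: (found, expected)} for every key that differs from the release."""
    found = parse_metadata(text, keys=tuple(expected))
    return {k: (found.get(k), v) for k, v in expected.items() if found.get(k) != v}


header = "Product version: 4.2\nAPI version: 2026-05\nLast updated: 2026-06-24\n"
print(version_mismatches(header, {"Product version": "4.2", "API version": "2026-06"}))
```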
Ecommerce sites
As agents start shopping on behalf of users — "buy me running shoes under $150 that ship by Friday" — they need a clean, machine-readable surface for the catalog, pricing rules, shipping policies, and availability. The brands that point agents to canonical product pages (instead of letting them parse cluttered category HTML) will be the brands agents can actually transact with.
Include in llms.txt: homepage, main categories, buying guides, flagship products, shipping, returns, warranty, size guides, support, brand information.
In llms-full.txt: company description, category explanations, buying guides, shipping and returns summary, warranty rules, size guidance.
Avoid listing every individual product URL, out-of-stock pages, filtered categories, temporary sale pages that nobody maintains, and dynamic prices without a freshness note:
## Product data note
Prices, stock status, promotions, delivery estimates, and product variants change frequently. The live product page is the source of truth for current commercial information.
News sites
Use llms.txt as a map of editorial structure and authority — not a list of articles.
Include: homepage, latest news page, main topic sections, topic hubs, author pages, editorial standards, corrections policy, RSS feeds, contact.
In llms-full.txt: publication description, editorial standards, corrections policy, section summaries, selected evergreen explainers, links to live feeds. Do not embed the daily news stream — it will be stale within hours.
Example:
# Example News
> Example News is an independent digital publication covering technology, business, science, and public policy.
Last updated: 2026-06-24
## Current news
- [Latest news](https://example.com/latest): Continuously updated feed of recent stories.
- [Technology](https://example.com/technology): News and analysis about platforms, startups, AI, cybersecurity, and devices.
## Trust and editorial information
- [Editorial standards](https://example.com/editorial-standards): Reporting principles, sourcing rules, and corrections process.
- [Authors](https://example.com/authors): Reporter and contributor profiles.
Marketplaces, real estate, jobs, travel, events
Do not put live inventory into llms-full.txt.
Instead: explain how the platform works, include main search pages, category and location landing pages, listing quality rules, pricing model, booking or application process, trust and safety policies, API or feed docs.
## Live inventory note
Listings, prices, availability, seller details, dates, and booking terms change frequently. AI tools should use the linked live pages for current information.
Multilingual sites
One root file with language sections works for smaller sites:
# Example
> Example provides business software in English, Spanish, and Russian.
## English
- [English homepage](https://example.com/en/): Main English version.
## Español
- [Página principal](https://example.com/es/): Versión principal en español.
## Русский
- [Главная страница](https://example.com/ru/): Основная русская версия.
For larger sites, use separate files per language root and link them from the root llms.txt.
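For the per-language layout, the root file needs a section that links to each language file. A minimal sketch follows; the "Language versions" heading, the description wording, and the `language_index` helper are illustrative assumptions, while the /en/llms.txt path pattern follows the convention described earlier.

```python
def language_index(base_url, languages):
    """Build a root llms.txt section linking to one llms.txt per language root."""
    base = base_url.rstrip("/")
    lines = ["## Language versions"]
    for code, label in languages.items():
        lines.append(
            f"- [{label}]({base}/{code}/llms.txt): llms.txt for the {label} version of the site."
        )
    return "\n".join(lines) + "\n"


print(language_index("https://example.com", {"en": "English", "es": "Spanish"}))
```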
Quality checklist
Before publishing:
- File accessible at /llms.txt without authentication
- Returns 200 OK, UTF-8 encoded
- First line is H1 with site or company name
- Summary blockquote explains the site in 1–2 sentences
- All links are absolute URLs
- Every important link has a factual description
- Dynamic data has a freshness note
- Private, gated, and irrelevant pages are excluded
- File does not replicate your sitemap
- llms-full.txt, if used, contains clean Markdown with source URLs per section
- Last updated date is visible
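The structural items on this checklist can be verified before publishing. The validator below is a rough sketch rather than a complete linter: `validate_llms_txt` is a hypothetical name, it covers only the H1, blockquote, absolute-URL, description, and date checks, and it expects the summary blockquote within the first two non-empty lines after the H1.

```python
import re

LINK_RE = re.compile(r"\[([^\]]+)\]\(([^)]+)\)(.*)")


def validate_llms_txt(text):
    """Return a list of structural problems; an empty list means none were found."""
    problems = []
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines or not re.match(r"^# \S", lines[0]):
        problems.append("first line is not an H1")
    if not any(ln.startswith("> ") for ln in lines[1:3]):
        problems.append("no summary blockquote near the top")
    if not any(ln.lower().startswith("last updated:") for ln in lines):
        problems.append("no 'Last updated' date")
    for ln in lines:
        match = LINK_RE.search(ln)
        if not match:
            continue
        title, url, rest = match.groups()
        if not url.startswith(("https://", "http://")):
            problems.append(f"relative URL: {url}")
        if not rest.lstrip(": ").strip():
            problems.append(f"no description: {title}")
    return problems
```

Accessibility, freshness notes, and the "does not replicate your sitemap" item still need a human or a separate check.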
Common mistakes
- Treating it as an SEO ranking factor. No credible evidence supports this for search rankings.
- Copying the sitemap. A sitemap is for URL discovery. llms.txt is for meaning and prioritisation.
- Creating a huge llms-full.txt and never updating it. Stale context is worse than no context.
- Using marketing copy instead of facts. Write it like a README for a thoughtful engineer, not an ad.
- Ignoring dynamic data. If prices, availability, or policies change, say so.
- Blocking the file. CDN rules, bot protection, login walls, or aggressive redirects all break access.
How to implement — step by step
- Write down the five to ten questions a user might ask an AI about your site.
- Identify the pages that best answer each question.
- Write an llms.txt with 20–50 curated links and useful descriptions.
- Publish at the root of the domain. Verify in a browser.
- Check robots.txt to confirm AI crawlers are not blocked.
- Ask an AI assistant questions about your site and see if the answers improve.
- Only then create llms-full.txt, if your content warrants it.
- Add an update task to your deployment pipeline or content calendar.
llms.txt and llms-full.txt are not replacements for good content, structured data, fast pages, or solid internal linking. They are one additional layer — a clean, curated signal for the growing share of AI agents that read your site before a human ever does.