## What is llms.txt?

llms.txt is a plain-text Markdown file served at the root of your domain (`/llms.txt`) that describes your service to large language models in human-readable prose. Where [OpenAPI](/kb/openapi) gives machines a structured spec and [robots.txt](/kb/robots-txt) tells crawlers what to fetch, llms.txt gives an LLM the *narrative context* it needs to understand what your service does, when an agent should use it, and which endpoints matter — in the format LLMs already process best.

The spec was proposed by Jeremy Howard (fast.ai, Answer.AI) in September 2024. It is intentionally lightweight: no schema validation, no JSON parsing, no versioning. A Markdown file with an H1 heading and a one-line summary. The simplicity is the feature — anyone can author one in five minutes, and any LLM can read it without special tooling.

## Why llms.txt matters

LLMs that browse the web have a triage problem. A single page on a modern site can be 200KB of HTML with most of the meaningful text behind JavaScript that the agent's fetcher can't render. Even when the agent can render JS, the signal-to-noise ratio is poor — navigation, ads, cookie banners, and tracking scripts dominate the bytes.

llms.txt is the inverse: a single document, plain text, fits in a few KB, leads with what matters. An agent retrieving `https://example.com/llms.txt` gets:

- What the service is, in one sentence
- The endpoints or pages that matter, listed
- Auth/payment rules in prose
- Example calls the agent can copy

This is the difference between "agent has to figure out your site" and "agent can decide whether your site is relevant in one fetch." For sites that want to be cited by ChatGPT, Claude, Perplexity, or any retrieval-augmented system, llms.txt is the cheapest possible affordance.

## How it works

There is no protocol negotiation. Agents request `/llms.txt` the same way they request `/robots.txt`, expect `text/markdown` or `text/plain` back, and parse the Markdown. The file MUST start with an H1 heading; everything else is convention.

A minimal valid llms.txt:

```markdown
# Your Service Name

> One-line description of what the service does.

## Overview

What your service is for, in 2-3 sentences. Lead with the use case an
agent would care about, not your company history.

## Endpoints

- `GET /api/search?q=` — Search the catalog (free)
- `POST /api/order` — Place an order (paid: x402)
- `GET /api/status` — Health check (free)

## Authentication

Free endpoints need no auth. Paid endpoints return HTTP 402 with an
x402 payment challenge. Settle in USDC on Base.

## Examples

Search for shoes:
  GET /api/search?q=shoes

Place an order:
  POST /api/order {"sku": "ABC123", "qty": 1}
```

That's the whole spec. Headings beyond H1 are optional. Bulleted lists, fenced code blocks, and links work because they're Markdown — agents parse them with whatever Markdown library they already use.
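Because the format is plain Markdown, agent-side handling can be a few lines. Here is a minimal parsing sketch in Python — the function name and the returned fields are illustrative, not part of the spec; only the "first line is an H1" rule comes from llmstxt.org:

```python
def parse_llms_txt(text: str) -> dict:
    """Minimal llms.txt parse: H1 title, optional blockquote summary, H2 sections."""
    lines = text.splitlines()
    if not lines or not lines[0].startswith("# "):
        raise ValueError("llms.txt must start with an H1 heading")
    title = lines[0][2:].strip()
    # First blockquote line, if any, is the one-line summary.
    summary = next((l[2:].strip() for l in lines[1:] if l.startswith("> ")), None)
    sections = [l[3:].strip() for l in lines if l.startswith("## ")]
    return {"title": title, "summary": summary, "sections": sections}

doc = """# Your Service Name

> One-line description of what the service does.

## Endpoints

- `GET /api/search?q=` — Search the catalog (free)
"""

info = parse_llms_txt(doc)
print(info["title"])     # Your Service Name
print(info["sections"])  # ['Endpoints']
```

In practice an agent would hand the raw text to the model rather than parse it structurally — this sketch only shows how little machinery the format demands.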

## llms.txt vs other agent-readability files

| File | Purpose | Format | Who reads it |
|---|---|---|---|
| `/llms.txt` | Narrative context for LLMs | Markdown | LLMs, browsing agents |
| `/llms-full.txt` | Full text corpus, one file | Markdown / plaintext | RAG pipelines, embedded agents |
| `/robots.txt` | Crawl permissions | Plaintext directives | Crawlers, bots |
| `/sitemap.xml` | URL index for crawlers | XML | Search engines |
| `/openapi.json` | Programmatic API spec | JSON/YAML | Code generators, API clients |
| `/.well-known/mcp` | Tool-server protocol | JSON-RPC | MCP-capable agents |

llms.txt does not replace any of these — it fills the gap between robots.txt (machine directives) and OpenAPI (machine specs) with a layer that is optimized for the *language* models that need to reason about what your site is for.

## The companion: llms-full.txt

The llmstxt.org spec defines an optional sibling file: [llms-full.txt](/kb/llms-full-txt). Convention is the same — Markdown, served at the root — but the content is different. llms.txt is the *directory* (a brief overview with pointers). llms-full.txt is the *corpus* (the full concatenated text content of your site in a single file).

Why both? Two different consumers:

- **Browsing agents** with limited context windows want the directory — pick what to fetch next based on llms.txt.
- **Non-browsing agents** (RAG pipelines, one-shot retrieval, embedded agents in mobile apps) want the corpus — ingest llms-full.txt once and answer questions without making more HTTP requests.

Ship both if you can. llms.txt is the marketing front; llms-full.txt is the knowledge base.

## Who's publishing llms.txt

Adoption has been broad across AI infrastructure, developer tools, and content platforms:

- **Anthropic** publishes llms.txt at [docs.anthropic.com/llms.txt](https://docs.anthropic.com/llms.txt) listing every page in the developer docs.
- **Cloudflare** publishes one at [developers.cloudflare.com/llms.txt](https://developers.cloudflare.com/llms.txt) for the Workers and platform docs.
- **Stripe** publishes one for the API reference.
- **Vercel, Mintlify, Netlify, Supabase, and Fly.io** all publish llms.txt for their developer documentation.
- **Many open-source projects** ship llms.txt automatically via documentation generators (Mintlify and Docusaurus both have plugins).

The pattern that's emerged: docs platforms publish llms.txt for their *documentation*, and product sites publish llms.txt for their *API surface*. Both are useful — agents discovering a product for the first time read the marketing llms.txt, then dive into the docs llms.txt when they need integration details.

## How to add llms.txt to your service

### 1. Decide what an agent needs to know

Skip everything an agent doesn't need: your company story, design philosophy, marketing copy. Lead with what the service *does* and what endpoints exist. If you can't write the summary in one sentence, your service is too broad — split it.

### 2. Author the file

Create `/llms.txt` as plain Markdown. Required: an H1 heading on the first line and a blockquote summary. Recommended: an "Endpoints" section, an "Authentication" section, and 2-3 example requests. Keep the whole file under 10KB.

### 3. Serve it at the root

The file MUST be at `/llms.txt` — not `/.well-known/llms.txt`, not `/docs/llms.txt`. Agents probe the root directly. Most static-site hosts let you drop a file in `public/` or `static/` and serve it as-is.

Set `Content-Type: text/markdown` or `text/plain`. Some agents reject `application/octet-stream`.
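A quick way to sanity-check what your host actually sends is to compare the header against the two acceptable types. A hypothetical helper (the function name is illustrative; the acceptable types are the two named above):

```python
ACCEPTABLE = {"text/markdown", "text/plain"}

def content_type_ok(header: str) -> bool:
    """True if the Content-Type header names a type agents accept.

    Strips parameters like '; charset=utf-8' before comparing.
    """
    return header.split(";", 1)[0].strip().lower() in ACCEPTABLE

print(content_type_ok("text/markdown; charset=utf-8"))  # True
print(content_type_ok("application/octet-stream"))      # False
```

Run it against the `Content-Type` line from `curl -I https://your-domain.com/llms.txt` after every deploy.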

### 4. Link it from your homepage

Add a `rel="alternate"` link in your HTML head so agents and crawlers discover it:

```html
<link rel="alternate" type="text/markdown" href="/llms.txt" title="LLM-friendly description">
```

### 5. Ship llms-full.txt too

Concatenate your documentation pages into a single Markdown file at `/llms-full.txt`. Most docs generators can emit this. If you're hand-rolling, a build step that walks your Markdown sources and joins them with H1 separators is enough.
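If you're hand-rolling, the build step can be as small as this sketch — the file ordering and the path-based H1 separators are assumptions, not spec requirements:

```python
from pathlib import Path

def build_llms_full(docs_dir: str, out_path: str) -> None:
    """Concatenate every Markdown source under docs_dir into one corpus file.

    Hypothetical build step: sorts files by path for a stable order and
    uses each file's path (minus extension) as an H1 separator.
    """
    parts = []
    for src in sorted(Path(docs_dir).rglob("*.md")):
        parts.append(f"# {src.with_suffix('').as_posix()}\n\n{src.read_text()}")
    Path(out_path).write_text("\n\n".join(parts) + "\n")
```

Wire it into the same pipeline that builds your docs so `/llms-full.txt` regenerates on every publish.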

## Common errors and debugging

- **File served as HTML.** Static hosts sometimes wrap text files in their template. Check that `curl https://your-domain.com/llms.txt` returns the raw Markdown, not an HTML page that contains it.
- **Missing H1 on first line.** The spec requires the document to start with `#`. Files that lead with a blockquote, frontmatter, or HTML comment fail parsers that expect the H1.
- **Content negotiation overrides.** Some servers return HTML to browsers and Markdown to agents — but if the `Accept` header is wrong, agents get HTML. Always serve the file directly; don't gate it behind content negotiation.
- **Stale llms.txt.** Authors ship the file once and never update it. Agents that re-fetch on a schedule see stale endpoints. Treat llms.txt as a build artifact — regenerate when the API changes.
- **Too long.** Files over 50KB get truncated by some agents. If your llms.txt is that long, you actually want llms-full.txt — split into the directory and the corpus.
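The bullets above are mechanical enough to script. A hypothetical linter over a fetched body and its Content-Type header might look like this (the thresholds mirror the bullets; none of this is an official validator):

```python
def lint_llms_txt(body: str, content_type: str = "text/markdown") -> list[str]:
    """Apply the debugging checks above to a fetched llms.txt; return problems found."""
    problems = []
    head = body.lstrip().lower()
    if head.startswith("<!doctype") or head.startswith("<html"):
        problems.append("served as HTML, not raw Markdown")
    if not body.startswith("# "):
        problems.append("first line is not an H1")
    if content_type.split(";", 1)[0].strip().lower() not in ("text/markdown", "text/plain"):
        problems.append(f"unexpected Content-Type: {content_type}")
    if len(body.encode("utf-8")) > 50_000:
        problems.append("over 50KB; split into llms.txt + llms-full.txt")
    return problems
```

Running this in CI against the deployed URL catches the "static host wrapped my file in a template" regression before an agent does.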

AgentGrade's scanner probes `/llms.txt` directly and checks that the file exists, starts with an H1, and has non-trivial content.

## Frequently asked questions

### Is llms.txt the same as robots.txt?

No. robots.txt is directives for *crawlers* (what URLs they're allowed to fetch). llms.txt is *context* for LLMs (what your service is and how to use it). A site can and should publish both.

### Do search engines read llms.txt?

Some do, informally. Google has not committed to anything official. The primary consumers are LLM-powered tools — ChatGPT browsing, Claude with web search, Perplexity, Cursor's @web — which retrieve llms.txt when summarizing or citing your service.

### Should I author llms.txt by hand or generate it?

Hand-author the top-level llms.txt — it's short (10-50 lines) and the editorial judgment matters. Generate llms-full.txt from your docs build pipeline — it's long and mechanical.

### What if my service has paid endpoints?

Describe the payment in prose ("Paid endpoints return HTTP 402 with an x402 payment challenge"). Don't try to encode payment metadata in llms.txt — that's what [x402](/kb/x402) and [OpenAPI](/kb/openapi) `x-payment-info` are for.

### Does llms.txt replace OpenAPI?

No. OpenAPI is a strict machine spec — code generators consume it directly. llms.txt is narrative — LLMs read it to understand intent. Ship both: OpenAPI for tooling, llms.txt for context.

### How is llms.txt different from a SKILL.md?

[SKILL.md](/kb/skills) is *instructional* — a playbook the agent follows to perform a task. llms.txt is *descriptive* — context about what the service is for. Skill files go inside agent runtimes (Claude Code, Cursor, etc.) and tell an agent *how to operate*. llms.txt sits on your site and tells *any* LLM *what you are*.

### Does the file have to be Markdown?

The spec says Markdown. In practice, plaintext works — most LLMs ignore Markdown syntax anyway. But Markdown is the recommended format because the headings give structure that agents can use to skip to the relevant section.

### Will publishing llms.txt help my SEO?

Indirectly. Traditional search engines don't rank pages on llms.txt presence. But AI Overviews, Perplexity, ChatGPT search, and Bing's generative answers increasingly cite sources — and a clear, well-structured llms.txt makes you a more citable source. That's GEO (generative engine optimization), the AEO-adjacent practice that's emerged alongside LLM-powered search.

## Spec maturity

**Community standard.** Defined at [llmstxt.org](https://llmstxt.org/) by Jeremy Howard, September 2024. No formal governance body, but widely adopted by major AI infrastructure providers (Anthropic, Cloudflare, Stripe, Vercel) and docs platforms (Mintlify, Docusaurus). The spec is short and stable — no breaking changes expected.

## Learn more

- [llmstxt.org](https://llmstxt.org/) — Specification
- [llms-full.txt](/kb/llms-full-txt) — The full-content companion
- [Anthropic llms.txt](https://docs.anthropic.com/llms.txt) — Reference implementation
- [Agent Readiness](/agent-readiness) — How llms.txt fits in the broader landscape


## Cross-references {#cross-references}

If your site also serves OpenAPI, MCP, SKILL.md, x402, A2A, or WebMCP, mention them inside llms.txt. The point of llms.txt is to be the *one document* an agent reads to learn what your service offers — if reading it doesn't reveal that you have a paid API or an MCP endpoint, the agent has to probe well-known paths on the chance they exist. Most won't.

A good cross-reference is a one-line bullet with the URL the agent can fetch:

- `POST /api/order` — Place an order (paid via x402 — see `/.well-known/x402.json`)
- MCP server: `/mcp`
- Full corpus for RAG: `/llms-full.txt`
- OpenAPI spec: `/openapi.json`
- SKILL.md (agent playbook): `/skill.md`
- A2A agent card: `/.well-known/agent.json`
- WebMCP manifest: `/.well-known/webmcp.json`

No special section is required — inline mentions count. AgentGrade's scan emits one optional sub-check per resource your site exposes: if the scanner detected your `/mcp` endpoint but `llms.txt` never says the word "mcp", that's a soft fail flagging the gap.

These are optional checks: they don't reduce your overall score, but they surface where your llms.txt is *less useful than it could be* for the agents reading it.
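One way such a gap check could work, sketched in Python — this is an illustration of the idea, not AgentGrade's actual scanner logic, and matching on the path's base name is a deliberate simplification:

```python
def missing_cross_refs(llms_txt: str, detected: list[str]) -> list[str]:
    """Return detected resources that llms.txt never mentions.

    `detected` holds paths a scanner found elsewhere on the site,
    e.g. '/mcp' or '/openapi.json'.
    """
    text = llms_txt.lower()
    missing = []
    for path in detected:
        # '/openapi.json' -> 'openapi', '/mcp' -> 'mcp'
        keyword = path.rstrip("/").split("/")[-1].split(".")[0].lower()
        if keyword and keyword not in text:
            missing.append(path)
    return missing

llms = "# Acme\n\nMCP server: /mcp\n"
print(missing_cross_refs(llms, ["/mcp", "/openapi.json"]))  # ['/openapi.json']
```

Each returned path is a resource your site exposes that an agent reading only llms.txt would never learn about.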


## What counts as meaningful content {#meaningful-content}

The llms.txt spec only requires an H1, but a file with just a title teaches an agent nothing. AgentGrade's *Meaningful content* check confirms the file has at least one of three structural signals that give an agent something to act on:

1. **A list of markdown links** — at least three bullets in `- [Title](url)` form. This is the docs-directory shape used by Anthropic, Stripe, Cloudflare, Vercel, Mintlify, and most platform llms.txt files. Each link points the agent at real content it can fetch and read.
2. **A fenced code block** — anything wrapped in triple-backticks. A `curl` example, a sample JSON response, a snippet showing how to authenticate. This is the API-surface shape: one file describes one service with a working example.
3. **A named operational section** — an H2 heading like `## Endpoints`, `## API`, `## Routes`, `## Authentication`, `## Examples`, `## Usage`, or `## Quick Start`. These names signal "this is operational content for agents" rather than marketing prose.

Any **one** of the three is enough to pass. Most well-formed llms.txt files hit at least two — for example, agentgrade.com's own file has link bullets in the Knowledge Base section *and* a `## Example` section *and* a `curl` fence.
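The three signals are easy to approximate with regexes. A sketch — the patterns and the heading list are illustrative, not AgentGrade's exact implementation:

```python
import re

OPERATIONAL_HEADINGS = {"endpoints", "api", "routes", "authentication",
                        "examples", "usage", "quick start"}

def meaningful_content(text: str) -> bool:
    """True if the file shows any one of the three structural signals above."""
    # Signal 1: at least three bullets in `- [Title](url)` form.
    link_bullets = re.findall(r"^\s*[-*]\s*\[[^\]]+\]\([^)]+\)", text, re.M)
    has_links = len(link_bullets) >= 3
    # Signal 2: any fenced code block.
    has_fence = "```" in text
    # Signal 3: a named operational H2 section.
    has_section = any(m.group(1).strip().lower() in OPERATIONAL_HEADINGS
                      for m in re.finditer(r"^##\s+(.+)$", text, re.M))
    return has_links or has_fence or has_section
```

A file that trips none of the three is giving agents structure-free prose, whatever its length.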

### What fails

A marketing footer dressed up as an llms.txt:

```markdown
# Acme

> We make great products.

## Product

- Home
- About
- Pricing

## Legal

- Terms
- Privacy
```

Fifteen lines, enough to pass a naive line-count check, but: no markdown links (the bullets are plain text labels), no code block, no operational section. An agent reading this learns Acme exists and has pages — but nothing it can navigate to. Fails the *Meaningful content* check.

### Why links matter

llms.txt is a routing file, not a content file (that's what [llms-full.txt](/kb/llms-full-txt) is for). Its job is to point agents to the URLs they should fetch next. Plain-text bullets like `- About` give the agent a label with no URL — it can't follow the lead. A markdown link like `- [About](/about) — company background` gives it a URL plus a one-line description so it can decide whether the page is worth fetching.

The simplest fix when this check fails: turn your plain-text bullets into markdown links, or add one fenced example showing an agent how to call your service.
