Content Negotiation for AI Agents
What is content negotiation for agents?
Content negotiation means serving different responses from the same URL based on what the client asks for. For AI agents, this means returning markdown or plain text — the format LLMs already parse well — when the request comes from an agent, while still serving HTML to browsers. Same URL, different representation, per RFC 9110 §12.5.
Why it matters
LLM-driven agents typically receive only the response body, not HTTP headers, status codes, or redirect chains. Serving HTML forces the agent to parse DOM structure, strip layout, and discard navigation — wasting tokens before the model sees anything useful. A clean text response gives the model the actual content directly.
The Accept preference order gotcha
The most common content-negotiation bug is treating the Accept header as a simple substring check. Take Claude Code's WebFetch tool — it sends:
Accept: text/markdown, text/html, */*
This is the client saying, in preference order: "I'd prefer markdown if you have it, otherwise HTML, otherwise anything." A naive check like if (accept.includes('text/html')) sees text/html in the string and serves HTML — ignoring that text/markdown was listed first.
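RFC 9110 §12.5.1 defines q-values as the mechanism for weighting media types; it does not itself assign meaning to the order of entries. In practice, clients list types in preference order, and common negotiator libraries break q-value ties by position in the header. A correct implementation parses the Accept list, applies q-values, and among the types the server can serve picks the one with the highest q-value, falling back to header order when q-values tie.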
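The selection rule described above can be sketched in plain JavaScript. This is a minimal illustration, not the code of any particular library: the names parseAccept, specificity, and pickType are hypothetical, and the tie-breaking order (q-value, then specificity of the matching range, then position in the header) is an assumption modeled on how common negotiators behave.

```javascript
// Parse an Accept header into entries of { type, q, order }.
function parseAccept(header) {
  return header
    .split(',')
    .map((part, order) => {
      const [type, ...params] = part.trim().split(';').map((s) => s.trim());
      let q = 1; // q defaults to 1 when omitted
      for (const p of params) {
        const [key, value] = p.split('=').map((s) => s.trim());
        if (key.toLowerCase() === 'q') {
          const parsed = Number.parseFloat(value);
          if (!Number.isNaN(parsed)) q = Math.min(Math.max(parsed, 0), 1);
        }
      }
      return { type: type.toLowerCase(), q, order };
    })
    .filter((entry) => entry.type !== '');
}

// How specifically a media range matches a concrete type:
// 3 = exact, 2 = "text/*", 1 = "*/*", 0 = no match.
function specificity(range, type) {
  if (range === type) return 3;
  const [rangeMain, rangeSub] = range.split('/');
  const [typeMain] = type.split('/');
  if (rangeSub === '*' && rangeMain === typeMain) return 2;
  if (range === '*/*') return 1;
  return 0;
}

// Pick the best type from `available` (lowercase media types) or return null.
function pickType(acceptHeader, available) {
  const ranges = parseAccept(acceptHeader || '*/*');
  let best = null;
  available.forEach((type) => {
    // The most specific matching range decides this type's q-value.
    let match = null;
    for (const range of ranges) {
      const s = specificity(range.type, type);
      if (s > 0 && (match === null || s > match.s)) {
        match = { s, q: range.q, order: range.order };
      }
    }
    if (match === null || match.q === 0) return; // not acceptable
    const candidate = { type, ...match };
    if (
      best === null ||
      candidate.q > best.q ||
      (candidate.q === best.q && candidate.s > best.s) ||
      (candidate.q === best.q && candidate.s === best.s && candidate.order < best.order)
    ) {
      best = candidate; // full ties keep the earlier entry in `available`
    }
  });
  return best ? best.type : null;
}

const offered = ['text/html', 'text/markdown', 'text/plain'];
console.log(pickType('text/markdown, text/html, */*', offered)); // text/markdown
console.log(pickType('text/html, text/markdown;q=0.5', offered)); // text/html
console.log(pickType('*/*', offered)); // text/html
```

Listing text/html first in the offered array keeps HTML as the default for clients that send only */*, while an explicit text/markdown entry in the header still wins.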
What AgentGrade checks
Agent UA gets non-HTML — We send User-Agent: claude-code/1.0.0 with Accept: text/markdown, text/html, */* to your homepage. The check passes if you serve text/markdown, text/plain, or application/json with body ≥20 bytes. Sites that substring-match Accept and serve HTML fail this.
Accept: JSON returns JSON — We send Accept: application/json and check for valid JSON.
Accept: text returns text — We send Accept: text/plain and check for plain text or markdown.
Accept: markdown returns markdown — We send Accept: text/markdown and check for markdown or plain text.
Vary: Accept set — When you negotiate, the response must include Vary: Accept so shared caches key entries correctly.
How to implement it correctly
Use a proper Accept negotiator instead of substring matching:
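// Express: req.accepts uses the negotiator package under the hood
const path = require('path');

app.get('/', async (req, res) => {
  // Tell caches that the body depends on the Accept header
  res.vary('Accept');
  // Array order only breaks ties the Accept header leaves open (e.g. a bare */*)
  const best = req.accepts(['text/html', 'text/markdown', 'text/plain', 'application/json']);
  if (best === 'text/markdown' || best === 'text/plain') {
    return res.type(best).send(await buildLlmsTxt());
  }
  if (best === 'application/json') {
    return res.json({ name: 'Your Service', api: '/openapi.json' });
  }
  // res.sendFile requires an absolute path (or a root option)
  res.sendFile(path.join(__dirname, 'index.html'));
});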
Other ecosystems:
- Node.js (no framework): negotiator npm package
- Python: werkzeug.wrappers.AcceptMixin or request.accept_mimetypes.best_match
- Go: github.com/markusthoemmes/goautoneg
- Ruby on Rails: respond_to do |format| blocks handle response generation, but Rails treats */* in Accept as a license to serve HTML: Accept: text/markdown, */* returns HTML even though markdown is preferred. Fix by setting request.format explicitly in a before_action based on the first non-wildcard Accept type, before respond_to runs. Reordering format.X blocks alone won't override the */* fallback.
- Cloudflare Workers: parse request.headers.get('Accept') manually or use the accept npm package
Inline vs redirect — pick inline
There are two ways to serve agent-friendly content. Inline is better:
Inline (recommended): Same URL serves different bodies based on Accept.
GET / → 200 OK
Content-Type: text/html (browser) | text/markdown (agent)
Redirect (legacy): Send agents to /llms.txt.
GET / → 302 Found, Location: /llms.txt
GET /llms.txt → 200 OK
Inline wins because: (1) one fetch instead of two — half the latency; (2) the URL the agent reports to the user is the URL they were asked about, not a redirect target; (3) caching is cleaner with Vary: Accept. The /llms.txt route still exists for tools that fetch it directly — both routes call the same content function so there's one source of truth.
Vary: Accept is load-bearing
Whenever the same URL returns different bodies based on Accept, set Vary: Accept. This tells caches (CDNs, shared proxies, and the browser's own cache) that the cache key must include the Accept header value.
Without it, a CDN could cache the markdown response from one agent fetch and serve it to a browser visit — or the reverse. The Vary header is the only thing that keeps cache entries from being interchangeable when the bodies are not.
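To make the collision concrete, here is a toy cache-key function. It is a simplified sketch, not how any specific CDN works: cacheKey is a hypothetical helper, request headers are assumed to be a plain object with lowercase names, and real caches also normalize header values and store the Vary list alongside the response.

```javascript
// Build a cache key from the URL plus the request headers named in Vary.
function cacheKey(url, requestHeaders, varyHeader) {
  const names = (varyHeader || '')
    .split(',')
    .map((name) => name.trim().toLowerCase())
    .filter(Boolean)
    .sort();
  const parts = names.map((name) => `${name}=${requestHeaders[name] ?? ''}`);
  return [url, ...parts].join('|');
}

const browser = { accept: 'text/html,application/xhtml+xml' };
const agent = { accept: 'text/markdown, text/html, */*' };

// Without Vary, both requests map to the same cache entry.
console.log(cacheKey('/', browser, '') === cacheKey('/', agent, '')); // true

// With Vary: Accept, the two representations are stored separately.
console.log(cacheKey('/', browser, 'Accept') === cacheKey('/', agent, 'Accept')); // false
```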
Known AI agent User-Agents
| Agent | User-Agent | Purpose |
|---|---|---|
| ClaudeBot | Mozilla/5.0 (compatible; ClaudeBot/1.0; +claudebot@anthropic.com) | Anthropic training crawler |
| Claude-User | Mozilla/5.0 ... (compatible; Claude-User/1.0; +Claude-User@anthropic.com) | claude.ai web_fetch, Claude API web_search page reads |
| Claude-SearchBot | (string not published) | Anthropic search index crawler |
| claude-code | claude-code/<version> | Claude Code CLI WebFetch tool |
| ChatGPT-User | Mozilla/5.0 ... (compatible; ChatGPT-User/1.0; +https://openai.com/bot) | User-initiated ChatGPT browse |
| OAI-SearchBot | Mozilla/5.0 ... (compatible; OAI-SearchBot/1.3; +https://openai.com/searchbot) | OpenAI search index |
| OAI-AdsBot | Mozilla/5.0 ... (compatible; OAI-AdsBot/1.0; +https://openai.com/adsbot) | OpenAI ads crawler |
| GPTBot | Mozilla/5.0 ... (compatible; GPTBot/1.3; +https://openai.com/gptbot) | OpenAI training crawler |
| PerplexityBot | Mozilla/5.0 ... (compatible; PerplexityBot/1.0; +https://perplexity.ai/perplexitybot) | Perplexity search-results crawler |
| Perplexity-User | Mozilla/5.0 ... (compatible; Perplexity-User/1.0; +https://perplexity.ai/perplexity-user) | User-initiated Perplexity fetches |
| Google-Extended | (uses Googlebot UA) | Google Gemini training, controlled via robots.txt |
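If you want a User-Agent check as a secondary signal alongside Accept, a token match over the table above is usually enough. The sketch below is illustrative only: detectAgent and AGENT_TOKENS are hypothetical names, the match is a case-insensitive substring test, and Claude-SearchBot and Google-Extended are left out because the table gives no distinct User-Agent string for them.

```javascript
// Product tokens taken from the User-Agent column of the table above.
const AGENT_TOKENS = [
  'ClaudeBot',
  'Claude-User',
  'claude-code',
  'ChatGPT-User',
  'OAI-SearchBot',
  'OAI-AdsBot',
  'GPTBot',
  'PerplexityBot',
  'Perplexity-User',
];

// Return the first matching token, or null when the UA looks like something else.
function detectAgent(userAgent) {
  const ua = (userAgent || '').toLowerCase();
  return AGENT_TOKENS.find((token) => ua.includes(token.toLowerCase())) ?? null;
}

console.log(detectAgent('claude-code/1.0.0')); // claude-code
console.log(detectAgent('Mozilla/5.0 (compatible; GPTBot/1.3; +https://openai.com/gptbot)')); // GPTBot
console.log(detectAgent('Mozilla/5.0 (Windows NT 10.0; Win64; x64)')); // null
```

User-Agent strings are trivially spoofed, so treat a match as a hint for choosing a default representation, not as authentication.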
Web Bot Auth — the next signal
A growing list of agents (ChatGPT Agent confirmed today; Anthropic, Perplexity, Google expected) cryptographically sign their requests per RFC 9421 HTTP Message Signatures. The signal is the Signature-Agent request header:
Signature-Agent: "https://chatgpt.com"
Signature-Input: sig=("@authority" "signature-agent"); keyid="..."; tag="web-bot-auth"
Signature: sig=...
If you see Signature-Agent on an incoming request, treat the client as a known agent even if the UA looks like a browser. For full verification, fetch the JWKS at the host named in Signature-Agent (/.well-known/http-message-signatures-directory) and verify the signature with the web-bot-auth npm package. For content-negotiation purposes, presence of the header alone is a sufficient soft signal.
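As a small illustration of the first verification step, the function below turns a Signature-Agent value into the key-directory URL described above. It is a sketch under assumptions: signatureDirectoryUrl is a hypothetical helper, it requires an https origin, and it stops before any signature verification, which should be left to a dedicated library.

```javascript
// Derive the key directory URL from a Signature-Agent header value.
// Returns null when the value is not a usable https URL.
function signatureDirectoryUrl(signatureAgentValue) {
  // The header value is a quoted string, e.g. "https://chatgpt.com"
  const raw = String(signatureAgentValue || '').trim().replace(/^"|"$/g, '');
  try {
    const agentUrl = new URL(raw);
    if (agentUrl.protocol !== 'https:') return null; // assumption: https only
    return new URL('/.well-known/http-message-signatures-directory', agentUrl.origin).toString();
  } catch {
    return null; // not a parseable URL
  }
}

console.log(signatureDirectoryUrl('"https://chatgpt.com"'));
// https://chatgpt.com/.well-known/http-message-signatures-directory
```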
Learn more
- RFC 9110 §12.5 — Content Negotiation
- llms.txt specification
- Anthropic crawler docs
- OpenAI bot docs
- Perplexity crawler docs
- Cloudflare Web Bot Auth
Preferred type vs non-HTML — the next bar
"Agent UA gets non-HTML" is a basic check: did the server serve anything other than HTML? "Returns preferred Content-Type" is the strict version: did the response Content-Type match the leading type the client signaled?
For example, when the client sends Accept: text/markdown, text/html, */*:
- Server returns Content-Type: text/markdown → passes both checks
- Server returns Content-Type: text/plain → passes the basic check, fails the strict one
- Server returns Content-Type: text/html → fails both
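The three outcomes above can be reproduced with a few lines of JavaScript. This is only a sketch of the distinction between the two checks, not the scanner's actual code: gradeResponse is a hypothetical name, and treating the first-listed Accept entry as the leading type is an assumption that holds only when the header carries no q-values.

```javascript
const NON_HTML = ['text/markdown', 'text/plain', 'application/json'];

// Strip parameters such as "; charset=utf-8" and normalize case.
const bareType = (value) => (value || '').split(';')[0].trim().toLowerCase();

// basic: anything non-HTML was served; strict: the leading Accept type was served.
function gradeResponse(acceptHeader, contentTypeHeader) {
  const served = bareType(contentTypeHeader);
  const leading = bareType(acceptHeader.split(',')[0]);
  return { basic: NON_HTML.includes(served), strict: served === leading };
}

const accept = 'text/markdown, text/html, */*';
console.log(gradeResponse(accept, 'text/markdown; charset=utf-8')); // { basic: true, strict: true }
console.log(gradeResponse(accept, 'text/plain')); // { basic: true, strict: false }
console.log(gradeResponse(accept, 'text/html')); // { basic: false, strict: false }
```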
The scanner runs four probes: markdown leading (the Claude Code / Cursor pattern), HTML leading with markdown listed second (the browser / ChatGPT Agent pattern — catches sites that ignore client order and use server-side preference instead), explicit q-values favoring HTML over markdown (catches sites that ignore q-values entirely), and JSON leading (programmatic discovery pattern). All four must pass. Today the Content-Type label is mostly decorative for LLM-based agents — they parse the body bytes regardless of MIME type. But browser-based AI extensions and emerging MCP tools branch on Content-Type, and the gap will widen as the ecosystem matures.
The fix is a one-line change in your handler: set the response Content-Type from the negotiated type, not a hardcoded value. If your code already returns text/plain for both Accept: text/plain and Accept: text/markdown, branch on the negotiated type and label accordingly.
This check is required — failing it costs points in the Content Negotiation group.
Diagnosing your bug — q-values and the three patterns
How q-values work
When a client sends multiple types in Accept, it can attach q-values (quality factors) between 0.0 and 1.0 to express relative preference:
Accept: text/markdown;q=1.0, text/html;q=0.5, */*;q=0.1
Meaning: "I really want markdown. I'll take HTML as a backup. Anything else is a last resort."
When no q-value is given, it defaults to 1.0. So Accept: text/markdown, text/html, */* gives all three entries the same weight, and negotiators conventionally let the order in the header break the tie. A correct server picks markdown.
A proper Accept negotiator (Express req.accepts(), the negotiator npm package, Python werkzeug, Go goautoneg) handles all this automatically: parse q-values, honor order on ties, pick the best match the server can serve.
The three bug patterns we see in the wild
If your site fails the "Agent UA gets non-HTML" check, the cause is almost always one of these:
Pattern 1: Substring matching. Code that checks if the Accept header contains a type, in a fixed if-else order. Example:
// WRONG — order of checks, not order in Accept, wins
if (accept.includes('text/html')) return html;
else if (accept.includes('text/markdown')) return markdown;
Client sends Accept: text/markdown, text/html → server returns HTML because text/html is in the string. Preference order from the client is ignored entirely.
Pattern 2: Framework default that serves HTML on */*. Some frameworks treat */* in Accept as a license to fall back to HTML, even when explicit non-HTML types are listed earlier. Rails 8's respond_to is a notable example:
Accept: text/markdown, */* → Rails returns HTML (markdown ignored)
Accept: text/markdown, text/html → Rails returns markdown (no */*, honors order)
Pattern 3: Server-internal preference order + q-values ignored. Server has its own priority list (often hardcoded somewhere) and picks whichever type from the Accept header is highest on the server's list — not on the client's list. q-values aren't parsed at all:
Accept: text/plain;q=0.9, text/html;q=0.5 → returns HTML (server prefers HTML despite q-values explicitly favoring plain)
The smoking gun for pattern 3 is q-values being ignored. If the same site returns HTML for the row above and markdown for Accept: text/plain, text/markdown (markdown won despite plain being listed first), it's pattern 3.
A quick diagnostic test
Run these five curl commands against your homepage. The pattern in the responses tells you which bug you have:
curl -sI -H "Accept: text/markdown" YOUR_SITE/
curl -sI -H "Accept: text/markdown, text/html, */*" YOUR_SITE/
curl -sI -H "Accept: text/plain, text/markdown" YOUR_SITE/
curl -sI -H "Accept: text/markdown, */*" YOUR_SITE/
curl -sI -H "Accept: text/plain;q=0.9, text/html;q=0.5" YOUR_SITE/
- If only the first returns markdown: pattern 1 (substring matching).
- If the first three return markdown but the fourth returns HTML: pattern 2 (*/* fallback).
- If the fifth returns HTML and the third returns markdown (or vice versa with whatever you think you serve): pattern 3 (server preference + q-values ignored).
The fix recipe is the same in all three cases: replace whatever ad-hoc selection logic you have with a proper Accept negotiator from the list above.
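The same five probes can be scripted if you prefer to collect the results in one place. The sketch below assumes a runtime with a global fetch (Node 18 or later, for example); PROBES and runProbes are hypothetical names, and the fetchImpl parameter exists only so the function can be exercised with a stub.

```javascript
// The five Accept headers from the curl commands above.
const PROBES = [
  'text/markdown',
  'text/markdown, text/html, */*',
  'text/plain, text/markdown',
  'text/markdown, */*',
  'text/plain;q=0.9, text/html;q=0.5',
];

// Send one HEAD request per probe (like curl -sI) and record the Content-Type.
async function runProbes(siteUrl, fetchImpl = fetch) {
  const results = [];
  for (const accept of PROBES) {
    const response = await fetchImpl(siteUrl, {
      method: 'HEAD',
      headers: { Accept: accept },
    });
    results.push({ accept, contentType: response.headers.get('content-type') });
  }
  return results;
}

// Usage (replace the placeholder URL with your own site):
// runProbes('https://example.com/').then((rows) => console.table(rows));
```

Some servers answer HEAD differently from GET, so if the results look odd, repeat the probes with GET before drawing conclusions.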
Inline vs 302 redirect — what to do
Two patterns for serving agent-friendly content at your homepage:
Inline — same URL returns different bodies based on Accept header.
GET / + Accept: text/html → 200 + HTML
GET / + Accept: text/markdown → 200 + markdown
Redirect — server sends agent requests to a separate canonical URL.
GET / + Accept: text/markdown → 302, Location: /llms.txt
GET /llms.txt → 200 + markdown
Use inline. RFC 9110 §12.2 spells out the costs of reactive negotiation, where the client needs a follow-up request to reach the representation it wants: it "suffers from transmitting a list of alternatives... and needing a second request to obtain an alternate representation" and "does not define a mechanism for supporting automatic selection." A redirect to /llms.txt pays the same second-request cost.
Every major content-negotiation-aware site we tested uses inline:
- GitHub API: same URL varies on Accept (application/vnd.github+json vs application/vnd.github.html+json), no redirect
- Stripe docs: docs.stripe.com/api returns HTML or markdown from the same URL with Vary: Accept
- Cloudflare developer docs: edge converts inline, same URL
- Vercel, Mintlify, Sanity: all recommend inline in their public guidance for agent-friendly pages
Why inline wins concretely
- Half the latency. One HTTP fetch instead of two. HTTP/2 and HTTP/3 multiplexing do not eliminate the redirect cost: the client still has to receive the 302, parse Location, and issue a new request.
- URL fidelity. The URL the agent reports to the user is the URL the user actually asked about. With 302, the agent ends up at /llms.txt, a different URL than the homepage.
- Cleaner caching. Inline with Vary: Accept lets caches store both representations under one URL key. With 302, caches have to handle two URLs and keep their coherence.
- No magic agent-only URL. Inline keeps the URL space unified: humans and agents hit the same URL, and the server picks what to serve based on Accept.
What AgentGrade checks
The Inline content negotiation check sends an agent-shaped request (claude-code/1.0.0 UA with Accept: text/markdown, text/html, */*) and verifies the response does not end up at a different URL than a browser request would. Specifically: if the agent fetch was redirected to a path that the browser fetch was not, the check fails.
Universal redirects that affect everyone (HTTPS upgrade, trailing-slash normalization) are not penalized — only agent-specific redirects to a separate URL.
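The comparison described above reduces to checking where each fetch ended up. The following is a rough sketch of that logic, not the scanner's implementation: passesInlineCheck is a hypothetical name, and the inputs are assumed to be the final URLs after redirects (with fetch, that is the url property of the response).

```javascript
// Pass when the agent-shaped fetch lands on the same origin and path as the
// browser-shaped fetch. Redirects that both clients follow cancel out.
function passesInlineCheck(browserFinalUrl, agentFinalUrl) {
  const browser = new URL(browserFinalUrl);
  const agent = new URL(agentFinalUrl);
  return browser.origin === agent.origin && browser.pathname === agent.pathname;
}

// Both clients were upgraded to https and stayed on "/": pass.
console.log(passesInlineCheck('https://example.com/', 'https://example.com/')); // true

// Only the agent was sent to /llms.txt: fail.
console.log(passesInlineCheck('https://example.com/', 'https://example.com/llms.txt')); // false
```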
How to fix
Replace your 302 logic with inline negotiation. Express example:
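const path = require('path');

app.get('/', async (req, res) => {
  res.vary('Accept');
  const best = req.accepts(['text/html', 'text/markdown', 'text/plain']);
  if (best === 'text/markdown' || best === 'text/plain') {
    // Label the response with the type that was actually negotiated
    return res.type(best).send(await buildLlmsTxt());
  }
  // Otherwise honor the browser preference; res.sendFile needs an absolute path
  res.sendFile(path.join(__dirname, 'index.html'));
});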
The /llms.txt route can still exist as a separate URL — both routes call the same content function. Sites that fetch /llms.txt directly still work; sites that hit / with an agent Accept also work, in one request.
This check is emerging (optional) today — it does not yet penalize sites that use 302. It will graduate to required once industry adoption of inline is broad enough that the few remaining 302-based sites are clearly the outliers.