AgentGrade

Content Negotiation for AI Agents

What is content negotiation for agents?

Content negotiation means serving different responses from the same URL based on what the client asks for. For AI agents, this means returning markdown or plain text — the format LLMs already parse well — when the request comes from an agent, while still serving HTML to browsers. Same URL, different representation, per RFC 9110 §12.5.

Why it matters

LLM-driven agents typically receive only the response body, not HTTP headers, status codes, or redirect chains. Serving HTML forces the agent to parse DOM structure, strip layout, and discard navigation — wasting tokens before the model sees anything useful. A clean text response gives the model the actual content directly.

The Accept preference order gotcha

The most common content-negotiation bug is treating the Accept header as a simple substring check. Take Claude Code's WebFetch tool — it sends:

Accept: text/markdown, text/html, */*

This is the client saying, in preference order: "I'd prefer markdown if you have it, otherwise HTML, otherwise anything." A naive check like if (accept.includes('text/html')) sees text/html in the string and serves HTML — ignoring that text/markdown was listed first.

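RFC 9110 §12.5.1 defines q-values as the way a client weights media types, and a missing q-value defaults to 1. The RFC does not rank types of equal weight, so common negotiators break such ties by the order in which the client listed the types. A correct implementation parses the Accept list, applies q-values, and among equally weighted types picks the leftmost one the server can serve.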
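To make those steps concrete, here is a minimal sketch of a negotiator in plain JavaScript. The names parseAccept, specificity, and pickType are hypothetical, and the tie-breaking rule (header order among equal q-values) is the convention described above rather than something the RFC mandates. In production an established negotiator is the safer choice; this only illustrates the logic.

```javascript
// Parse "text/markdown;q=0.8, text/html" into [{ range, q, order }, ...].
function parseAccept(header) {
  return header
    .split(',')
    .map((part) => part.trim())
    .filter((part) => part !== '')
    .map((part, order) => {
      const [range, ...params] = part.split(';').map((s) => s.trim());
      let q = 1; // a missing q-value defaults to 1
      for (const param of params) {
        const [name, value] = param.split('=').map((s) => s.trim());
        const parsed = Number.parseFloat(value);
        if (name.toLowerCase() === 'q' && !Number.isNaN(parsed)) q = parsed;
      }
      return { range: range.toLowerCase(), q, order };
    });
}

// How specifically a media range matches a concrete type:
// 2 = exact, 1 = "type/*", 0 = "*/*", -1 = no match.
function specificity(range, type) {
  if (range === type) return 2;
  const [rangeMain, rangeSub] = range.split('/');
  if (rangeSub === '*' && rangeMain === type.split('/')[0]) return 1;
  if (range === '*/*') return 0;
  return -1;
}

// Pick the type from `available` that the client prefers, or null if none match.
function pickType(acceptHeader, available) {
  const ranges = parseAccept(acceptHeader || '*/*');
  let best = null;
  for (const type of available) {
    // The most specific matching range decides this type's q-value.
    let match = null;
    for (const { range, q, order } of ranges) {
      const s = specificity(range, type);
      if (s >= 0 && (match === null || s > match.s)) match = { s, q, order };
    }
    if (match === null || match.q <= 0) continue; // not acceptable
    const wins =
      best === null ||
      match.q > best.q ||
      (match.q === best.q && match.s > best.s) ||
      (match.q === best.q && match.s === best.s && match.order < best.order);
    if (wins) best = { type, ...match };
  }
  return best ? best.type : null;
}

const offered = ['text/html', 'text/markdown', 'text/plain'];
console.log(pickType('text/markdown, text/html, */*', offered)); // text/markdown
```

The substring check from the previous section would have answered HTML for the same header, because it never looks at position or weight.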

What AgentGrade checks

Agent UA gets non-HTML — We send User-Agent: claude-code/1.0.0 with Accept: text/markdown, text/html, */* to your homepage. The check passes if you serve text/markdown, text/plain, or application/json with body ≥20 bytes. Sites that substring-match Accept and serve HTML fail this.

Accept: JSON returns JSON — We send Accept: application/json and check for valid JSON.

Accept: text returns text — We send Accept: text/plain and check for plain text or markdown.

Accept: markdown returns markdown — We send Accept: text/markdown and check for markdown or plain text.

Vary: Accept set — When you negotiate, the response must include Vary: Accept so shared caches key entries correctly.
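The pass conditions above can be expressed as two small predicates. This is only a sketch of the stated criteria, not AgentGrade's actual implementation; the function names passesNonHtmlCheck and hasVaryAccept are hypothetical.

```javascript
const NON_HTML_TYPES = ['text/markdown', 'text/plain', 'application/json'];
const MIN_BODY_BYTES = 20;

// True when the response is one of the non-HTML types and the body is at least 20 bytes.
function passesNonHtmlCheck(contentTypeHeader, body) {
  // Drop parameters such as "; charset=utf-8" before comparing.
  const mediaType = (contentTypeHeader || '').split(';')[0].trim().toLowerCase();
  const bodyBytes = new TextEncoder().encode(body || '').length;
  return NON_HTML_TYPES.includes(mediaType) && bodyBytes >= MIN_BODY_BYTES;
}

// True when the Vary response header lists Accept (case-insensitive).
function hasVaryAccept(varyHeader) {
  return (varyHeader || '')
    .split(',')
    .map((name) => name.trim().toLowerCase())
    .includes('accept');
}

console.log(passesNonHtmlCheck('text/markdown; charset=utf-8', '# Your Service\n\nDocs live at /docs.'));
console.log(hasVaryAccept('Accept-Encoding, Accept'));
```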

How to implement it correctly

Use a proper Accept negotiator instead of substring matching:

// Express: req.accepts uses the negotiator package under the hood
const path = require('path');

app.get('/', async (req, res) => {
  res.vary('Accept');
  // req.accepts returns the client's best match, or false when nothing matches
  const best = req.accepts(['text/html', 'text/markdown', 'text/plain', 'application/json']);
  if (best === 'text/markdown' || best === 'text/plain') {
    return res.type(best).send(await buildLlmsTxt());
  }
  if (best === 'application/json') {
    return res.json({ name: 'Your Service', api: '/openapi.json' });
  }
  // Browsers and unmatched requests get HTML; sendFile needs an absolute path
  res.sendFile(path.join(__dirname, 'index.html'));
});

Other ecosystems have equivalent negotiators, for example werkzeug in Python and goautoneg in Go.

Inline vs redirect — pick inline

There are two ways to serve agent-friendly content. Inline is better:

Inline (recommended): Same URL serves different bodies based on Accept.

GET / → 200 OK
  Content-Type: text/html (browser) | text/markdown (agent)

Redirect (legacy): Send agents to /llms.txt.

GET / → 302 Found, Location: /llms.txt
GET /llms.txt → 200 OK

Inline wins because: (1) one fetch instead of two — half the latency; (2) the URL the agent reports to the user is the URL they were asked about, not a redirect target; (3) caching is cleaner with Vary: Accept. The /llms.txt route still exists for tools that fetch it directly — both routes call the same content function so there's one source of truth.

Vary: Accept is load-bearing

Whenever the same URL returns different bodies based on Accept, set Vary: Accept. This tells caches (CDNs, proxies, and the browser's own cache) that the cache key must include the Accept header value.

Without it, a CDN could cache the markdown response from one agent fetch and serve it to a browser visit — or the reverse. The Vary header is the only thing that keeps cache entries from being interchangeable when the bodies are not.
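A toy cache-key function shows the effect. This is a simplified model, assuming request headers are available as an object with lowercase names; real caches implement Vary with more nuance, and cacheKey is a hypothetical helper.

```javascript
// Build a cache key from the URL plus the request headers named in Vary.
function cacheKey(url, requestHeaders, varyHeader) {
  const varied = (varyHeader || '')
    .split(',')
    .map((name) => name.trim().toLowerCase())
    .filter(Boolean)
    .sort();
  const parts = varied.map((name) => `${name}=${requestHeaders[name] ?? ''}`);
  return [url, ...parts].join('|');
}

const agentRequest = { accept: 'text/markdown, text/html, */*' };
const browserRequest = { accept: 'text/html,application/xhtml+xml' };

// Without Vary, both requests map to the same entry, so bodies can be mixed up.
console.log(cacheKey('/', agentRequest, '') === cacheKey('/', browserRequest, ''));
// With Vary: Accept, each Accept value gets its own entry.
console.log(cacheKey('/', agentRequest, 'Accept') === cacheKey('/', browserRequest, 'Accept'));
```

The first comparison is true and the second is false, which is the difference between a shared entry and two separate ones.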

Known AI agent User-Agents

| Agent | User-Agent | Purpose |
| --- | --- | --- |
| ClaudeBot | Mozilla/5.0 (compatible; ClaudeBot/1.0; +claudebot@anthropic.com) | Anthropic training crawler |
| Claude-User | Mozilla/5.0 ... (compatible; Claude-User/1.0; +Claude-User@anthropic.com) | claude.ai web_fetch, Claude API web_search page reads |
| Claude-SearchBot | (string not published) | Anthropic search index crawler |
| claude-code | claude-code/<version> | Claude Code CLI WebFetch tool |
| ChatGPT-User | Mozilla/5.0 ... (compatible; ChatGPT-User/1.0; +https://openai.com/bot) | User-initiated ChatGPT browse |
| OAI-SearchBot | Mozilla/5.0 ... (compatible; OAI-SearchBot/1.3; +https://openai.com/searchbot) | OpenAI search index |
| OAI-AdsBot | Mozilla/5.0 ... (compatible; OAI-AdsBot/1.0; +https://openai.com/adsbot) | OpenAI ads crawler |
| GPTBot | Mozilla/5.0 ... (compatible; GPTBot/1.3; +https://openai.com/gptbot) | OpenAI training crawler |
| PerplexityBot | Mozilla/5.0 ... (compatible; PerplexityBot/1.0; +https://perplexity.ai/perplexitybot) | Perplexity search-results crawler |
| Perplexity-User | Mozilla/5.0 ... (compatible; Perplexity-User/1.0; +https://perplexity.ai/perplexity-user) | User-initiated Perplexity fetches |
| Google-Extended | (uses Googlebot UA) | Google Gemini training, controlled via robots.txt |

Web Bot Auth — the next signal

A growing list of agents (ChatGPT Agent confirmed today; Anthropic, Perplexity, Google expected) cryptographically sign their requests per RFC 9421 HTTP Message Signatures. The signal is the Signature-Agent request header:

Signature-Agent: "https://chatgpt.com"
Signature-Input: sig=("@authority" "signature-agent"); keyid="..."; tag="web-bot-auth"
Signature: sig=...

If you see Signature-Agent on an incoming request, treat the client as a known agent even if the UA looks like a browser. For full verification, fetch the JWKS at the host named in Signature-Agent (/.well-known/http-message-signatures-directory) and verify the signature with the web-bot-auth npm package. For content-negotiation purposes, presence of the header alone is a sufficient soft signal.
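As a rough sketch of the soft signal, the function below treats a request as agent traffic when it carries Signature-Agent or a User-Agent token from the table above. It assumes headers arrive as an object with lowercase names (as Node's req.headers does). It does not verify any signature, and looksLikeAgent is a hypothetical name. Claude-SearchBot and Google-Extended are left out because their User-Agent strings are not distinct or not published.

```javascript
// User-Agent tokens taken from the table of known agents.
const AGENT_UA_TOKENS = [
  'ClaudeBot',
  'Claude-User',
  'claude-code',
  'ChatGPT-User',
  'OAI-SearchBot',
  'OAI-AdsBot',
  'GPTBot',
  'PerplexityBot',
  'Perplexity-User',
];

// Soft signal only: header presence or a UA token, with no cryptographic verification.
function looksLikeAgent(headers) {
  if (headers['signature-agent']) return true;
  const userAgent = (headers['user-agent'] || '').toLowerCase();
  return AGENT_UA_TOKENS.some((token) => userAgent.includes(token.toLowerCase()));
}

console.log(looksLikeAgent({ 'user-agent': 'claude-code/1.0.0' }));
console.log(looksLikeAgent({
  'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) Chrome/126.0 Safari/537.36',
  'signature-agent': '"https://chatgpt.com"',
}));
```

Because both signals are trivially spoofable, this is suitable for choosing a representation, not for access control.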

Learn more

Preferred type vs non-HTML — the next bar

"Agent UA gets non-HTML" is a basic check: did the server serve anything other than HTML? "Returns preferred Content-Type" is the strict version: did the response Content-Type match the leading type the client signaled?

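For example, when the client sends Accept: text/markdown, text/html, */*, a text/markdown response passes both checks, a text/plain response passes the basic check but fails the strict one, and a text/html response fails both.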

The scanner runs four probes, and all four must pass:

  1. Markdown leading (the Claude Code / Cursor pattern).
  2. HTML leading with markdown listed second (the browser / ChatGPT Agent pattern). This catches sites that ignore client order and use server-side preference instead.
  3. Explicit q-values favoring HTML over markdown. This catches sites that ignore q-values entirely.
  4. JSON leading (the programmatic discovery pattern).

Today the Content-Type label is mostly decorative for LLM-based agents, which parse the body bytes regardless of MIME type. But browser-based AI extensions and emerging MCP tools branch on Content-Type, and the gap will widen as the ecosystem matures.

The fix is a one-line change in your handler: set the response Content-Type from the negotiated type, not a hardcoded value. If your code already returns text/plain for both Accept: text/plain and Accept: text/markdown, branch on the negotiated type and label accordingly.
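The difference is small enough to show side by side. In this sketch, negotiatedType stands for whatever your negotiator returned, and both function names are hypothetical; the response is modeled as a plain object rather than a real framework response.

```javascript
// Before: every text representation is labeled text/plain, whatever was negotiated.
function respondHardcoded(negotiatedType, body) {
  return {
    status: 200,
    headers: { 'Content-Type': 'text/plain; charset=utf-8', Vary: 'Accept' },
    body,
  };
}

// After: the label follows the negotiated type.
function respondNegotiated(negotiatedType, body) {
  return {
    status: 200,
    headers: { 'Content-Type': `${negotiatedType}; charset=utf-8`, Vary: 'Accept' },
    body,
  };
}

console.log(respondHardcoded('text/markdown', '# Your Service').headers['Content-Type']);
console.log(respondNegotiated('text/markdown', '# Your Service').headers['Content-Type']);
```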

This check is required — failing it costs points in the Content Negotiation group.

Diagnosing your bug — q-values and the three patterns

How q-values work

When a client sends multiple types in Accept, it can attach q-values (quality factors) between 0.0 and 1.0 to express relative preference:

Accept: text/markdown;q=1.0, text/html;q=0.5, */*;q=0.1

Meaning: "I really want markdown. I'll take HTML as a backup. Anything else is a last resort."

When no q-value is given, it defaults to 1.0. So Accept: text/markdown, text/html, */* means all three are equally preferred — and the order in the header breaks the tie. A correct server picks markdown.

A proper Accept negotiator (Express req.accepts(), the negotiator npm package, Python werkzeug, Go goautoneg) handles all this automatically: parse q-values, honor order on ties, pick the best match the server can serve.

The three bug patterns we see in the wild

If your site fails the "Agent UA gets non-HTML" check, the cause is almost always one of these:

Pattern 1: Substring matching. Code that checks if the Accept header contains a type, in a fixed if-else order. Example:

// WRONG — order of checks, not order in Accept, wins
if (accept.includes('text/html')) return html;
else if (accept.includes('text/markdown')) return markdown;

Client sends Accept: text/markdown, text/html → server returns HTML because text/html is in the string. Preference order from the client is ignored entirely.

Pattern 2: Framework default that serves HTML on */*. Some frameworks treat */* in Accept as a license to fall back to HTML, even when explicit non-HTML types are listed earlier. Rails 8's respond_to is a notable example:

Accept: text/markdown, */*       → Rails returns HTML (markdown ignored)
Accept: text/markdown, text/html → Rails returns markdown (no */*, honors order)

Pattern 3: Server-internal preference order + q-values ignored. Server has its own priority list (often hardcoded somewhere) and picks whichever type from the Accept header is highest on the server's list — not on the client's list. q-values aren't parsed at all:

Accept: text/plain;q=0.9, text/html;q=0.5 → returns HTML
                                            (server prefers html despite q-values
                                             explicitly favoring plain)

The smoking gun for pattern 3 is q-values being ignored. If the same site returns HTML for the row above and markdown for Accept: text/plain, text/markdown (markdown won despite plain being listed first), it's pattern 3.
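To see how the three patterns behave on the example headers above, here are toy models of each. They are deliberately simplified assumptions, not the code of any real framework: the wildcard model ignores q-values altogether, and the server priority list (HTML, then markdown, then plain) is invented for illustration.

```javascript
const SUPPORTED = ['text/html', 'text/markdown', 'text/plain'];

// Media types listed in an Accept header, with parameters (including q) dropped.
function listedTypes(accept) {
  return accept.split(',').map((part) => part.split(';')[0].trim().toLowerCase());
}

// Pattern 1: substring checks in a fixed order.
function substringServer(accept) {
  if (accept.includes('text/html')) return 'text/html';
  if (accept.includes('text/markdown')) return 'text/markdown';
  if (accept.includes('text/plain')) return 'text/plain';
  return 'text/html';
}

// Pattern 2: any */* means HTML; otherwise the first listed supported type wins.
function wildcardFallbackServer(accept) {
  const types = listedTypes(accept);
  if (types.includes('*/*')) return 'text/html';
  return types.find((type) => SUPPORTED.includes(type)) ?? 'text/html';
}

// Pattern 3: the server's own priority list decides; q-values are never parsed.
function serverPriorityServer(accept) {
  const types = listedTypes(accept);
  return SUPPORTED.find((type) => types.includes(type)) ?? 'text/html';
}

const exampleHeaders = [
  'text/markdown, text/html',
  'text/markdown, */*',
  'text/plain, text/markdown',
  'text/plain;q=0.9, text/html;q=0.5',
];

for (const accept of exampleHeaders) {
  console.log(accept, '=>', {
    substring: substringServer(accept),
    wildcardFallback: wildcardFallbackServer(accept),
    serverPriority: serverPriorityServer(accept),
  });
}
```

Running it prints which representation each model would choose per header, which makes it easier to match observed behavior against a pattern.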

A quick diagnostic test

Run these five curl commands against your homepage, replacing YOUR_SITE with your origin, and compare the Content-Type header in each response. The pattern across the responses tells you which bug you have:

curl -sI -H "Accept: text/markdown" YOUR_SITE/
curl -sI -H "Accept: text/markdown, text/html, */*" YOUR_SITE/
curl -sI -H "Accept: text/plain, text/markdown" YOUR_SITE/
curl -sI -H "Accept: text/markdown, */*" YOUR_SITE/
curl -sI -H "Accept: text/plain;q=0.9, text/html;q=0.5" YOUR_SITE/
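The same five probes can be scripted. The sketch below assumes Node 18 or later, where fetch is a global, and sends HEAD requests as curl -sI does. PROBES and runProbes are hypothetical names, and the fetch function is injectable so the logic can be exercised without a network.

```javascript
// The five Accept headers from the curl commands above.
const PROBES = [
  'text/markdown',
  'text/markdown, text/html, */*',
  'text/plain, text/markdown',
  'text/markdown, */*',
  'text/plain;q=0.9, text/html;q=0.5',
];

// Send each probe and collect the headers that matter for diagnosis.
async function runProbes(baseUrl, fetchFn = fetch) {
  const results = [];
  for (const accept of PROBES) {
    const response = await fetchFn(baseUrl, {
      method: 'HEAD',
      redirect: 'manual', // report a redirect instead of following it
      headers: { Accept: accept },
    });
    results.push({
      accept,
      status: response.status,
      contentType: response.headers.get('content-type'),
      vary: response.headers.get('vary'),
      location: response.headers.get('location'),
    });
  }
  return results;
}

// Usage (performs real requests):
// runProbes('https://example.com/').then((rows) => console.table(rows));
```

Reading the contentType column top to bottom gives the response pattern to compare against the three bug descriptions.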

The fix recipe is the same in all three cases: replace whatever ad-hoc selection logic you have with a proper Accept negotiator from the list above.

Inline vs 302 redirect — what to do

Two patterns for serving agent-friendly content at your homepage:

Inline — same URL returns different bodies based on Accept header.

GET /  + Accept: text/html      →  200 + HTML
GET /  + Accept: text/markdown  →  200 + markdown

Redirect — server sends agent requests to a separate canonical URL.

GET /  + Accept: text/markdown  →  302, Location: /llms.txt
GET /llms.txt                   →  200 + markdown

Use inline. RFC 9110 does not name a single best practice, but §12.2 lists the disadvantages of reactive negotiation, and the main one applies to a 302 redirect as well: it "suffers from transmitting a list of alternatives... and needing a second request to obtain an alternate representation." The same section notes that the specification "does not define a mechanism for supporting automatic selection."

Every major content-negotiation-aware site we tested uses inline.

Why inline wins concretely

  1. Half the latency. One HTTP fetch instead of two. HTTP/2 and HTTP/3 multiplexing do not eliminate the redirect cost — the client still has to receive the 302, parse Location, and issue a new request.
  2. URL fidelity. The URL the agent reports to the user is the URL the user actually asked about. With 302, the agent ends up at /llms.txt — a different URL than the homepage.
  3. Cleaner caching. Inline with Vary: Accept lets caches store both representations under one URL key. With 302, caches have to handle two URLs and keep their coherence.
  4. No magic agent-only URL. Inline keeps the URL space unified — humans and agents hit the same URL; the server picks what to serve based on Accept.

What AgentGrade checks

The Inline content negotiation check sends an agent-shaped request (claude-code/1.0.0 UA with Accept: text/markdown, text/html, */*) and verifies the response does not end up at a different URL than a browser request would. Specifically: if the agent fetch was redirected to a path that the browser fetch was not, the check fails.

Universal redirects that affect everyone (HTTPS upgrade, trailing-slash normalization) are not penalized — only agent-specific redirects to a separate URL.
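The comparison described above can be sketched as a single predicate over the two final URLs, meaning the URLs reached after all redirects were followed. This is an illustration of the stated rule, not AgentGrade's implementation, and agentRedirectedElsewhere is a hypothetical name.

```javascript
// True when the agent fetch ended at a different host or path than the browser fetch.
function agentRedirectedElsewhere(agentFinalUrl, browserFinalUrl) {
  const agent = new URL(agentFinalUrl);
  const browser = new URL(browserFinalUrl);
  return agent.host !== browser.host || agent.pathname !== browser.pathname;
}

// Agent-specific redirect to /llms.txt: the check would fail.
console.log(agentRedirectedElsewhere('https://example.com/llms.txt', 'https://example.com/'));
// Both fetches upgraded to HTTPS and landed on the same URL: not penalized.
console.log(agentRedirectedElsewhere('https://example.com/', 'https://example.com/'));
```

Universal redirects move both fetches to the same place, so they cancel out in the comparison.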

How to fix

Replace your 302 logic with inline negotiation. Express example:

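const path = require('path');

app.get('/', async (req, res) => {
  res.vary('Accept');
  const best = req.accepts(['text/html', 'text/markdown', 'text/plain']);
  // Serve the shared llms.txt content, labeled with the negotiated type
  if (best === 'text/markdown' || best === 'text/plain') {
    return res.type(best).send(await buildLlmsTxt());
  }
  // Browsers get HTML; sendFile needs an absolute path
  res.sendFile(path.join(__dirname, 'index.html'));
});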

The /llms.txt route can still exist as a separate URL — both routes call the same content function. Sites that fetch /llms.txt directly still work; sites that hit / with an agent Accept also work, in one request.

This check is emerging (optional) today — it does not yet penalize sites that use 302. It will graduate to required once industry adoption of inline is broad enough that the few remaining 302-based sites are clearly the outliers.