Content Negotiation — Teaching Our Site to Speak AI

Every page on this site now serves markdown to bots and AI agents — using a 30-year-old HTTP feature.

Kim
5 min read

So you want to publish a blog in 2026. You spin up your framework, write some posts, hit deploy. Then you remember: oh right, the machines are reading too. Think about the accessibility features!

You add a sitemap.xml for Google. An RSS feed for the three people still using feed readers (respect). OpenGraph tags so your links don't look broken on Discord and Bluesky. JSON-LD so search engines understand your structured data. A robots.txt to be polite. Maybe even an llms.txt because that's the new thing.

That's a lot of machine-readable formats bolted onto what is essentially a pile of HTML.

Bots and AI agents are eating the web. Not maliciously — they're just doing what we built them to do. They crawl, they scrape, they try to understand. But right now everyone's doing it the hard way: fetch a full HTML page, strip out the nav and the footer and the cookie banners, parse what's left, hope for the best.

What if the machines could just ask for the format they want?

The 30-year-old feature nobody used

HTTP has had content negotiation since the beginning. The Accept header lets a client say "I'd prefer this format" and the server responds accordingly. Browsers send Accept: text/html and get HTML. An API client sends Accept: application/json and gets JSON.

I've used this pattern for years building APIs — same endpoint, different representations:

  • Send Accept: application/json and get JSON,
  • Send Accept: text/plain for debugging,
  • Send Accept: text/html for a rendered page.

One URL, one resource, multiple formats, as it was intended. You can still support .json and .xml extensions alongside it — but the Accept header means clients don't have to know about them.
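A minimal sketch of that server-side choice, assuming a simplified reading of the Accept header: q-values are honored, but the full specificity rules from the HTTP spec are not. The function names and the fall-back-to-default behavior are illustrative assumptions, not code from this site.

```python
from typing import List, Tuple


def parse_accept(header: str) -> List[Tuple[str, float]]:
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    entries = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        media_type, *params = [p.strip() for p in part.split(";")]
        q = 1.0  # a missing q parameter means full preference
        for param in params:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        entries.append((media_type.lower(), q))
    # sorted() is stable, so equal q-values keep the client's order
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def choose_representation(accept: str, available: List[str], default: str) -> str:
    """Pick the first acceptable media type the server can produce."""
    for media_type, q in parse_accept(accept):
        if q <= 0:
            continue
        if media_type in available:
            return media_type
        if media_type == "*/*":
            return default
        if media_type.endswith("/*"):
            prefix = media_type[:-1]  # "text/*" becomes "text/"
            for candidate in available:
                if candidate.startswith(prefix):
                    return candidate
    # Nothing matched: this sketch falls back instead of answering 406
    return default


formats = ["text/html", "application/json", "text/plain"]
print(choose_representation("application/json", formats, "text/html"))
```

Falling back to a default rather than returning 406 Not Acceptable is a design choice here; a stricter API might prefer the error.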

Nobody really used content negotiation for websites though.

Browsers wanted HTML. The end. Game over, man.


Until now. Suddenly half your traffic isn't browsers anymore — it's GPTBot, ClaudeBot, PerplexityBot, and a growing swarm of agents that would really rather not parse a React/Svelte/HTMX/Angular component tree to find a paragraph of text.

This piece by Reading.sh about Cloudflare rolling out markdown responses at the CDN level inspired me to take the same pattern and apply it across the whole site. If a client asks for markdown, give them markdown. Same URL, different representation.

HTTP was designed to do this. We just never had a reason to use it for websites — until the machines showed up.

Welcome, machines. We love you long time

Every page on morgondag.io now supports content negotiation. The homepage, the about page, every game page, every news post — including this one.

When a request comes in we check a few things:

  1. Does the Accept header include text/markdown or text/plain?
  2. Is there a ?format=markdown query parameter?
  3. Is the User-Agent a known bot, crawler, or AI agent?
  4. Are browser-specific headers like Sec-Fetch-Mode missing?

If any match, we rewrite the request to a markdown route. Same URL to the outside world — clean, lightweight text instead of a full web app.
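The checks above can be sketched roughly like this. The internal `/_markdown` prefix and the function name are hypothetical (the post does not say what the markdown route is called), and the two bot checks (3 and 4) are passed in as a single boolean to keep the sketch short.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical internal route prefix; the real one is not named in the post.
MARKDOWN_PREFIX = "/_markdown"


def rewrite_target(url: str, accept: str, looks_like_bot: bool) -> str:
    """Return the internal path to render for an incoming request."""
    parts = urlsplit(url)
    path = parts.path or "/"
    query = parse_qs(parts.query)

    # Media types from the Accept header, with parameters like q= stripped
    accept_types = {item.split(";")[0].strip().lower() for item in accept.split(",")}

    asked_via_header = bool(accept_types & {"text/markdown", "text/plain"})  # check 1
    asked_via_query = query.get("format", [""])[0].lower() == "markdown"     # check 2

    if asked_via_header or asked_via_query or looks_like_bot:  # checks 3 and 4
        return MARKDOWN_PREFIX + path
    return path


print(rewrite_target("https://morgondag.io/news/ai-content-negotiation", "text/markdown", False))
```

The public URL never changes; only the path handed to the renderer does.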

The response comes back with Content-Type: text/markdown and an x-markdown-tokens header estimating the token count for agents managing context windows.
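As an illustration of how such headers could be built: the post does not say how the token count is estimated, so the roughly-four-characters-per-token heuristic below is an assumption, as are the function names and the charset parameter.

```python
import math
from typing import Dict


def estimate_tokens(markdown: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate; assumes about 4 characters per token."""
    return math.ceil(len(markdown) / chars_per_token)


def markdown_response_headers(markdown: str) -> Dict[str, str]:
    """Headers for a markdown response, including the token estimate."""
    return {
        # charset is an addition in this sketch; the post only names text/markdown
        "Content-Type": "text/markdown; charset=utf-8",
        "x-markdown-tokens": str(estimate_tokens(markdown)),
    }


print(markdown_response_headers("# Hello\n\nShort page body."))
```

A real tokenizer would give a more accurate count, but an estimate is cheap to compute on every response and is usually enough for an agent deciding whether a page fits in its context window.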

Positive bot detection

We don't just wait for polite agents to send the right headers. We actively detect bots and serve them markdown automatically:

  • Known User-Agent patterns — Googlebot, GPTBot, ClaudeBot, PerplexityBot, curl, python-requests, and about 70 others
  • Missing Sec-Fetch-Mode — current versions of every major browser send this (Chromium since 2019, Firefox since 2021, Safari since 2023). Missing + no Mozilla/ in the UA = not a browser
  • Empty User-Agent — no legitimate browser omits this
  • Explicit opt-in via Accept: text/markdown from any client

If you're a machine, you get machine-readable content. No DOM parsing, no guessing where the article starts.
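A rough sketch of those detection rules, assuming a plain dict of request headers. The pattern list is only the handful of names mentioned above, not the full list, and the function name and reason strings are made up for illustration.

```python
import re
from typing import Dict, Optional

# Only the names mentioned in the post; the real list is described as much longer.
BOT_PATTERN = re.compile(
    r"googlebot|gptbot|claudebot|perplexitybot|curl|python-requests",
    re.IGNORECASE,
)


def classify_client(headers: Dict[str, str]) -> Optional[str]:
    """Return why a request looks automated, or None for a likely browser."""
    # Header names are case-insensitive, so normalize them first
    h = {name.lower(): value for name, value in headers.items()}
    user_agent = h.get("user-agent", "").strip()
    accept = h.get("accept", "").lower()

    if "text/markdown" in accept:
        return "explicit-opt-in"
    if not user_agent:
        return "empty-user-agent"
    if BOT_PATTERN.search(user_agent):
        return "known-bot"
    if "sec-fetch-mode" not in h and "mozilla/" not in user_agent.lower():
        return "no-browser-signals"
    return None


print(classify_client({"User-Agent": "curl/8.5.0", "Accept": "*/*"}))
```

Returning a reason instead of a bare boolean makes it easier to log which rule fired when a real visitor gets misclassified.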

Try it

curl https://morgondag.io/news/ai-content-negotiation -H 'Accept: text/markdown'
curl https://morgondag.io -H 'Accept: text/markdown'

Or skip headers entirely — https://morgondag.io/news/ai-content-negotiation?format=markdown.
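The same request from Python's standard library, as a small sketch. The helper names are illustrative, and `fetch_markdown` needs network access, so it is defined here but not called.

```python
import urllib.request
from typing import Optional, Tuple


def build_markdown_request(url: str) -> urllib.request.Request:
    """Build a GET request that asks for the markdown representation."""
    return urllib.request.Request(url, headers={"Accept": "text/markdown"})


def fetch_markdown(url: str, timeout: float = 10.0) -> Tuple[str, Optional[str]]:
    """Fetch a page as markdown and return (body, x-markdown-tokens value)."""
    with urllib.request.urlopen(build_markdown_request(url), timeout=timeout) as resp:
        body = resp.read().decode("utf-8")
        return body, resp.headers.get("x-markdown-tokens")


request = build_markdown_request("https://morgondag.io/news/ai-content-negotiation")
print(request.get_method(), request.full_url, request.get_header("Accept"))
```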

Bots that want HTML can override the automatic markdown by sending Accept: text/html or appending ?format=html to the URL. Content negotiation goes both ways.
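One way to express that override is a fixed order of precedence. The order used below (query parameter first, then the Accept header, then bot detection) is an assumption rather than something the post spells out, and q-values are ignored for brevity.

```python
from typing import Optional


def resolve_format(accept: str, format_param: Optional[str], looks_like_bot: bool) -> str:
    """Decide between "html" and "markdown" for one request."""
    # 1. An explicit ?format= query parameter wins
    requested = (format_param or "").lower()
    if requested in ("html", "markdown"):
        return requested

    # 2. Then the first recognized media type in the Accept header
    for item in accept.split(","):
        media_type = item.split(";")[0].strip().lower()
        if media_type in ("text/markdown", "text/plain"):
            return "markdown"
        if media_type == "text/html":
            return "html"

    # 3. Otherwise fall back to automatic bot detection
    return "markdown" if looks_like_bot else "html"


print(resolve_format("text/html", None, looks_like_bot=True))
```

With this ordering a detected bot that sends Accept: text/html still gets HTML, and one that sends only */* gets markdown.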

No new standards. No new file formats. No committee meetings. Just the Accept header doing what it was always meant to do.

This also opens up some fun ideas for the AI podcast & NPC generation project — imagine agents pulling structured episode data directly instead of scraping a page. Now they will 🚀




Thanks for reading. Come say hi on Discord or follow us on Steam.