Noteline

June 7, 2026 · 7 min read

llms.txt, explained

llms.txt is a Markdown file at a site's root that gives AI models clean, curated content. Here's what it is, how to write one, and why it matters for GEO.

llms.txt is a Markdown file placed at the root of a website (/llms.txt) that gives large language models a clean, curated map of your most useful content. Instead of making an AI model crawl cluttered HTML full of navigation, ads, and scripts, you hand it a short, structured plain-text document. Jeremy Howard of Answer.AI proposed it in 2024. It is not an official web standard. It is an emerging convention, and adoption is still early.

That is the whole idea in a paragraph. The rest of this post covers what goes in the file, how it differs from robots.txt and sitemaps, why it matters for generative engine optimization, and how to write one yourself.

What is llms.txt, exactly?

llms.txt is a single Markdown file you put at https://yoursite.com/llms.txt. It is written for software that parses prose, not page layout. Large language models do not need your CSS, your cookie banner, or your mega-menu. They need the words. A Markdown file gives them exactly that, with headings and links they can parse without guesswork.

The proposed format is loose but consistent. A valid file has:

  • An H1 with the site or project name (the only required line).
  • An optional blockquote summary that says what the site is in a sentence or two.
  • Optional paragraphs or lists with more context.
  • Zero or more ## sections holding lists of links, each link followed by an optional short note.
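
To make that structure concrete, here is a minimal Python sketch of a parser for it. It is an illustration, not a reference implementation: the names `parse_llms_txt`, `LlmsTxt`, and `Link` are invented for this example, the sample text is made up, and the regular expression only handles the simple `- [title](url): note` link shape.

```python
import re
from dataclasses import dataclass, field

# Matches "- [title](url)" with an optional ": note" suffix.
LINK_RE = re.compile(
    r"^-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<note>.*))?$"
)


@dataclass
class Link:
    title: str
    url: str
    note: str = ""


@dataclass
class LlmsTxt:
    title: str = ""
    summary: str = ""
    sections: dict[str, list[Link]] = field(default_factory=dict)


def parse_llms_txt(text: str) -> LlmsTxt:
    """Parse the H1, the blockquote summary, and the ## link sections."""
    doc = LlmsTxt()
    current = None  # name of the ## section being read, if any
    summary_lines = []
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# ") and not doc.title:
            doc.title = line[2:].strip()
        elif line.startswith(">") and current is None:
            summary_lines.append(line.lstrip(">").strip())
        elif line.startswith("## "):
            current = line[3:].strip()
            doc.sections[current] = []
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                doc.sections[current].append(
                    Link(match["title"], match["url"], match["note"] or "")
                )
        # Free-form paragraphs outside sections are ignored in this sketch.
    doc.summary = " ".join(summary_lines)
    if not doc.title:
        raise ValueError("llms.txt needs an H1 title")
    return doc


# Made-up sample input.
SAMPLE = """# Example Project

> A one-line summary of the project.

## Docs

- [Guide](https://example.com/guide): how to get started
- [API](https://example.com/api)
"""

doc = parse_llms_txt(SAMPLE)
print(doc.title, "has", len(doc.sections["Docs"]), "links under Docs")
```

Only the H1 is enforced here, which mirrors the "only required line" rule in the list above. Everything else is treated as optional.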

A common companion is llms-full.txt, a larger file that inlines the actual content instead of just linking to it, so a model can read everything in one fetch. The proposal also recommends publishing a .md version of each page (the same URL with .md appended) so an AI agent can grab clean Markdown rather than rendered HTML, and many docs sites now do.
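
Both conventions are easy to sketch in Python. The helper names below (`markdown_url`, `build_llms_full`) are hypothetical, and the layout of the inlined file is an assumption made for illustration, not a prescribed format.

```python
from urllib.parse import urlsplit, urlunsplit


def markdown_url(page_url: str) -> str:
    """Return the page URL with .md appended to its path."""
    parts = urlsplit(page_url)
    if not parts.path or parts.path.endswith("/"):
        # Directory-style URLs need a site-specific rule; this sketch rejects them.
        raise ValueError(f"no file name to append .md to: {page_url}")
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path + ".md", parts.query, parts.fragment)
    )


def build_llms_full(site_title: str, pages: list[tuple[str, str]]) -> str:
    """Inline each (title, markdown_body) pair under its own ## heading."""
    chunks = [f"# {site_title}"]
    for title, body in pages:
        chunks.append(f"## {title}\n\n{body.strip()}")
    return "\n\n".join(chunks) + "\n"


print(markdown_url("https://example.com/docs/start"))
print(build_llms_full("Example Project", [("Guide", "Install it, then open a folder.")]))
```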

The key point is that Markdown is the shared format here. AI models are trained on huge amounts of it, they emit it by default, and they read it without choking on layout. If this idea interests you more broadly, see why Markdown is the language of AI.

How is llms.txt different from robots.txt and sitemaps?

This is the most common point of confusion. The three files live near each other at the site root, but they answer different questions.

| File | Audience | Question it answers | Format |
| --- | --- | --- | --- |
| robots.txt | Crawlers | "What are you allowed to access?" | Plain text rules |
| sitemap.xml | Search engines | "What pages exist, and how fresh are they?" | XML, exhaustive |
| llms.txt | AI models / LLMs | "What is worth reading, and where do I start?" | Markdown, curated |

robots.txt is about permission. It tells bots which paths to avoid. It does not describe your content, and it predates today's AI crawlers by decades.
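
Python's standard library can evaluate these rules, which makes the "permission" framing easy to see. The rules and the bot name `ExampleBot` below are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt rules for illustration.
ROBOTS = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS.splitlines())

# The only question robots.txt answers: may this bot fetch this URL?
docs_allowed = parser.can_fetch("ExampleBot", "https://example.com/docs/start")
private_allowed = parser.can_fetch("ExampleBot", "https://example.com/private/notes")
print(docs_allowed, private_allowed)
```

Nothing in the result says what the allowed pages contain or which of them matter.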

A sitemap is about coverage. It lists every URL so a search engine can find them all. It is exhaustive, machine-only XML, not something a person or a model reads for meaning.

llms.txt is about curation and comprehension. It does not try to list everything. It points an AI model to the handful of pages that actually explain your product, your docs, or your ideas, and it does so in language the model can lift directly. Think of it less like a map of every street and more like a short "start here" note from someone who knows the place.

One honest caveat: llms.txt is a request and a courtesy, not an enforcement mechanism. It does not control whether a model trains on your content, and most major AI crawlers do not yet look for the file. So the obvious next question is whether anyone reads it at all.

Does anyone actually use llms.txt yet?

The honest answer is mixed. As of 2026, llms.txt has real momentum among developer-tool and documentation companies, but the big AI engines have not committed to consuming it, and several major model providers have not confirmed that their crawlers read it. Plenty of documentation platforms now generate one automatically, and a growing number of sites publish their own, but do not expect a measurable traffic bump the week you ship one.

So why bother? A few reasons hold up regardless of official support:

  1. It costs almost nothing. A good llms.txt is a short file you write once and update occasionally.
  2. It forces clarity. Writing a curated map of your best content is a useful exercise even if no machine reads it.
  3. More people now reach information through AI assistants. Those assistants summarize and cite sources, so making your content easy to read is a low-risk bet on where attention is heading.
  4. Markdown is already your friend. If your content lives in Markdown, producing an llms.txt is trivial.

The fourth reason decides who moves first. The teams who adopt llms.txt fastest are the ones whose content is already plain text. If your notes and docs are locked inside a proprietary database, you have a conversion problem before you can even start. If they are already .md files, you are most of the way there.

Why llms.txt matters for generative engine optimization

Generative engine optimization (GEO) is the practice of making your content easy for AI engines to find, read, and cite, the way SEO targets traditional search. The mechanics differ. A search engine ranks a list of blue links. A generative engine reads sources and writes an answer, then may or may not cite those sources.

llms.txt fits GEO in a specific way: it lowers the cost for a model to understand what you offer and to quote you correctly. When a model can pull clean Markdown that states plainly "this product does X, here are the docs for Y," it is far likelier to represent you accurately than if it has to reverse-engineer meaning from a JavaScript-heavy page.

GEO is not only about one file, though. The same habits that make llms.txt useful make all your content more AI-readable: clear headings, answer-first writing, plain language, and structured Markdown. llms.txt is the front door, but the pages it links to have to be clean too. For more on shaping content as model input, see Markdown as AI context.

How do you write an llms.txt file?

Here is a minimal, valid example for a small product site. Copy the shape, not the words.

# Acme Notes

> Acme Notes is a local-first Markdown editor. Every note is a
> plain .md file you own. No account, no lock-in.

Acme Notes runs on macOS, Windows, and Linux. Below are the
pages most useful for understanding what it does.

## Docs

- [Getting started](https://acme.example/docs/start): install and open your first folder
- [Keyboard shortcuts](https://acme.example/docs/keys): full reference
- [Export to Word and PDF](https://acme.example/docs/export): offline export details

## Background

- [Why plain text](https://acme.example/blog/why-plain-text): the case for files over databases

## Optional

- [Changelog](https://acme.example/changelog): release history
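
If your page list already lives in data, the same shape can be generated rather than typed. The sketch below is one way to do that in Python; `Page` and `render_llms_txt` are hypothetical names, and the input dictionary stands in for wherever your site keeps its page metadata.

```python
from dataclasses import dataclass


@dataclass
class Page:
    title: str
    url: str
    note: str = ""


def render_llms_txt(name: str, summary: str, sections: dict[str, list[Page]]) -> str:
    """Render an H1, a blockquote summary, and one ## section per dict key."""
    lines = [f"# {name}", "", f"> {summary}", ""]
    for heading, pages in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for page in pages:
            suffix = f": {page.note}" if page.note else ""
            lines.append(f"- [{page.title}]({page.url}){suffix}")
        lines.append("")
    # Normalize to exactly one trailing newline.
    return "\n".join(lines).rstrip("\n") + "\n"


rendered = render_llms_txt(
    "Acme Notes",
    "Acme Notes is a local-first Markdown editor.",
    {
        "Docs": [
            Page(
                "Getting started",
                "https://acme.example/docs/start",
                "install and open your first folder",
            )
        ],
        "Optional": [Page("Changelog", "https://acme.example/changelog", "release history")],
    },
)
print(rendered)
```

Dictionaries keep insertion order in Python, so sections appear in the order you list them.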

A few practical rules:

  • Keep the H1 and write a one-line summary in a blockquote. That summary is the single most quotable line in the file. Make it accurate and specific.
  • Curate ruthlessly. Link the pages that explain you best, not all of them. A focused twenty-link file beats a dump of two hundred.
  • Annotate every link with a short note. The note tells the model what it will find before it fetches.
  • Use an ## Optional section for lower-priority links. The convention treats this as content a model can skip when it is short on context.
  • Use absolute URLs. Models may read the file out of context.
  • Keep it current. A stale map is worse than none. Update it when your important pages change.
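
Several of these rules can be checked mechanically. Here is a small linter sketch in Python; `lint_llms_txt` is a hypothetical helper, the `max_links` default of 50 is an arbitrary assumption rather than part of the convention, and the link pattern only covers the simple `- [title](url): note` shape.

```python
import re
from urllib.parse import urlsplit

# Matches "- [title](url)" with an optional ": note" suffix.
LINK_RE = re.compile(r"^-\s*\[([^\]]+)\]\(([^)\s]+)\)(?::\s*(.*))?$")


def lint_llms_txt(text: str, max_links: int = 50) -> list[str]:
    """Return a list of human-readable problems; an empty list means no findings."""
    problems = []
    lines = [line.strip() for line in text.splitlines()]

    if not any(line.startswith("# ") for line in lines):
        problems.append("missing H1 title")
    if not any(line.startswith(">") for line in lines):
        problems.append("missing blockquote summary")

    link_count = 0
    for line in lines:
        match = LINK_RE.match(line)
        if not match:
            continue
        link_count += 1
        title, url, note = match.groups()
        parts = urlsplit(url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            problems.append(f"relative or non-HTTP URL: {url}")
        if not note:
            problems.append(f"link has no note: {title}")

    if link_count > max_links:
        problems.append(f"{link_count} links; consider curating below {max_links}")
    return problems


# Made-up input that breaks several rules at once.
BAD = """## Docs

- [Start](/docs/start)
"""

for problem in lint_llms_txt(BAD):
    print(problem)
```

A check like this cannot tell you whether the file is current or well curated. Those two rules still need a person.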

If you maintain a docs site, consider also publishing a .md version of each page and an llms-full.txt that inlines key content. But start with the basic file. Done and accurate beats elaborate and out of date.

The bigger picture: own your text

llms.txt only works if you have clean text to point at, and that is the real lesson of the proposal. The sites that adapt fastest to AI readers are the ones whose content already lives as plain, portable Markdown rather than trapped in a format only one app can open. Plain text is the durable substrate; tools come and go, but a .md file you own keeps working.

This is the bet behind Noteline. Your notes are plain .md files in a folder you control, which means they are already in the format an AI model can read without conversion. Whether or not you ever publish an llms.txt, writing in plain text keeps your work readable by the next tool, the next model, and you, years from now. You can try the free web editor without an account.

llms.txt is early, optional, and unenforced. It is also a small, sensible step in a clear direction: making the content you already own easy for machines to read. If your words are already plain text, you have nothing to convert and everything to gain.



© 2026 Noteline