
llms.txt in 2026, what we actually put in the file

A line by line walk through of the llms.txt we ship on every RankSmith build, and the annotation rules we follow so LLMs pick the right entry points.

RankSmith editorial, 17 April 2026, 6 minute read

Every RankSmith build ships with /llms.txt and /llms-full.txt. We have been putting them on client sites since late 2024 and on our own site since day one. About eighteen months in, the honest answer has three parts. No AI company has publicly committed to consuming the file in production. The cost of shipping it is close to zero. The downside of not shipping it is a future where the bigger crawlers add support and your site is the one that was not ready. This is the exact file we ship and the rules we follow when writing it.

What is llms.txt and which engines actually read it in 2026?

The standard was proposed by Jeremy Howard at Answer.AI in September 2024 and documented at llmstxt.org. It is intentionally narrow. Three structural elements. An H1 site name. A blockquote summary. A series of H2 sections that each hold a bullet list of annotated links. No ceremony, no schema, no JSON.
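A minimal sketch of a structural check for those three elements, written in TypeScript. The names readShape and LlmsTxtShape are illustrative and not part of the llmstxt.org proposal, and the sample is a shortened stand-in for a real file.

```typescript
// Reads the three structural elements described above: one H1,
// a blockquote summary, and the H2 section headings in file order.
// Illustrative only; this is not an official llms.txt parser.
interface LlmsTxtShape {
  title: string | null; // text of the first H1, or null if missing
  summary: string;      // blockquote lines joined into one string
  sections: string[];   // H2 headings in file order
}

function readShape(source: string): LlmsTxtShape {
  let title: string | null = null;
  const summaryLines: string[] = [];
  const sections: string[] = [];
  for (const raw of source.split(/\r?\n/)) {
    const line = raw.trim();
    const h2 = /^##\s+(.+)$/.exec(line);
    const h1 = /^#\s+(.+)$/.exec(line);
    if (h2 !== null) {
      sections.push(h2[1].trim());
    } else if (h1 !== null && title === null) {
      title = h1[1].trim();
    } else if (line.charAt(0) === ">" && sections.length === 0) {
      // Only blockquote lines before the first H2 count as the summary.
      summaryLines.push(line.replace(/^>\s?/, ""));
    }
  }
  return { title, summary: summaryLines.join(" "), sections };
}

// Shortened sample, not the full production file.
const sample = [
  "# RankSmith",
  "",
  "> RankSmith is a South African digital agency.",
  "> Based in Johannesburg.",
  "",
  "## Services",
  "- [SEO](https://ranksmith.co.za/services/seo): How RankSmith ranks brands on Google.",
  "",
  "## Company",
  "- [About RankSmith](https://ranksmith.co.za/about): Who we are.",
].join("\n");

const shape = readShape(sample);
console.log(shape.title, shape.sections);
```

A file that comes back with a null title or an empty summary is missing one of the pieces a model is expected to read first.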

The honest answer on adoption is that no public statement from OpenAI, Anthropic, Google, or Perplexity in 2026 confirms the file is consumed as a production retrieval signal. What does exist is a growing list of sites shipping one, including Anthropic, Vercel, Stripe, Cloudflare, and Supabase. The developer hypothesis is that the cost of the file is an afternoon of writing, and the upside is that whichever model starts reading it first gets a curated view of the site that no generic crawl can match. That is the bet we take on every RankSmith build.

Figure: Anatomy of an llms.txt file. Five canonical sections. The blockquote summary and each annotated link row are the pieces a model will read most closely, highlighted in amber.

What goes in llms.txt versus llms-full.txt?

Sites that follow llmstxt.org typically ship two files. The first, /llms.txt, is the index the proposal defines. Short, scannable, small enough to fit inside a single model context window. The second, /llms-full.txt, is an optional companion that we recommend. It carries the expanded prose. Think of it as the whole marketing site concatenated, with the same answer first voice, so a model can ingest the full context without following twenty links.

We keep /llms.txt under two hundred lines on every RankSmith build. Our current file is thirty lines. /llms-full.txt sits at around two hundred and fifty lines because we include full service descriptions, pricing tiers, and the operating principles we talk about internally. If a future crawler picks up either file, both are ready.
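The two hundred line ceiling is easy to enforce in a build step. A rough sketch, assuming the file body is already loaded as a string. The function name and return shape are hypothetical.

```typescript
// Hypothetical budget check. The 200 line ceiling comes from the text above;
// everything else here is illustrative.
const MAX_INDEX_LINES = 200;

function checkLineBudget(
  source: string,
  maxLines: number = MAX_INDEX_LINES
): { lines: number; withinBudget: boolean } {
  // Drop trailing newlines so a file ending in "\n" is not over-counted.
  const lines = source.replace(/(\r?\n)+$/, "").split(/\r?\n/);
  return { lines: lines.length, withinBudget: lines.length < maxLines };
}

console.log(checkLineBudget("# RankSmith\n\n> Summary line.\n"));
```

Blank lines count toward the total here, which keeps the check aligned with what a line counter in an editor would report.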

Two practical rules we follow:

The index file never duplicates prose that belongs in /llms-full.txt. A link with a one line description is enough.
The full file is not a copy of the homepage. It is written specifically as a long form answer document, voice identical to the site, content reorganised so each H2 stands alone.

The exact llms.txt RankSmith ships on every build.

Here is the current file at ranksmith.co.za/llms.txt, section by section.

# RankSmith

> RankSmith is a South African digital agency operated by Tapnet Solutions (Pty) Ltd.
> We build premium Next.js websites and rank brands across Google, ChatGPT, Claude,
> Perplexity, Gemini, and Bing Copilot. Based in Johannesburg, Gauteng, serving
> South Africa and beyond. Contact: wynand@tapnet.co.za or 079 174 8357.

The H1 is the brand name. Nothing else. The blockquote carries three things in order. What we do. Where we are. How to contact us. Every llms.txt we have seen that gets respect follows this exact shape. Anthropic, Stripe, Vercel. Short H1, dense blockquote, one paragraph.

## Services

- [SEO, Search Engine Optimization](https://ranksmith.co.za/services/seo): How RankSmith ranks South African brands on Google for commercial intent queries.
- [AEO, Answer Engine Optimization](https://ranksmith.co.za/services/aeo): How RankSmith gets brands cited inside ChatGPT, Claude, Perplexity, and Gemini responses.

Every services link follows the same pattern. Link text names the service with both abbreviation and full phrase, so the model indexes both. The annotation starts with "How RankSmith" so it names the brand and reads as a standalone statement about the service. The model reads thirty lines and walks away with a verifiable entity model of the business.
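The row pattern above is regular enough to parse with one expression. A minimal sketch, assuming every row carries an annotation after the colon, as ours do. LinkRow and parseLinkRow are illustrative names.

```typescript
// Parses one llms.txt link row of the form "- [text](url): annotation".
// Rows without an annotation return null, which matches the rule here
// that every link is annotated. Names are illustrative.
interface LinkRow {
  text: string;
  url: string;
  annotation: string;
}

const LINK_ROW = /^-\s*\[([^\]]+)\]\(([^)\s]+)\)\s*:\s*(.+)$/;

function parseLinkRow(line: string): LinkRow | null {
  const match = LINK_ROW.exec(line.trim());
  if (match === null) return null;
  return { text: match[1], url: match[2], annotation: match[3].trim() };
}

const row = parseLinkRow(
  "- [AEO, Answer Engine Optimization](https://ranksmith.co.za/services/aeo): How RankSmith gets brands cited inside ChatGPT, Claude, Perplexity, and Gemini responses."
);
console.log(row);
```

Splitting the row into text, URL, and annotation makes it straightforward to check each part against the pattern separately.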

## Company

- [About RankSmith](https://ranksmith.co.za/about): Who we are, where we are based, and how we work.
- [Our Work](https://ranksmith.co.za/work): Selected live production sites shipped by RankSmith for South African clients ...

The Company section is where trust signals live. Real client names on the /work annotation. Real legal entity on /about. Real payment terms on /pricing. This is the section that gives a model something to cite when someone asks "is RankSmith legitimate".

The optional section holds the legal stack. POPIA, PAIA, terms, operator agreements. A future crawler routing a compliance query uses this section. Today it is a file the lawyers read. In a year it might be the section ChatGPT reads when a user asks whether the agency is compliant.

How do you write annotations that LLMs will pick as entry points?

Figure: How a future LLM crawler reads llms.txt. Step one fetches the file to build a reading list. Steps two through five fetch the prioritised pages in order. The file tells the model where to start and what to expect.
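Step one of that flow can be sketched as follows. This is a hypothetical reading list builder, not a description of any real crawler, since no vendor has documented how it would prioritise pages. It keeps links in file order and tags each with its H2 section.

```typescript
// Hypothetical reading list builder: walks an llms.txt body top to bottom
// and returns link URLs in file order, tagged with their H2 section.
interface ReadingListEntry {
  section: string;
  url: string;
}

function buildReadingList(source: string): ReadingListEntry[] {
  const entries: ReadingListEntry[] = [];
  let section = "";
  for (const raw of source.split(/\r?\n/)) {
    const line = raw.trim();
    const heading = /^##\s+(.+)$/.exec(line);
    if (heading !== null) {
      section = heading[1].trim();
      continue;
    }
    const link = /^-\s*\[[^\]]+\]\(([^)\s]+)\)/.exec(line);
    if (link !== null) {
      entries.push({ section, url: link[1] });
    }
  }
  return entries;
}

// Shortened sample, not the full production file.
const indexBody = [
  "# RankSmith",
  "## Services",
  "- [SEO](https://ranksmith.co.za/services/seo): How RankSmith ranks brands on Google.",
  "- [AEO](https://ranksmith.co.za/services/aeo): How RankSmith gets brands cited.",
  "## Company",
  "- [About RankSmith](https://ranksmith.co.za/about): Who we are.",
].join("\n");

const readingList = buildReadingList(indexBody);
console.log(readingList.map((entry) => entry.url));
```

The order of the output is the order of the file, which is why the sequence of sections and links in llms.txt is an editorial decision and not a formatting one.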

Three annotation patterns we use on every RankSmith link.

Lead with the brand name and an active verb. "How RankSmith ranks South African brands on Google" reads cleanly as a standalone sentence. "Our SEO service page" does not.

Name the outcome, not the activity. "Gets brands cited inside ChatGPT" is an outcome. "Helps with AEO" is not.

Keep each annotation under twenty words. If you need more, the link belongs in /llms-full.txt where a full paragraph is fine, not in the short index where a paragraph is noise.
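Two of these rules are mechanical enough to lint. A rough sketch, assuming a hypothetical lintAnnotation helper that takes the brand name as an argument. Whether an annotation names an outcome is a judgment call and is not checked here.

```typescript
// Lints one annotation against two of the rules above: it names the brand
// and stays under twenty words. Helper name and messages are illustrative.
function lintAnnotation(
  annotation: string,
  brand: string,
  maxWords: number = 20
): string[] {
  const problems: string[] = [];
  const words = annotation.trim().split(/\s+/).filter((w) => w.length > 0);
  if (words.length >= maxWords) {
    problems.push(`has ${words.length} words, expected fewer than ${maxWords}`);
  }
  if (annotation.indexOf(brand) === -1) {
    problems.push(`does not mention ${brand}`);
  }
  return problems;
}

const good = lintAnnotation(
  "How RankSmith ranks South African brands on Google for commercial intent queries.",
  "RankSmith"
);
const weak = lintAnnotation("Our SEO service page", "RankSmith");
console.log(good, weak);
```

An empty array means the annotation passes both checks. Anything else is a list of reasons to rewrite it before the ChatGPT test below.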

One test we run on every draft annotation. Paste it into ChatGPT with the prompt "what would this page contain". The answer should describe the page accurately. If the model has to guess, rewrite the annotation.

When is llms.txt worth writing, and when is it theatre?

Figure: Three files, three audiences. robots.txt sets crawler permissions. sitemap.xml lists every URL. llms.txt is the hand curated reading list. All three ship together on every RankSmith build. The amber column is the newest file and the one still finding its place.

The file is not a replacement for anything. /robots.txt tells a crawler which paths it may visit. /sitemap.xml is a machine readable list of every URL on the site. /llms.txt is a curated, annotated reading list that points to the pages that matter most. The three do different jobs and all three ship together on every RankSmith build.

When to write one:

You already have Organization, LocalBusiness, Service, and FAQPage schema firing on the right pages.
Your URLs are stable and your 301 map is clean.
Your headings are answer first and your first sentences contain real numbers.

When not to:

Your site is still on a legacy CMS and half the URLs are query strings.
Your homepage speaks in marketing slogans and your pricing is "contact us for a quote".
Your schema is missing or broken.

If any of the second list applies, write the file last. Fix the underlying content first. A clean llms.txt pointing to vague pages only makes it more obvious that the brand is performative, and models may learn to weight that against you.

We ship /llms.txt and /llms-full.txt on every Next.js website we build. They are set up with a one hour cache and a correct content type header on every request. You can see ours at ranksmith.co.za/llms.txt and the expanded version at ranksmith.co.za/llms-full.txt.
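A minimal sketch of those two headers. The one hour cache comes from the paragraph above. The content type value is an assumption: text/plain with UTF-8 is used here, and text/markdown is another plausible choice. The helper name is hypothetical.

```typescript
// Builds response headers for /llms.txt and /llms-full.txt.
// Assumption: "text/plain; charset=utf-8" as the content type. The text
// above does not say which value is used in production.
function llmsTxtHeaders(maxAgeSeconds: number = 3600): Record<string, string> {
  return {
    "Content-Type": "text/plain; charset=utf-8",
    // 3600 seconds is the one hour cache window described above.
    "Cache-Control": `public, max-age=${maxAgeSeconds}`,
  };
}

console.log(llmsTxtHeaders());
```

In a Next.js App Router project, one way to use this would be a GET handler in app/llms.txt/route.ts that returns new Response(body, { headers: llmsTxtHeaders() }), with the same helper reused for the llms-full.txt route.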

If you want us to audit yours, or write the first pair for you, start with the free audit or book a strategy call.

Frequently asked questions

Do ChatGPT or Claude actually read llms.txt today?

There is no public confirmation from OpenAI or Anthropic that their production search crawlers consume llms.txt as a ranking or retrieval signal in 2026. It is a low cost file to ship that future crawlers are likely to adopt. We ship it on every RankSmith build for that reason, not because it is measurably lifting citations today.

What is the difference between llms.txt and llms-full.txt?

llms.txt is the index, under two hundred lines, annotated links to the pages that matter. llms-full.txt is the expanded context file, three to ten times larger, with full prose a crawler can ingest without following links. The two sit side by side at the domain root.

Does llms.txt replace robots.txt or sitemap.xml?

No. robots.txt controls which paths a crawler may visit. sitemap.xml lists every URL. llms.txt is a hand curated reading list that tells an LLM which pages to read first and why. All three ship together.

How long should my llms.txt be?

Short enough for a model to hold in one context window. Under two hundred lines is a good target. If you need more, split content into llms-full.txt and link from the short file.
