How to Create an llms.txt File for Any Website

Eric Ciarla
May 22, 2026 (updated)

TL;DR

  • What it is: A tool that creates llms.txt files, standardized markdown documents that help AI understand your website
  • How it works: Firecrawl crawls your site and formats content into LLM-friendly files using gpt-4o-mini
  • Access: Use the web interface at llmstxt.firecrawl.dev or call the API directly
  • Cost: Free to use; optional Firecrawl API key removes limits
  • Why it matters: Improves how AI assistants, chatbots, and coding tools interact with your content
  • May 2026 update: Google added llms.txt to Chrome Lighthouse's new "Agentic Browsing" audit category, signaling it as a readiness check for AI agent interactions

AI tools often misunderstand websites because they lack structured context. The llms.txt standard fixes this by providing a clear, machine-readable overview of your site, like a sitemap for AI. The llms.txt Generator, powered by Firecrawl, automates the entire process, turning your website into structured files that LLMs can accurately understand.

What is an llms.txt file?

An llms.txt file is a standardized markdown file proposed by Jeremy Howard to provide information to help LLMs use a website at inference time.

Unlike traditional web content designed for human readers, llms.txt files offer concise, structured information that LLMs can quickly ingest. This is particularly useful for enhancing development environments, providing documentation for programming libraries, and offering structured overviews for various domains such as corporate websites, educational institutions, and personal portfolios.

Use your data: Once you've created your llms.txt file, use it in RAG frameworks for knowledge-augmented AI or with MCP servers to enhance your development environment.

The llms.txt file is located at the root path /llms.txt of a website and contains sections in a specific order, including a project name, a summary, detailed information, and file lists with URLs for further details. This format allows LLMs to efficiently access and process the most important information about a website.
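To make the section order concrete, here is a minimal sketch that renders a file in that order. It assumes the usual markdown conventions of the llms.txt proposal (an H1 for the project name, a blockquote for the summary, H2 headings for the file lists); the `LlmsTxt` and `Link` classes and the example.com URLs are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Link:
    title: str
    url: str
    note: str = ""  # optional short description shown after the link


@dataclass
class LlmsTxt:
    name: str          # project or site name, rendered as the H1
    summary: str       # short summary, rendered as a blockquote
    details: str = ""  # optional free-form detail paragraphs
    sections: Dict[str, List[Link]] = field(default_factory=dict)

    def render(self) -> str:
        lines = [f"# {self.name}", "", f"> {self.summary}", ""]
        if self.details:
            lines += [self.details, ""]
        # Each section becomes an H2 heading followed by a list of links.
        for heading, links in self.sections.items():
            lines += [f"## {heading}", ""]
            for link in links:
                suffix = f": {link.note}" if link.note else ""
                lines.append(f"- [{link.title}]({link.url}){suffix}")
            lines.append("")
        return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    doc = LlmsTxt(
        name="Example Project",
        summary="A hypothetical site used to illustrate the layout.",
        sections={
            "Docs": [
                Link("Quickstart", "https://example.com/quickstart.md", "Setup guide"),
            ]
        },
    )
    print(doc.render())
```

A generated file will usually be longer, but it should follow the same top-to-bottom order.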

Why your website needs an llms.txt file

AI tools now drive a meaningful share of referral traffic for developer and documentation sites, and that share is growing. Here's why adding an llms.txt file matters:

| Benefit | Description |
| --- | --- |
| Improved AI discoverability | LLMs can find and understand your content more accurately, similar to how search engines use sitemaps |
| Better context for AI assistants | Coding assistants and chatbots get structured documentation instead of guessing from scattered pages |
| Reduced hallucinations | Providing clean, organized content helps AI models give more accurate responses about your product or service |
| Developer-friendly documentation | Technical content becomes instantly usable in AI-powered development tools and workflows |
| Future-proof your content | As AI agents become more common, standardized formats ensure your site remains accessible to new tools |

Introducing llms.txt Generator ✨

The llms.txt Generator leverages Firecrawl to crawl your website and extracts data using gpt-4o-mini. You can generate both llms.txt and llms-full.txt files through the web interface or via API.

Accessing llms.txt via API

You can access llms.txt directly by making a GET request to:

https://llmstxt.firecrawl.dev/{YOUR_URL}

For the full version, use:

https://llmstxt.firecrawl.dev/{YOUR_URL}/full

If you have a Firecrawl API key, you can include it to unlock full results and remove limits:

https://llmstxt.firecrawl.dev/{YOUR_URL}?FIRECRAWL_API_KEY=YOUR_API_KEY

For the full version with API key:

https://llmstxt.firecrawl.dev/{YOUR_URL}/full?FIRECRAWL_API_KEY=YOUR_API_KEY
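The four URL patterns above can be assembled with a small helper. This is a sketch using only the Python standard library; `build_generator_url` and `fetch_llms_txt` are hypothetical names, and the code assumes the site URL is passed exactly as you would type it in place of `{YOUR_URL}` and that the response body is UTF-8 text.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

BASE = "https://llmstxt.firecrawl.dev"


def build_generator_url(site_url: str, full: bool = False,
                        api_key: Optional[str] = None) -> str:
    """Build the GET URL for llms.txt (or llms-full.txt when full=True)."""
    url = f"{BASE}/{site_url}"
    if full:
        url += "/full"
    if api_key:
        # Query parameter name taken from the URL patterns above.
        url += "?" + urlencode({"FIRECRAWL_API_KEY": api_key})
    return url


def fetch_llms_txt(site_url: str, full: bool = False,
                   api_key: Optional[str] = None, timeout: float = 300.0) -> str:
    """Fetch the generated file. Generation can take minutes, hence the long timeout."""
    with urlopen(build_generator_url(site_url, full, api_key), timeout=timeout) as resp:
        return resp.read().decode("utf-8")  # assumption: UTF-8 response body


if __name__ == "__main__":
    print(build_generator_url("example.com"))
    print(build_generator_url("example.com", full=True, api_key="YOUR_API_KEY"))
```

Because the key travels in the query string, it can end up in shell history and server logs, so avoid pasting keyed URLs into shared places.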

How to generate your llms.txt file

  1. Visit the generator: Go to https://llmstxt.firecrawl.dev.

  2. Enter your website URL: Input the URL of your website.

  3. Generate the file: Click the generate button and wait a few minutes as the tool processes your site.

  4. Download your files: Once ready, download the llms.txt and llms-full.txt files.

Once your file is live, you can check which other sites have adopted the standard at directory.llmstxt.cloud.
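After downloading, the file has to be served from the root path /llms.txt described earlier. The sketch below derives that root URL from any page on the site and does a rough liveness check. The function names are hypothetical, and the "body starts with #" test is only a heuristic that assumes the file opens with a markdown heading.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def llms_txt_url(page_url: str) -> str:
    """Return the root-level /llms.txt URL for the site hosting page_url."""
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/llms.txt", "", ""))


def is_live(page_url: str, timeout: float = 10.0) -> bool:
    """Heuristic check: the file answers with HTTP 200 and looks like markdown."""
    try:
        with urlopen(llms_txt_url(page_url), timeout=timeout) as resp:
            body = resp.read(2048).decode("utf-8", errors="replace")
            return resp.status == 200 and body.lstrip().startswith("#")
    except (OSError, ValueError):
        # OSError covers HTTP errors, DNS failures, and timeouts;
        # ValueError covers malformed URLs.
        return False


if __name__ == "__main__":
    print(llms_txt_url("https://example.com/docs/intro?x=1#top"))
```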

Do you need an API key?

While an API key is not required, using a free Firecrawl API key removes any usage limits and provides full access to all features.

How llms.txt compares to robots.txt and sitemap.xml

These three files serve different purposes in making your website accessible, but they often work together. Here's what you need to know:

| File | Primary Purpose | Target Audience | Content Format |
| --- | --- | --- | --- |
| robots.txt | Controls what bots can crawl | Search engines and web crawlers | Simple directives (allow/disallow) |
| sitemap.xml | Lists all pages for indexing | Search engines | Structured XML with URLs and metadata |
| llms.txt | Provides context and summaries | Large Language Models | Human-readable markdown with descriptions |

Key differences to remember:

  • robots.txt restricts access; sitemap.xml and llms.txt point crawlers and models toward content
  • sitemap.xml lists every page; llms.txt summarizes key content and structure
  • llms.txt includes natural language descriptions that LLMs can understand, not just URLs
  • You need all three files for complete site discoverability: robots.txt for access control, sitemap.xml for traditional search, llms.txt for AI tools
  • Like robots.txt (which bots may ignore), llms.txt is voluntary: AI tools choose whether to read it
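The format difference shows up in how each file is consumed. In this small sketch, a robots.txt rule answers a yes/no crawl question through Python's standard `urllib.robotparser`, while an llms.txt entry carries a description a model can read. The sample rules, the `ExampleBot` user agent, and the example.com URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: directives only, no descriptions.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

# Hypothetical llms.txt entry: a link plus a natural-language note.
LLMS_TXT_LINE = "- [Quickstart](https://example.com/docs/quickstart.md): Install and run in five minutes"

robots = RobotFileParser()
robots.parse(ROBOTS_TXT.splitlines())

# robots.txt yields a boolean per URL.
private_ok = robots.can_fetch("ExampleBot", "https://example.com/private/report")
docs_ok = robots.can_fetch("ExampleBot", "https://example.com/docs/quickstart.md")

# llms.txt yields text that explains what the linked page is about.
description = LLMS_TXT_LINE.split("): ", 1)[1]

print(private_ok, docs_ok, description)
```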

Why llms.txt is still important in 2026

llms.txt matters more in 2026 than it did when the standard first emerged. The web has gotten more agentic, RAG has become the default architecture, and AI-first discovery is real. Here's why:

AI agents are everywhere now. What started as chatbots has evolved into autonomous agents that browse, research, and take action across the web. These agents need structured context to work effectively, and llms.txt provides exactly that. Without it, agents waste tokens parsing irrelevant content or, worse, misunderstand your site entirely.

Context windows got bigger, but so did the web. Yes, models can now handle massive inputs. But that doesn't mean they should ingest your entire site. llms.txt gives AI the signal of what actually matters, cutting through noise to deliver the content that defines who you are and what you do.

The RAG ecosystem exploded. Retrieval-augmented generation is now the default architecture for knowledge-intensive applications. llms.txt files slot perfectly into RAG pipelines: they're pre-structured, information-dense, and designed for exactly this use case. If your content isn't in llms.txt format, you're making it harder for developers to build on top of your data.
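As one illustration of how such a file might feed a retrieval pipeline, the sketch below turns the link lists into records that could be embedded or indexed. It assumes entries follow the common `- [title](url): note` markdown pattern under H2 headings; `parse_links` is a hypothetical helper, not part of Firecrawl or any RAG framework.

```python
import re
from typing import Dict, List, Optional

# Matches "- [title](url)" with an optional ": note" suffix.
LINK_RE = re.compile(
    r"^-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<note>.*))?$"
)


def parse_links(llms_txt: str) -> List[Dict[str, Optional[str]]]:
    """Extract one record per link, tagged with the H2 section it appears under."""
    records = []
    section = None
    for raw in llms_txt.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            section = line[3:].strip()
            continue
        match = LINK_RE.match(line)
        if match:
            records.append({
                "section": section,
                "title": match["title"],
                "url": match["url"],
                "note": (match["note"] or "").strip(),
            })
    return records


if __name__ == "__main__":
    sample = (
        "# Example\n\n> A demo site.\n\n## Docs\n\n"
        "- [Quickstart](https://example.com/quickstart.md): Setup guide\n"
        "- [API](https://example.com/api.md)\n"
    )
    for record in parse_links(sample):
        print(record)
```

Each record keeps its section name, so a retriever can filter or weight results by section before fetching the linked page.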

AI-first discovery is real. Perplexity surpassed 100 million monthly active users in 2025, and ChatGPT Search launched to all users the same year. But it's not just about competing AI tools: Google itself is changing. AI Overviews now reaches 2.5 billion monthly users, AI Mode tops 1 billion, and Google I/O 2026 made clear that the era of the "ten blue links" is over — Search is becoming an AI-powered experience with agents, generative UI, and conversational answers. If your site doesn't have structured content that LLMs can understand, you're invisible to a growing segment of how people find information. Think of llms.txt as SEO for the AI era.

Standards won. The llms.txt format gained adoption because it's simple, human-readable, and solves a real problem. Anthropic, Vercel, Cloudflare, and hundreds of other developer-facing products now ship with llms.txt files — you can browse adopters at directory.llmstxt.cloud. If you're not on board, you're behind.

The bottom line: llms.txt isn't a nice-to-have anymore. It's infrastructure for the AI-native web.

Google Lighthouse now audits for llms.txt

In May 2026, Google added llms.txt to Chrome Lighthouse's new "Agentic Browsing" audit category. That makes it an official readiness check, not a suggestion buried in a blog post.

The Agentic Browsing audits evaluate "how well your site is constructed for machine interaction." Among the checks: WebMCP integration, accessibility tree integrity, layout stability (CLS), and presence of an llms.txt file. Google's documentation explains the reasoning:

Without llms.txt, agents may spend more time crawling the site to understand its high-level structure and primary content.

The audit doesn't produce a traditional 0-100 Lighthouse score. Instead, Google surfaces a fractional pass ratio with pass/fail checks tied to agentic readiness signals.

The tension worth understanding. Google still says llms.txt is not needed for AI search rankings. Its developer documentation explicitly states: "You don't need to create new machine readable files, AI text files, markup, or Markdown to appear in generative AI search." But the Lighthouse check focuses on a different use case: how well your site serves AI agents and browser tools, not how well it ranks in Search.

SEO expert Lily Ray asked Google's John Mueller directly about the tension, noting that Google itself publishes llms.txt files and markdown pages. Mueller drew a useful distinction in his reply: "discovery" (getting found by a search engine) versus "functionality" (helping an agent or user do something once they've arrived).

His conclusion: for developer documentation and technical content, llms.txt is worth adding to help AI coding tools parse and use your content efficiently. For consumer sites like e-commerce, he's direct: "Making a markdown version of a shoe's specs is not going to get you more sales."

What this means for you. The Lighthouse audit is an opt-in readiness check, not a ranking factor. If your site serves developers, hosts technical documentation, or sees meaningful agent traffic, llms.txt is worth adding. If you're running a consumer site, focus on the fundamentals first and treat llms.txt as a low-effort addition once those are covered. Use llmstxt.firecrawl.dev to generate yours in minutes.

Frequently Asked Questions

Do I need to update my llms.txt file regularly?

Yes, update it when you add major content sections, change your site structure, or launch new features. For dynamic sites, consider regenerating monthly or after significant updates.

Can I manually edit the generated llms.txt file?

Yes. The generator creates a starting point, but you can edit the file to emphasize important sections, remove irrelevant content, or improve descriptions for better AI understanding.

Will having an llms.txt file affect my search engine rankings?

No direct impact on traditional SEO. Google explicitly says you do not need to create llms.txt files to appear in generative AI search. However, Chrome Lighthouse now checks for it as an agentic readiness signal, and as AI-powered discovery grows, structured content may become increasingly valuable for visibility.

How does Firecrawl help with generating llms.txt files?

Firecrawl crawls your website, extracts clean content, and uses gpt-4o-mini to structure it into the llms.txt format. It handles JavaScript-heavy sites and provides both API and web interface access.

Can I use llms.txt for internal documentation or private sites?

Yes. While the standard assumes public access, you can generate llms.txt files for private documentation, internal wikis, or knowledge bases to improve how AI tools interact with your content.

Is llms.txt supported by major AI platforms?

Adoption is growing. Some AI tools and chatbots already look for llms.txt files when accessing websites. Check the directory at directory.llmstxt.cloud to see which platforms have adopted the standard.

Does Google use llms.txt for search rankings?

No. Google's documentation explicitly states you do not need to create llms.txt files to appear in generative AI search results. However, Chrome Lighthouse's Agentic Browsing category now checks for the file as a signal of how well your site is structured for AI agent interactions, not for ranking purposes.

What is Chrome Lighthouse's Agentic Browsing audit?

A new Lighthouse audit category added in May 2026 that evaluates how well a site is structured for machine interaction. It checks for WebMCP integration, accessibility tree integrity, layout stability, and the presence of an llms.txt file. It produces a pass/fail ratio rather than a 0-100 score.

What is the difference between llms.txt and llms-full.txt?

The llms.txt file provides a concise, structured overview of your site with links to key sections. The llms-full.txt file includes the full content of each page, giving AI tools a complete picture of your documentation or site content. Most use cases are well-served by the standard llms.txt file.

What is the llms.txt standard and who created it?

The llms.txt standard is a proposed format for a markdown file placed at the root of a website, designed to help large language models understand your site's content and structure. It was proposed by Jeremy Howard as a machine-readable complement to sitemaps and robots.txt files.