
How do I get Codex to fetch webpages for documentation?

Codex's built-in web search returns snippets from a pre-indexed cache, not the full content of a webpage. That means if you ask Codex to read a library's documentation page, it gets a brief excerpt rather than the complete text it needs to answer questions about the API accurately. To give Codex full-page access to documentation, connect Firecrawl via MCP, which adds firecrawl_scrape and firecrawl_crawl as native tools alongside the built-in search.
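One way to wire this up is to register the Firecrawl MCP server in Codex's configuration. This is a minimal sketch: it assumes Codex CLI's `config.toml` MCP server format and the `firecrawl-mcp` npm package; the file location and server name may vary with your setup.

```toml
# ~/.codex/config.toml (assumed default Codex CLI config location)
# Registers the Firecrawl MCP server, exposing firecrawl_scrape
# and firecrawl_crawl as tools Codex can call directly.
[mcp_servers.firecrawl]
command = "npx"
args = ["-y", "firecrawl-mcp"]
env = { "FIRECRAWL_API_KEY" = "fc-YOUR-API-KEY" }
```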

| Method | What Codex receives | Full page content | Best for |
| --- | --- | --- | --- |
| Built-in search (cached) | Pre-indexed snippets | No | Quick factual lookups |
| Built-in search (web_search = "live") | Live snippets | No | Recent information |
| firecrawl_scrape via MCP | Full page as clean markdown | Yes | Single documentation pages |
| firecrawl_crawl via MCP | All pages on a site | Yes | Complete documentation sites |

Use built-in search when a snippet is enough and you want no external setup. Use firecrawl_scrape when Codex needs to read a specific page in full, such as an API reference or a changelog. Use firecrawl_crawl when you want Codex to ingest an entire documentation site so it can answer questions across multiple pages without repeated lookups.
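Concretely, the difference between the two tools shows up in the arguments Codex sends when it calls them. The sketch below is illustrative: the URLs are hypothetical, and parameter names like formats and limit follow Firecrawl's public API but may differ across MCP server versions.

```json
[
  {
    "tool": "firecrawl_scrape",
    "arguments": { "url": "https://docs.example.com/api/reference", "formats": ["markdown"] }
  },
  {
    "tool": "firecrawl_crawl",
    "arguments": { "url": "https://docs.example.com", "limit": 50 }
  }
]
```

A scrape targets one page and returns its markdown; a crawl starts from a root URL and walks linked pages up to a limit, so Codex can answer cross-page questions from a single ingestion.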

Firecrawl's agent-first web index converts pages to clean, LLM-ready markdown rather than raw HTML, so Codex gets content it can use immediately without post-processing noise. The Firecrawl CLI is the fastest way to get started: install it, authorize with your API key, and Codex can fetch any documentation page on demand.

```shell
# Install the Firecrawl CLI and set up all integrations
npx -y firecrawl-cli@latest init --all --browser
# Authorize with your Firecrawl API key
firecrawl login --api-key fc-YOUR-API-KEY
```
Last updated: May 06, 2026