How do I get Codex to fetch webpages for documentation?
Codex's built-in web search returns snippets from a pre-indexed cache, not the full content of a webpage. If you ask Codex to read a library's documentation page, it gets a brief excerpt rather than the complete text it needs to answer questions about the API accurately. To give Codex full-page access to documentation, connect Firecrawl via MCP, which adds firecrawl_scrape and firecrawl_crawl as native tools alongside the built-in search.
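To make "native tools" concrete, here is a rough sketch of the kind of MCP request a client sends when it invokes a tool. The JSON-RPC envelope with the tools/call method comes from the MCP specification. The helper name build_tool_call, the example URL, and the url and formats arguments are assumptions made for illustration, so check the Firecrawl MCP server's tool schema before relying on them.

```python
import json


def build_tool_call(tool_name: str, arguments: dict, request_id: int = 1) -> str:
    """Serialize an MCP tools/call request (hypothetical helper, illustration only)."""
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(request, indent=2)


# Assumed argument names; the real schema is defined by the Firecrawl MCP server.
scrape_request = build_tool_call(
    "firecrawl_scrape",
    {"url": "https://example.com/docs/api", "formats": ["markdown"]},
)
print(scrape_request)
```

Codex builds and sends requests like this itself once the MCP server is connected, so you never write them by hand. The sketch only shows what sits behind a tool call.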
| Method | What Codex receives | Full page content | Best for |
|---|---|---|---|
| Built-in search (cached) | Pre-indexed snippets | No | Quick factual lookups |
| Built-in search (web_search = "live") | Live snippets | No | Recent information |
| firecrawl_scrape via MCP | Full page as clean markdown | Yes | Single documentation pages |
| firecrawl_crawl via MCP | All pages on a site | Yes | Complete documentation sites |
Use built-in search when a snippet is enough and you want no external setup. Use firecrawl_scrape when Codex needs to read a specific page in full, such as an API reference or a changelog. Use firecrawl_crawl when you want Codex to ingest an entire documentation site so it can answer questions across multiple pages without repeated lookups.
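The guidance above can be written out as a small decision rule. This is only a sketch: FetchMethod and choose_fetch_method are hypothetical names invented here, not part of Codex or Firecrawl, and a real choice will also depend on cost and on how large the site is.

```python
from enum import Enum


class FetchMethod(Enum):
    """The three options compared in the table (hypothetical enum)."""
    BUILTIN_SEARCH = "built-in search"
    FIRECRAWL_SCRAPE = "firecrawl_scrape"
    FIRECRAWL_CRAWL = "firecrawl_crawl"


def choose_fetch_method(needs_full_page: bool, page_count: int = 1) -> FetchMethod:
    """Pick a method from whether full text is needed and how many pages are involved."""
    if page_count < 1:
        raise ValueError("page_count must be at least 1")
    if not needs_full_page:
        # A snippet is enough, so no external setup is required.
        return FetchMethod.BUILTIN_SEARCH
    if page_count == 1:
        # One specific page in full, such as an API reference or a changelog.
        return FetchMethod.FIRECRAWL_SCRAPE
    # Questions that span many pages of a documentation site.
    return FetchMethod.FIRECRAWL_CRAWL
```

For example, choose_fetch_method(needs_full_page=True, page_count=1) returns FetchMethod.FIRECRAWL_SCRAPE, which matches the "single documentation pages" row of the table.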
Firecrawl's agent-first web index converts pages to clean, LLM-ready markdown rather than raw HTML, so Codex gets content it can use immediately without post-processing noise. The Firecrawl CLI is the fastest way to get started: install it, authorize with your API key, and Codex can fetch any documentation page on demand.
npx -y firecrawl-cli@latest init --all --browser
firecrawl login --api-key fc-YOUR-API-KEY