Retell’s AI phone agents get LLM-ready content from Firecrawl
Eric Ciarla
Nov 30, 2025

When a borrower checks application status or a lead answers an outbound call, Retell’s AI phone agents have seconds to apply the latest rules. Retell is the leading voice agent platform powering fully autonomous phone calls for modern businesses, built to sound like real conversations, not touch-tone phone menus. These agents sit inside revenue and ops workflows, so sounding human is not enough; they have to answer from live help-center content, policies, eligibility criteria, and technical docs. Retell ships with a built-in Knowledge Base that ingests URLs, files, and custom text so agents use the same sources human teams trust. As larger customers came on with multi-product documentation and JavaScript-heavy portals, each new account meant another fragile scraper and another round of “did we miss this page?” debugging.

To keep those knowledge bases current, the team was maintaining a mix of one-off Puppeteer/Playwright scripts, sitemap crawls, and manual copy-paste from docs sites. They could wire bespoke scrapers for each account, rely on occasional exports and sitemaps, or accept that some calls would route through outdated answers. These setups shipped, but they did not meet the bar Retell sets for call containment and cost per call. What they wanted instead was a straightforward pipeline from documentation to LLM-ready content and into a knowledge base they could treat as configuration, not as a new engineering project every time a customer signed. Firecrawl’s web scraping API, already powering more than 40,000 knowledge bases, now fills that gap, turning each customer’s documentation into LLM-ready content their agents can use with confidence.

Retell now standardizes on Firecrawl’s web scraping API, specifically the /scrape endpoint, as the ingestion primitive behind these knowledge bases. Firecrawl’s web context engine takes a docs URL—an API reference, changelog, status page, or support center—and turns it into LLM-ready markdown or JSON, stripping navigation, boilerplate, and ad-like clutter. Backed by Firecrawl’s Fire-Engine browser layer, it also handles JavaScript-heavy docs and PDFs without Retell running its own headless browser fleet or proxy rotation. Instead of writing a new scraper for every customer, the team defines small, focused scrape jobs around the URLs that actually matter and lets Firecrawl handle pagination, navigation, and rendering. The output drops into the embedding and retrieval stack that already powers their agents, so extending coverage is usually a matter of adding a URL to a Firecrawl job, not re-architecting ingestion.
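In practice, the pattern described here reduces to a single call per page. The snippet below is a minimal sketch of hitting the /scrape endpoint, not Retell's production code; it assumes the v1 REST path at api.firecrawl.dev, a `FIRECRAWL_API_KEY` environment variable, and a placeholder docs URL.

```python
import os
import requests

FIRECRAWL_SCRAPE_URL = "https://api.firecrawl.dev/v1/scrape"  # adjust to the API version you use
API_KEY = os.environ["FIRECRAWL_API_KEY"]

def scrape_docs_page(url: str) -> str:
    """Fetch one documentation URL and return LLM-ready markdown."""
    resp = requests.post(
        FIRECRAWL_SCRAPE_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={
            "url": url,
            "formats": ["markdown"],   # ask for markdown with boilerplate stripped
            "onlyMainContent": True,   # drop navigation, headers, and footers
        },
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["data"]["markdown"]

if __name__ == "__main__":
    md = scrape_docs_page("https://docs.example.com/api-reference")  # placeholder URL
    print(md[:500])
```

The markdown that comes back can be chunked and embedded directly, which is what makes the endpoint usable as an ingestion primitive rather than one stage of a larger scraping system.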

As they rolled Firecrawl into production, Retell experimented with different endpoints to match their knowledge base architecture. They used /map to understand docs structures and /scrape to hydrate the content itself, and found that leading with targeted /scrape jobs on a short list of each customer’s docs and help center links gave them the predictable coverage they wanted for voice agents. Together with the Firecrawl team, they documented a straightforward “docs → scrape → knowledge base” pattern, plus clear guidance on where /batch-scrape, /crawl, and /map make sense for larger, multi-domain documentation sets.
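A rough sketch of that "docs → scrape → knowledge base" pattern is shown below: /map to see what a docs site contains, then targeted /scrape calls on the short list that matters. The endpoints match those named above, but the URL filter, page cap, and example domains are hypothetical placeholders.

```python
import os
import requests

BASE = "https://api.firecrawl.dev/v1"
HEADERS = {"Authorization": f"Bearer {os.environ['FIRECRAWL_API_KEY']}"}

def map_site(root_url: str) -> list[str]:
    """Use /map to list the URLs Firecrawl can see under a docs site."""
    resp = requests.post(f"{BASE}/map", headers=HEADERS, json={"url": root_url}, timeout=60)
    resp.raise_for_status()
    return resp.json().get("links", [])

def scrape_pages(urls: list[str]) -> dict[str, str]:
    """Hydrate a hand-picked list of URLs into markdown via /scrape."""
    docs = {}
    for url in urls:
        resp = requests.post(
            f"{BASE}/scrape",
            headers=HEADERS,
            json={"url": url, "formats": ["markdown"], "onlyMainContent": True},
            timeout=60,
        )
        resp.raise_for_status()
        docs[url] = resp.json()["data"]["markdown"]
    return docs

# Inspect structure first, then scrape only the pages that matter for the agent.
all_links = map_site("https://help.example.com")                 # placeholder docs site
targets = [u for u in all_links if "/articles/" in u][:25]       # hypothetical filter and cap
knowledge_base_chunks = scrape_pages(targets)
```

Leading with an explicit URL list keeps coverage predictable, which matters more for voice agents than exhaustively crawling every page a customer has ever published.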

The result is a scrape-first integration pattern that keeps pace with Retell’s pipeline. For each new customer, Retell keeps a short list of that customer’s docs and help center links as configuration for Firecrawl jobs instead of hard-coding them into bespoke scripts. When a job works well for one account (for example, scraping a multi-language API reference plus a changelog), it becomes a template Retell can reuse across that vertical. Refreshing a knowledge base means rerunning Firecrawl jobs or adjusting their config, not touching scraping infrastructure. Today, Retell’s production phone agents run on top of Firecrawl-powered knowledge bases that stay aligned with each customer’s docs without a scraping team in the loop. Any team building AI assistants or knowledge bases can follow the same approach: point Firecrawl at a short list of docs and help center links, feed the LLM-ready output into their retrieval stack, and ship agents that answer from live documentation, not stale exports.
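One way to treat those source lists as configuration is a plain per-customer mapping that a scheduled job reruns. The customer name, URLs, and `upsert_into_knowledge_base` helper below are hypothetical stand-ins for whatever embedding and retrieval stack sits behind the agents, not Retell's actual setup.

```python
import os
import requests

# Per-customer source lists kept as configuration, not code (placeholder names and URLs).
KNOWLEDGE_SOURCES = {
    "acme-lending": [
        "https://docs.acme-lending.example/api-reference",
        "https://docs.acme-lending.example/changelog",
        "https://help.acme-lending.example/eligibility",
    ],
}

HEADERS = {"Authorization": f"Bearer {os.environ['FIRECRAWL_API_KEY']}"}

def upsert_into_knowledge_base(customer: str, url: str, markdown: str) -> None:
    # Stand-in for the embedding and retrieval pipeline; replace with your own.
    print(f"[{customer}] refreshed {url} ({len(markdown)} chars)")

def refresh_knowledge_base(customer: str) -> None:
    """Rerun the customer's Firecrawl jobs and push fresh markdown into retrieval."""
    for url in KNOWLEDGE_SOURCES[customer]:
        resp = requests.post(
            "https://api.firecrawl.dev/v1/scrape",
            headers=HEADERS,
            json={"url": url, "formats": ["markdown"], "onlyMainContent": True},
            timeout=60,
        )
        resp.raise_for_status()
        upsert_into_knowledge_base(customer, url, resp.json()["data"]["markdown"])

if __name__ == "__main__":
    refresh_knowledge_base("acme-lending")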
