What is crawl depth limit?
Crawl depth limit caps how far a crawler will venture from its starting URL before it stops following new links. Most developers think of depth as the number of link hops from the seed URL (the homepage is depth 0, pages linked from it are depth 1, and so on). Some crawlers, however, measure depth by discovery order rather than link distance: a page discovered directly from a sitemap counts as depth 0 even if it is five clicks deep in the site's navigation.
| Depth model | How depth is counted | Implication |
|---|---|---|
| Link-hop depth | Clicks from seed URL | Deeply nested pages need a high limit |
| Discovery depth | Order in which URLs are found | Sitemapped pages are depth 0 regardless of nesting |
| Path depth | Segments in the URL path | Does not reflect link structure |
| No limit | Follows all reachable links | Can run indefinitely on large sites |
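The link-hop model in the table above is easy to see in code: a breadth-first traversal records each page's hop count from the seed and stops following links once a page sits at the limit. This is a minimal sketch on a toy in-memory link graph (the `SITE` dict and `crawl` helper are illustrative, not any real crawler's API):

```python
from collections import deque

# Toy site: each URL maps to the links found on that page.
SITE = {
    "/": ["/docs", "/blog"],
    "/docs": ["/docs/install", "/docs/api"],
    "/blog": ["/blog/post-1"],
    "/docs/install": ["/docs/install/linux"],
}

def crawl(seed, max_depth):
    """Breadth-first crawl that records each page's link-hop depth
    and stops following links from pages at max_depth."""
    depths = {seed: 0}
    queue = deque([seed])
    while queue:
        url = queue.popleft()
        if depths[url] >= max_depth:
            continue  # page at the boundary: keep it, but don't follow its links
        for link in SITE.get(url, []):
            if link not in depths:
                depths[link] = depths[url] + 1
                queue.append(link)
    return depths

# With max_depth=2, /docs/install/linux (3 hops from the seed) is never reached.
print(crawl("/", 2))
```

Under this model, raising `max_depth` to 3 is the only way to reach `/docs/install/linux`, which is why deeply nested pages need a high limit in the link-hop row of the table.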
Setting a depth limit prevents a crawl from running indefinitely on large or deeply nested sites, at the cost of missing content beyond that boundary. A limit of 2 using link-hop depth captures the homepage and two levels of navigation, which covers most documentation sites; the same limit using discovery depth would still include sitemapped pages regardless of where they sit in the URL hierarchy. The right value depends on where the content you need actually lives. Too shallow and you miss deep but valuable pages; too deep and the crawl balloons in size and cost.
In Firecrawl's Crawl API, the `maxDiscoveryDepth` parameter uses discovery order: the starting URL and any pages found in the site's sitemap have a discovery depth of 0, and each subsequent layer of links increments the depth by 1. This means `maxDiscoveryDepth: 1` will still return sitemapped pages even if they are nested several directories deep, because they were discovered directly rather than through link traversal.
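Discovery-order counting can be modeled as a crawl whose frontier is seeded with both the start URL and every sitemapped URL at depth 0, with each layer of followed links adding 1. The sketch below shows only the counting model, not Firecrawl's implementation; the `SITE` graph and `SITEMAP` list are made up for illustration:

```python
from collections import deque

# Toy site: each URL maps to the links found on that page.
SITE = {
    "/": ["/docs"],
    "/docs": ["/docs/guides"],
    "/docs/guides": ["/docs/guides/advanced/tuning"],
}
# The sitemap lists a deeply nested page directly.
SITEMAP = ["/", "/docs/guides/advanced/tuning"]

def crawl(max_discovery_depth):
    """Seed the frontier with the start URL and all sitemapped URLs at
    discovery depth 0; each layer of followed links increments depth by 1."""
    depths = {url: 0 for url in SITEMAP}
    queue = deque(SITEMAP)
    while queue:
        url = queue.popleft()
        if depths[url] >= max_discovery_depth:
            continue  # don't follow links from pages at the limit
        for link in SITE.get(url, []):
            if link not in depths:
                depths[link] = depths[url] + 1
                queue.append(link)
    return depths

# With a limit of 1, the sitemapped page three directories deep is still
# included (depth 0), while /docs/guides (link depth 2) is not discovered.
print(crawl(1))
```

Note how `/docs/guides/advanced/tuning` is returned at depth 0 despite its nesting, mirroring the behavior described above, while `/docs/guides`, which is only reachable through two link hops, falls outside the limit.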