Web Search in Hermes Agent: What's Built In and How to Use It
Hiba Fathima
Apr 30, 2026

Hermes runs on a server, works on a schedule, and delivers results without anyone in the loop. For that to be useful, web access has to actually work — not just return links that the agent then fails to read.

This guide covers the three layers of web access Hermes provides, what each one does, and how to use them for the kinds of tasks the agent is built for.

TL;DR

  • Hermes ships with web_search and web_extract as its two primary web tools, plus a full browser toolset (browser_navigate, browser_snapshot, browser_vision, and seven others) for interactive sessions
  • web_search returns titles, URLs, and descriptions — with Firecrawl as the backend, it reads page content and returns markdown instead of snippets
  • web_extract fetches a specific URL as markdown; without Firecrawl it falls back to plain HTTP and fails on JS-rendered pages
  • Firecrawl is the default backend: set FIRECRAWL_API_KEY in ~/.hermes/.env and both tools route through it automatically
  • Firecrawl and Tavily both support crawling; Parallel and Exa do not — Firecrawl is the default and most complete pipeline
  • Nous Portal subscribers can use the Tool Gateway to skip provider API keys entirely
  • The browser tools handle content that only appears after an interaction — login flows, form submissions, dynamically rendered tables

For a broader introduction to the agent — installation, memory, and scheduling — see the Hermes Agent guide.

What web tools does Hermes agent ship with?

Hermes organizes web access into three distinct tool categories, each with a specific job.

web_search takes a query string and returns up to 5 results by default (the limit parameter goes up to 100). Each result includes a title, URL, and description. With Firecrawl as the backend, the search endpoint reads each page and returns markdown content alongside the metadata. With other providers, the description field is a standard snippet.

web_extract takes a URL and returns the full page content as markdown. Pages under 5,000 characters come back in full; larger pages are LLM-summarized before being returned. Without a configured backend, it makes a plain HTTP request — which fails on JS-rendered pages or anything behind bot protection. With Firecrawl, it routes through real browser rendering automatically.
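The size handling amounts to a simple threshold check. A minimal sketch, with names invented for illustration (this is not Hermes's internal code):

```python
# Hypothetical sketch of web_extract's size handling: pages at or under
# 5,000 characters come back verbatim; longer ones are summarized by the
# model first. All names here are illustrative, not Hermes internals.
SUMMARIZE_THRESHOLD = 5_000

def prepare_extract_result(markdown: str, summarize) -> str:
    """Return page markdown in full, or an LLM summary if it is too long."""
    if len(markdown) <= SUMMARIZE_THRESHOLD:
        return markdown
    return summarize(markdown)  # stands in for the LLM summarization step
```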

Browser tools (browser_navigate, browser_snapshot, browser_vision, and seven others) operate on a live browser session. They're the right choice when the content only appears after an interaction: clicking a button, submitting a form, scrolling through paginated results. Multiple cloud backends are supported — Browserbase, Browser Use, Firecrawl, and Camofox for local anti-detection — each with different tradeoffs on cost, stealth, and setup.

The two most common patterns:

  • Discover then read: web_search finds the relevant pages, web_extract reads one in full
  • Navigate then interact: browser_navigate opens the page, browser_snapshot reads the accessibility tree, follow-up tools click and type

|                 | web_search | web_extract |
|-----------------|------------|-------------|
| Input           | Query string | URL |
| Output          | Title, URL, description (markdown content with Firecrawl) | Full page markdown; LLM-summarized above 5k chars |
| JavaScript      | Handled by provider | Plain HTTP by default; real browser with Firecrawl |
| Default backend | Firecrawl | Firecrawl |
| Best for        | Finding relevant pages | Reading a known URL in full |
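The discover-then-read pattern fits in a few lines. A hypothetical sketch — web_search and web_extract here stand in for the agent's tool calls, passed as plain functions, and the signatures are assumptions for illustration:

```python
# Hypothetical "discover then read" loop. The tool names come from the
# article; the callable signatures are assumed for illustration.
def research(query: str, web_search, web_extract, top_n: int = 3) -> list[str]:
    """Search for relevant pages, then read each of the top hits in full."""
    results = web_search(query, limit=top_n)          # titles, URLs, descriptions
    return [web_extract(r["url"]) for r in results]   # full markdown per page
```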

How do I configure the web backend in Hermes?

Hermes auto-detects which backend to use from whichever API key is in ~/.hermes/.env. With FIRECRAWL_API_KEY set, both web_search and web_extract route through Firecrawl automatically — no config file changes needed.

# ~/.hermes/.env
FIRECRAWL_API_KEY=fc-YOUR-API-KEY

To confirm what's active:

hermes tools

To pin a specific backend instead of relying on auto-detection, run hermes tools, select Web, and pick from the menu. The four supported backends and their capabilities:

| Backend             | Search | Extract | Crawl | API key |
|---------------------|--------|---------|-------|---------|
| Firecrawl (default) | ✓      | ✓       | ✓     | FIRECRAWL_API_KEY |
| Tavily              | ✓      | ✓       | ✓     | TAVILY_API_KEY |
| Parallel            | ✓      | ✓       | ✗     | PARALLEL_API_KEY |
| Exa                 | ✓      | ✓       | ✗     | EXA_API_KEY |

Firecrawl and Tavily both support site-wide crawling. Parallel and Exa don't. All four handle search and single-page extraction.

For a self-hosted Firecrawl instance, set FIRECRAWL_API_URL alongside the API key.
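A self-hosted setup keeps the same one-file configuration; the URL below is a placeholder for your own deployment:

```shell
# ~/.hermes/.env — self-hosted Firecrawl; the URL is a placeholder
FIRECRAWL_API_KEY=fc-YOUR-API-KEY
FIRECRAWL_API_URL=https://firecrawl.internal.example.com
```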

Can I use web search without a separate API key?

Paid Nous Portal subscribers can skip the provider keys entirely. The Tool Gateway routes web_search and web_extract through Firecrawl on your behalf, billed through your subscription. Image generation, TTS, and browser automation are also available through the same gateway.

To enable it, run hermes model, select Nous Portal, and opt in when the gateway prompt appears. You can also toggle it per-tool via hermes tools by selecting "Nous Subscription" as the provider.

Why is Firecrawl the default, and what does it actually change?

With Firecrawl configured, both web tools behave differently in ways that matter for automation.

For web_search: the request hits Firecrawl's search endpoint, which finds pages and reads their content — returning markdown rather than the short descriptions other providers return. For tasks where the agent needs to synthesize information from multiple sources in one shot, this matters: the content is already there without a follow-up extract call per result.

For web_extract: the request routes through Firecrawl's scrape endpoint with real browser rendering. A JS-heavy product page, a Cloudflare-protected site, a page that requires a cookie click before content renders — these all work correctly through Firecrawl and fail without it. The upgrade is automatic once the API key is set.

For crawling: when the agent needs an entire docs site, all blog posts under a domain, or a structured multi-page resource, it can trigger a crawl job in natural language:

"Crawl the docs.example.com/api section and summarize each endpoint"

Firecrawl maps the site and returns every page under the specified path as markdown. Parallel and Exa don't support crawling; Tavily does and can be used as an alternative if needed.

How do I run web search on a schedule?

The most practical advantage of Hermes's web tools isn't a single capability — it's that they run unattended, on a schedule, with no rate limits imposed by the platform.

The built-in cron system lets you set up recurring web tasks from a natural language description:

"Every Monday morning, search for new LLM agent papers published in the past week and send a summary to my Telegram"

The agent creates a cron entry. Every Monday, it runs web_search, compiles the results, and delivers the summary. You don't touch anything between runs.

Some patterns that come up repeatedly:

  • Research digests: search a topic on a schedule, summarize findings, deliver via Telegram or email
  • Change monitoring: extract a pricing or status page weekly, alert when something changes
  • Competitive tracking: crawl a competitor's blog or changelog, surface new content
  • Data collection: search for new listings, filings, or posts; extract structured data; write to a file or database

The only limit is your API budget — Hermes doesn't impose daily run caps.

What if the content is behind a form, login, or dynamic page?

web_search and web_extract cover most research tasks. The browser layer handles the rest: content behind login walls, dynamically rendered tables, form submissions, paginated results that only appear after a click.

browser_navigate opens a URL in a live browser session and executes JavaScript fully before returning. Use this instead of web_extract when the content you need only appears after the page loads client-side.

browser_snapshot returns the page's accessibility tree — every visible element, link, and input field, each tagged with a ref ID like @e1, @e2. Follow-up browser calls use these IDs to click (browser_click) or type (browser_type). It's a text representation, not a screenshot.
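How follow-up calls resolve those ref IDs can be pictured with a small parser. The tree's line format below (role "name" @ref) is invented for the example, not the actual snapshot syntax:

```python
import re

# Hypothetical parser showing how @eN ref IDs tie snapshot elements to
# follow-up actions like browser_click. The line format parsed here,
# role "name" @ref, is invented for illustration.
def parse_refs(snapshot_text: str) -> dict:
    """Map each (role, name) pair in the tree to its @eN ref ID."""
    refs = {}
    for line in snapshot_text.splitlines():
        m = re.match(r'\s*(\w+) "([^"]*)" (@e\d+)', line)
        if m:
            role, name, ref = m.groups()
            refs[(role, name)] = ref
    return refs
```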

browser_vision takes a screenshot of the current page and analyzes it with the model's vision capability — for pages where the accessibility tree alone doesn't capture what matters: charts, complex table formatting, CAPTCHA challenges, or anything that's meaningful visually but not as plain text.

A typical interaction pattern:

"Go to the SEC EDGAR full-text search, search for filings from [company] in the last 30 days, and extract the filing dates and document types from the results table"

The agent navigates, reads the snapshot to find the search field, types the query, submits, reads the results. Structured data that would never surface through a plain search query.

The full browser toolset includes 10 tools: browser_navigate, browser_snapshot, browser_vision, browser_click, browser_type, browser_scroll, browser_press, browser_back, browser_get_images, and browser_console (for reading JS errors from the live page).

For agents running on a server, switching to a cloud browser backend avoids the need for a local Chromium install. Firecrawl is one option — configured with the same FIRECRAWL_API_KEY already used for web search, via hermes tools → Browser Automation → Firecrawl.

What parameters does web_search support?

query and limit work across all providers. Backend-specific options:

| Parameter            | Description | Provider |
|----------------------|-------------|----------|
| query                | Search query (required) | All |
| limit                | Number of results, 1–100 (default 5) | All |
| search_depth         | basic or advanced | Tavily |
| include_domains      | Restrict results to specific domains | Firecrawl, Tavily, Exa |
| exclude_domains      | Exclude domains from results | Firecrawl, Tavily |
| start_published_date | Results published after this date | Exa |
| end_published_date   | Results published before this date | Exa |
| use_autoprompt       | Let Exa rewrite the query for neural search | Exa |

The query field passes search operators through to the backend when supported — site:, intitle:, filetype:, -term, and "exact phrase" work with Firecrawl and most other backends.
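A few operator combinations, assuming the backend passes them through (the domains and terms are placeholders):

```python
# Example query strings using pass-through search operators.
# The domains and terms are placeholders, not real targets.
queries = [
    'site:docs.example.com "rate limit"',     # one domain, exact phrase
    'filetype:pdf intitle:benchmark -draft',  # PDFs, title match, excluded term
]
```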

Conclusion

Hermes's web pipeline is designed for agents that run continuously. web_search and web_extract handle the majority of research tasks from one API key, with Firecrawl providing real browser rendering for both. The browser layer handles interactive content. The cron system means any of this runs on a schedule with no involvement between runs.

Setup is minimal: one line in ~/.hermes/.env, and both search and extraction are live.

For the full tool reference, see the Hermes tools docs. For a broader look at what you can build on top — memory, skills, scheduling, and subagents — see the Hermes Agent guide.

Frequently Asked Questions

How does web search work in Hermes agent?

Hermes ships with a web_search tool that sends queries to a configurable backend and returns results as titles, URLs, and descriptions. Firecrawl is the default: when FIRECRAWL_API_KEY is set in ~/.hermes/.env, the agent routes web_search calls through Firecrawl's search endpoint, which finds relevant pages and returns clean markdown. Other supported backends are Tavily, Exa, and Parallel. Firecrawl and Tavily both support crawling; Parallel and Exa do not. As the default, Firecrawl covers search, scraping, and crawling from a single API key.

What is the difference between web_search and web_extract in Hermes?

web_search sends a query string to your configured provider (Firecrawl by default) and returns a ranked list of results. web_extract takes a specific URL and fetches its full content as markdown — and LLM-summarizes pages over 5000 characters before returning them. Use web_search when the agent needs to find the right pages first; use web_extract when you already know the URL. Without Firecrawl, web_extract falls back to plain HTTP and fails on JS-rendered pages. With Firecrawl set as the backend, both tools route through Firecrawl's infrastructure automatically.

Which web search providers does Hermes agent support?

Hermes supports four: Firecrawl, Parallel, Tavily, and Exa. The backend is auto-detected from whichever API key is present in your environment, with Firecrawl as the default. You can pin a provider explicitly by running hermes tools and selecting from the interactive menu, which writes the web: block to config.yaml. Firecrawl and Tavily both support crawling; Parallel and Exa do not. Firecrawl is the default and covers the most ground — search, extract, and crawl from one API key.

Does Hermes agent support browser automation?

Yes. Hermes ships a full browser toolset with 10 tools: browser_navigate, browser_snapshot, browser_click, browser_type, browser_scroll, browser_press, browser_back, browser_get_images, browser_vision, and browser_console. Multiple cloud backends are supported — Browserbase, Browser Use, Firecrawl cloud browsers, and Camofox for local anti-detection browsing. browser_snapshot returns an accessibility tree with ref IDs the agent uses for clicking and typing. browser_vision takes a screenshot and analyzes it with vision AI. Paid Nous Portal subscribers can access browser automation through the Tool Gateway without a separate API key.

What is the Tool Gateway in Hermes?

Tool Gateway is a Nous Portal feature that lets paid subscribers use web search, image generation, TTS, and browser automation without configuring separate provider API keys. For web tools, the gateway routes web_search and web_extract through Firecrawl. To enable it, run hermes model, select Nous Portal, and opt in when prompted — or run hermes tools and select 'Nous Subscription' as the provider for any tool. This is useful for teams that want web access without managing API keys for Firecrawl, Tavily, or Exa separately.

How do I configure Firecrawl as the web backend in Hermes agent?

Set FIRECRAWL_API_KEY in ~/.hermes/.env. Hermes auto-detects it and routes web_search and web_extract through Firecrawl automatically, with no other config change needed. To confirm the active provider, run hermes tools. If you want to pin Firecrawl explicitly, run hermes tools, select Web from the menu, and pick Firecrawl — the wizard writes the provider choice to config.yaml. To point Hermes at a self-hosted Firecrawl instance, set FIRECRAWL_API_URL alongside the API key.

Why does web_extract return empty content for some pages?

Without Firecrawl configured, web_extract falls back to plain HTTP and extracts content from raw HTML. Pages that render client-side via JavaScript return empty or incomplete responses because the content never appears in the initial HTML. Setting FIRECRAWL_API_KEY fixes this: web_extract routes through Firecrawl's scrape endpoint, which uses real browser rendering, and returns actual page content regardless of how the site is built. Pages behind bot protection or Cloudflare challenges also fail plain HTTP but work correctly through Firecrawl.

Can Hermes agent crawl a whole site, not just individual pages?

Yes, when Firecrawl or Tavily is configured as the web backend. Firecrawl and Tavily both support site-wide crawling; Parallel and Exa do not. The agent can trigger a crawl job in natural language and pull content from every page under a domain. For Firecrawl, the same FIRECRAWL_API_KEY that enables search and extract also covers crawl — no separate configuration needed.

How do I set up a scheduled web search task in Hermes agent?

Hermes has a built-in cron system. From a session, describe the schedule you want in natural language: 'Every Monday at 8am, search for the latest AI research papers on [topic] and send a summary to my Telegram.' The agent creates a cron entry and runs the task on that schedule. The web_search step uses whichever provider is configured — Firecrawl by default. There are no daily run limits; the only constraint is your API budget.
