What is neural search?
Neural search encodes queries and documents using transformer-based neural networks, producing dense vector representations that capture meaning rather than vocabulary. The retrieval step compares query vectors against pre-indexed document vectors using approximate nearest-neighbor search, returning documents most similar in meaning regardless of word overlap. Neural search is a specific implementation of semantic search: all neural search is semantic, but not all semantic search uses neural encoders. The practical distinction matters when choosing between a bi-encoder model that pre-encodes documents for fast lookup versus a cross-encoder that rescores candidate pairs at higher accuracy but higher compute cost.
| Aspect | Bi-encoder (neural) | Cross-encoder (neural reranking) | BM25 (sparse) |
|---|---|---|---|
| Encoding | Query and doc separately | Query and doc together | Term frequency statistics |
| Speed | Fast (ANN index) | Slow (per-pair inference) | Fast |
| Accuracy | High | Highest, used for reranking | Moderate |
| Index size | Large (vector store) | None (online inference) | Compact (inverted index) |
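The bi-encoder/cross-encoder split in the table above can be sketched as a two-stage pipeline. The `embed()` function here is a toy character-histogram stand-in, not a real transformer encoder, and the cross-encoder stage is indicated only in comments; the structure, not the model, is the point.

```python
# Two-stage retrieval sketch: a bi-encoder narrows candidates via vector
# similarity, then a cross-encoder rescores the short list.
import math

def embed(text: str) -> list[float]:
    # Toy "encoder": bag-of-characters counts, L2-normalized.
    # A real bi-encoder would be a transformer producing dense vectors.
    counts = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            counts[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(c * c for c in counts)) or 1.0
    return [c / norm for c in counts]

def cosine(a: list[float], b: list[float]) -> float:
    # Vectors are already unit-length, so the dot product is the cosine.
    return sum(x * y for x, y in zip(a, b))

docs = ["resetting a frozen laptop", "pasta recipes", "debugging python code"]
index = [(d, embed(d)) for d in docs]  # documents pre-encoded once, offline

query = "my computer is stuck"
qv = embed(query)
# Stage 1: fast bi-encoder lookup (exhaustive here; ANN index at scale).
candidates = sorted(index, key=lambda p: cosine(qv, p[1]), reverse=True)[:2]
# Stage 2: a cross-encoder would jointly score each (query, doc) pair in
# `candidates`, trading per-pair inference cost for accuracy on a short list.
```

Note the asymmetry the table describes: the document vectors are computed once and stored, while the cross-encoder stage runs fresh inference per query-document pair, which is why it only sees the top few candidates.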
Neural search makes sense when query and document vocabulary diverge, when users write in conversational language rather than controlled terminology, or when building a retrieval layer for an LLM pipeline where RAG grounding requires conceptual matching over large document sets. For short, precise lookups such as part numbers or code identifiers, keyword search is faster and more accurate.
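The vocabulary-divergence case is easy to show in miniature: a query and a relevant document can share zero terms, leaving term-overlap scoring (the signal BM25 builds on) with nothing to work with, while a semantic encoder can still match them on meaning.

```python
# A query and a relevant document with no shared vocabulary: keyword
# scoring sees zero overlap, so only meaning-level matching can connect them.
query = "laptop won't power on"
doc = "notebook fails to start after pressing the button"

query_terms = set(query.lower().split())
doc_terms = set(doc.lower().split())

overlap = query_terms & doc_terms  # empty: no lexical signal at all
```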
Firecrawl's Search API retrieves and returns clean page content. Feed that content through an embedding model and store the vectors in a vector database to build a neural search layer over live web data: Firecrawl handles crawling and cleaning, your model handles encoding.
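A minimal sketch of that pipeline follows: fetch clean page text, embed it, store the vectors, then answer queries by similarity. Both `fetch_pages()` and `embed()` are placeholder stubs I've invented for illustration; in practice the former would call Firecrawl's Search API and the latter a real embedding model.

```python
# Pipeline sketch: crawl/clean (stubbed) -> embed (stubbed) -> vector store
# -> similarity search. Stubs stand in for Firecrawl and a real encoder.
import math

def fetch_pages(query: str) -> list[dict]:
    # Stub standing in for a Firecrawl Search API call that returns
    # cleaned page content; hardcoded results for demonstration.
    return [
        {"url": "https://example.com/a",
         "content": "vector databases store embeddings"},
        {"url": "https://example.com/b",
         "content": "gardening tips for spring"},
    ]

def embed(text: str) -> list[float]:
    # Stub embedding: normalized character histogram. Swap in a real model.
    v = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            v[ord(ch) - ord("a")] += 1.0
    n = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / n for x in v]

# Build the vector store: one (url, vector) entry per fetched page.
store = [(p["url"], embed(p["content"])) for p in fetch_pages("vector search")]

def search(q: str, k: int = 1) -> list[str]:
    # Rank stored pages by dot product with the query vector (cosine,
    # since all vectors are unit-length) and return the top-k URLs.
    qv = embed(q)
    scored = sorted(store,
                    key=lambda e: sum(a * b for a, b in zip(qv, e[1])),
                    reverse=True)
    return [url for url, _ in scored[:k]]
```

The division of labor matches the paragraph above: the fetching layer is responsible only for clean text, and everything from encoding onward is interchangeable with whatever model and vector database you choose.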