
TL;DR: Best News APIs
| API | What it does |
|---|---|
| Firecrawl | Agent-ready, real-time news search with full article content in one call |
| NewsAPI.ai | 150k+ sources, enriched metadata, archive back to 2014 |
| Exa | Semantic news search built for AI agent workflows |
| ScrapingBee | Google News SERP scraping with JS rendering and geotargeting |
Whether you are building a research agent, a financial monitoring app, or a product that surfaces relevant headlines to users, you will hit the same wall quickly: news content is hard to fetch reliably at runtime. AI agents depend on live web context to ground their responses, and news is the most time-sensitive layer of that context. A good news API handles that infrastructure so your app or agent can stay focused on reasoning, not fetching.
The API you pick will shape what your product can actually do. Some services return headlines and URLs. Others return full article content, entity tags, and sentiment scores. For agents specifically, the difference between getting a URL and getting full readable markdown is the difference between one LLM call and three. The infrastructure decisions underneath (source coverage, freshness guarantees, output format) matter more than most comparisons let on.
These are the best news APIs I have tested and would recommend to a developer building apps or agents today, ordered by how I think about them for different use cases.
What is a news API?
A news API is a service that exposes news content as structured data over HTTP. Instead of scraping individual news sites, you send a query and get back articles with titles, URLs, publication dates, and optionally full text.
There are broadly two types:
- Search-based APIs let you query for news by keyword, topic, or semantic meaning. You control freshness through parameters. Firecrawl and Exa fall in this category.
- Index-based APIs continuously crawl and index news sources, then let you query that index. NewsAPI.ai and ScrapingBee's Google News scraper lean this way.
The distinction matters when you are building real-time pipelines (search-based tends to be fresher and more targeted) versus broad coverage monitoring (index-based gives you breadth and enriched metadata at scale).
For agents, there is a third dimension to consider: output format. An agent does not click links — it needs the article text in the same response as the search results. APIs that return only URLs require the agent to make a second round of fetch calls, which adds latency, increases cost, and introduces another failure mode. This is especially important in deep research for AI agents, where multi-step workflows compound the cost of every extra fetch. When evaluating any news API for agent use, always check whether it can return full content inline.
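To make the cost of URL-only responses concrete, here is a minimal sketch that counts how many results in a response would still need a follow-up fetch. The `markdown` field name and the sample payloads are assumptions for illustration, not the response schema of any API in this list.

```python
from typing import Iterable


def count_followup_fetches(results: Iterable[dict], content_key: str = "markdown") -> int:
    """Count results that carry no inline content and so need a second HTTP fetch."""
    return sum(1 for r in results if not r.get(content_key))


# Hypothetical payloads: one API returns only URLs, the other returns content inline.
urls_only = [{"url": "https://example.com/a"}, {"url": "https://example.com/b"}]
inline = [
    {"url": "https://example.com/a", "markdown": "# Story A"},
    {"url": "https://example.com/b", "markdown": "# Story B"},
]

print(count_followup_fetches(urls_only))  # 2
print(count_followup_fetches(inline))     # 0
```

Every non-zero count here is an extra network call, and an extra failure point, that the agent has to handle before it can reason over the article.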
1. Firecrawl
Firecrawl is a web intelligence API with a dedicated news search mode that returns full article content alongside structured results in a single call.
Firecrawl is open source, with over 100k GitHub stars and more than 1 million developers using it, so you can inspect exactly how it works and self-host if needed. Most news APIs force two round trips: one to get URLs, another to fetch content. That two-step pattern is fine for dashboards, but it breaks agent workflows. Firecrawl's /search endpoint with sources: ["news"] returns both in one request: the title, URL, and description of each article and, optionally, its full markdown from a live crawl. That makes it the most agent-friendly option in this list: your agent gets content it can reason over immediately, with no extra fetch step. And if the news you need is behind a dynamic interaction (a scroll-to-load feed, a tab click, an expandable section), Firecrawl's /interact endpoint handles that too, which is likely unique among the APIs in this list.
- sources: ["news"]: filters results to news-specific sources; combine with "web" if you want mixed results
- scrapeOptions: add this to the search call to get full markdown, HTML, links, or screenshots per result with no second request
- tbs parameter: time-based filtering with values like qdr:d (past 24 hours) and qdr:w (past week); applies to web source results; for news freshness, use the limit parameter and rely on native news ranking
- Location filtering: pass a location parameter to get geographically relevant news results
- limit: controls results per source type; costs 2 credits per 10 results, so budget predictably
- /interact endpoint: if the news you need is behind a click, scroll, or dynamically loaded section, Firecrawl can interact with the page to reach it; this is likely the only API in this list with that capability
Install:
# JS SDK
npm install @mendable/firecrawl-js
# CLI (for agents)
npx -y firecrawl-cli@latest init --all --browser
Example:
import FirecrawlApp from "@mendable/firecrawl-js";
const app = new FirecrawlApp({ apiKey: "YOUR_API_KEY" });
// Search news sources and scrape each result to markdown in the same call.
const results = await app.search("AI funding announcements", {
  sources: ["news"],
  limit: 10,
  scrapeOptions: { formats: ["markdown"] },
});
Honest take: Firecrawl is my first recommendation for anyone building agents or apps that need news content at runtime. You skip the second fetch entirely, the credit model is predictable (2 credits per 10 results, plus 1 credit per page scraped), and the markdown output drops straight into a prompt with no cleanup. The main caveat is coverage: it searches the live web, so niche trade publications may not surface as reliably as they would in an indexed service like NewsAPI.ai.
Cons: Not designed for broad topic monitoring across thousands of keywords. Better suited for targeted queries than always-on crawls.
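As a rough budgeting aid, the sketch below builds a request body from the parameters described above and estimates credits from the quoted rates (2 credits per 10 results, plus 1 per page scraped). The `query` key, the helper names, and the rounding of partial batches up to a full batch of 10 are assumptions; check the Firecrawl docs for the actual request schema and billing rules.

```python
import json
import math
from typing import Optional


def build_news_search_payload(query: str, limit: int = 10,
                              with_markdown: bool = True,
                              location: Optional[str] = None) -> dict:
    """Assemble a JSON body using the parameter names discussed in this section."""
    payload = {"query": query, "sources": ["news"], "limit": limit}
    if with_markdown:
        payload["scrapeOptions"] = {"formats": ["markdown"]}
    if location:
        payload["location"] = location
    return payload


def estimate_credits(limit: int, pages_scraped: int) -> int:
    """2 credits per 10 results plus 1 credit per scraped page.

    Assumption: a partial batch of results is billed as a full batch of 10.
    """
    search_credits = 2 * math.ceil(limit / 10)
    return search_credits + pages_scraped


payload = build_news_search_payload("AI funding announcements", limit=10)
print(json.dumps(payload, indent=2))
print(estimate_credits(limit=10, pages_scraped=10))  # 12
```

Under these assumptions, a 10-result search with full markdown costs roughly six times as much as the same search without scraping, so it is worth requesting content only when the agent will actually read it.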
If you want to use Firecrawl directly inside your agent, the Firecrawl agent skill gives your agent one-command access to search, scrape, crawl, and browser automation. For a deeper look at query patterns and parameter options, the Firecrawl search endpoint guide covers agentic workflows in detail. Full reference at docs.firecrawl.dev/features/search.
2. NewsAPI.ai
NewsAPI.ai is a news intelligence platform that indexes 150,000+ publishers across 60+ languages with enriched metadata, entity recognition, and a historical archive back to 2014.
Formerly known as Event Registry, NewsAPI.ai sits closer to a news database than a search API. It continuously ingests articles and enriches them with entity tags, topic categories (5,000+ topics), sentiment scores, social sharing data, duplicate detection, and publisher rankings. The clients list includes Spotify, Bloomberg, IBM, and Accenture, which tells you something about the scale it operates at.
- 150,000+ publishers: one of the largest news source networks available in any API
- 60+ languages: significantly broader than most competitors, which default to English-first
- Entity recognition: articles tagged with people, organizations, and locations automatically
- Event clustering: groups related articles about the same story, useful for deduplication
- Sentiment analysis: per-article polarity scores for financial and brand monitoring
- Historical archive to 2014: rare capability; most APIs have shallow archives of a few months
Install:
pip install eventregistry
Example:
from eventregistry import EventRegistry, QueryArticlesIter
er = EventRegistry(apiKey="YOUR_API_KEY")
# Match English-language articles on the keyword within the date range.
q = QueryArticlesIter(
    keywords="artificial intelligence",
    dateStart="2026-04-01",
    dateEnd="2026-04-24",
    lang="eng",
)
# Newest first, capped at 20 articles.
for art in q.execQuery(er, sortBy="date", maxItems=20):
    print(art["title"], art["url"])
Honest take: If you need enriched metadata (entities, sentiment, topics) alongside raw content, nothing else in this list competes. The archive depth is also genuinely rare. The tradeoff is that the API is more complex to query and the pricing is opaque on the public site; you will need to contact them for a quote once you have a usage estimate.
Cons: JavaScript-dependent website makes it hard to evaluate pricing without signing up. The SDK feels older than the REST-native approaches in this list. Overkill if you just need a few hundred headlines a day.
Full reference at newsapi.ai.
3. Exa
Exa is a semantic search API with a dedicated news vertical designed for financial research, competitive intelligence, and AI agent workflows.
Exa indexes the web using neural embeddings rather than keyword matching, which means you can write queries like "biotech funding announcements in Southeast Asia Q1 2026" and get semantically relevant results instead of exact-match noise. That matters a lot for agents: when an agent is given a broad research task, it often does not know the exact keywords to search for. Exa's semantic layer lets agents express intent and get relevant results, rather than having to construct precise keyword queries. The news vertical pulls from major publications, trade press, and specialized outlets with continuous indexing.
- type: "auto": lets Exa decide whether to use neural or keyword search based on your query
- Native date filtering: clean start/end date parameters on every request, no time string parsing
- highlights: returns relevant passages from each article, useful for quick LLM summarization
- contents.text: returns full article text when you need it, separate from the search results
- Free tier: 1,000 requests per month at no cost, with an API Playground for testing
Install:
npm install exa-js
Example:
import Exa from "exa-js";
const exa = new Exa("YOUR_API_KEY");
// Let Exa choose neural or keyword search, restricted to the news category.
const results = await exa.search("semiconductor shortage automotive industry", {
  type: "auto",
  category: "news",
  numResults: 10,
  startPublishedDate: "2026-04-01",
  // Request full text and highlighted passages with each result.
  contents: { text: true, highlights: true },
});
Honest take: Exa's semantic approach genuinely outperforms keyword APIs on nuanced queries where you want conceptual matches. The free tier is generous enough for prototyping. The main friction point is cost at scale: $7 per 1,000 searches adds up fast if you are running high-volume monitoring, and each additional result beyond 10 adds another $1 per 1,000 requests.
Cons: More expensive than Firecrawl at scale for simple queries. No built-in sentiment analysis or entity recognition. Coverage skews toward major English-language outlets.
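To see how those rates compound, here is a back-of-the-envelope sketch. It assumes one reading of the pricing quoted above: $7 per 1,000 searches for up to 10 results, plus $1 per 1,000 searches for each result beyond 10. It ignores the free tier and any content or volume pricing, so treat the output as an estimate, not a quote.

```python
def exa_search_cost_usd(searches: int, num_results: int = 10,
                        base_per_1k: float = 7.0,
                        extra_per_result_per_1k: float = 1.0) -> float:
    """Estimate search spend under the assumed pricing model described above."""
    extra_results = max(0, num_results - 10)
    per_1k = base_per_1k + extra_results * extra_per_result_per_1k
    return searches / 1000 * per_1k


print(exa_search_cost_usd(100_000))                  # 700.0
print(exa_search_cost_usd(100_000, num_results=25))  # 2200.0
```

Under this reading, raising numResults from 10 to 25 roughly triples the bill at the same query volume, which is why it pays to keep result counts tight in monitoring jobs.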
Full reference at exa.ai/docs/reference/verticals/news.
4. ScrapingBee News Results API
ScrapingBee is a web scraping API with a dedicated Google News scraper that handles geotargeting and AI-powered data extraction out of the box.
ScrapingBee's approach to news is different from the others here: it scrapes Google News SERPs directly, which means you get the same results a user would see in Google News for any query, including local and regional stories that smaller indexed services miss. You inherit Google's ranking of stories without having to build a ranking layer yourself.
- Google News SERP scraping: gets results from Google News directly with all its freshness and ranking signals
- AI data extraction: parse specific fields from news pages using AI-powered rules, no HTML parsing
- Geotargeting: access location-specific news editions, useful for regional monitoring
- Screenshot capture: useful for visual verification of what was actually served
- Free trial: 1,000 credits for new users, no card required
Install:
pip install scrapingbee
Example:
from scrapingbee import ScrapingBeeClient
client = ScrapingBeeClient(api_key="YOUR_API_KEY")
response = client.get(
    "https://news.google.com/search?q=AI+regulation&hl=en-US&gl=US",
    params={
        "render_js": True,  # render the page's JavaScript before extracting
        # "type": "list" returns every match; a bare selector returns only the first.
        "extract_rules": {"headlines": {"selector": "h3 a", "type": "list"}},
    },
)
print(response.text)
Honest take: If Google News ranking is your source of truth, ScrapingBee is the most direct path to it. The 99.9% success rate claim holds up in practice for standard news queries. The caveat is that you are working with SERP data, not a news API with structured fields; you will need to do more parsing work unless you use the AI extraction rules.
Cons: More expensive than direct APIs at scale (plans start at $49/month for 250,000 credits). Returns SERP markup rather than structured article objects. Dependent on Google not changing its News layout.
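The extra parsing work can be small. The sketch below assumes the extraction response is a JSON object keyed by rule name (here `headlines`) and normalizes it into a clean list; the sample payload is invented for illustration and `parse_headlines` is a hypothetical helper, not part of the ScrapingBee SDK.

```python
import json


def parse_headlines(response_text: str, key: str = "headlines") -> list[str]:
    """Return de-duplicated, whitespace-normalized headlines from an extraction response."""
    data = json.loads(response_text)
    raw = data.get(key) or []
    if isinstance(raw, str):  # a single-item rule yields one string, not a list
        raw = [raw]
    seen: set[str] = set()
    headlines: list[str] = []
    for item in raw:
        title = " ".join(str(item).split())  # collapse runs of whitespace
        if title and title not in seen:
            seen.add(title)
            headlines.append(title)
    return headlines


# Invented sample standing in for response.text
sample = '{"headlines": ["EU agrees  AI rules", "EU agrees AI rules", "", "US weighs AI bill"]}'
print(parse_headlines(sample))  # ['EU agrees AI rules', 'US weighs AI bill']
```

Keeping this step separate from the request also gives you one place to adjust when the page layout, and therefore the selector output, changes.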
Full reference at scrapingbee.com/scrapers/news-results-api.
Building the top news APIs into your workflow
The combination that works depends on what you are building. For most apps and agents, Firecrawl is the complete solution: it covers search to find news, extract to get full article content, and interact to reach content behind dynamic page behavior like scroll-to-load feeds or tabbed sections. No other API in this list covers all three. For semantically complex research queries where keyword matching is not enough, pairing Firecrawl with Exa gives your agent both breadth and nuance. If you are also comparing broader AI search engines for agents beyond news specifically, that post covers the full landscape including Tavily and Perplexity.
For enterprise media monitoring at scale, NewsAPI.ai is the right anchor. Its entity recognition and event clustering handle the deduplication problem that kills naive monitoring pipelines when a story breaks across 500 outlets simultaneously.
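A minimal sketch of that deduplication step: keep one article per event cluster and pass through anything that has no cluster. The `eventUri` field name is an assumption about the article payload, and the sample data is invented; adjust the key to whatever cluster identifier your responses actually carry.

```python
def one_article_per_event(articles: list[dict], event_key: str = "eventUri") -> list[dict]:
    """Keep the first article seen for each event cluster; keep unclustered articles as-is."""
    first_by_event: dict[str, dict] = {}
    unclustered: list[dict] = []
    for art in articles:
        event_id = art.get(event_key)
        if event_id is None:
            unclustered.append(art)
        elif event_id not in first_by_event:
            first_by_event[event_id] = art
    return list(first_by_event.values()) + unclustered


# Invented sample: two outlets covering the same story share an event id.
articles = [
    {"title": "Chipmaker cuts forecast", "url": "https://example.com/1", "eventUri": "evt-1"},
    {"title": "Chip giant lowers outlook", "url": "https://example.com/2", "eventUri": "evt-1"},
    {"title": "Central bank holds rates", "url": "https://example.com/3", "eventUri": "evt-2"},
    {"title": "Local op-ed", "url": "https://example.com/4", "eventUri": None},
]

for art in one_article_per_event(articles):
    print(art["url"])
```

If the input is sorted by date or publisher rank before this step, "first seen" becomes "newest" or "most authoritative" for each story.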
The biggest mistake I see is picking an API based on price alone. A cheaper API that returns headlines without content forces you to write and maintain per-domain scrapers for every source that matters to you. For apps, that is extra engineering. For agents, it is often a dead end: URLs without content simply mean gaps in your agent's knowledge.
For more on building with search APIs, the post on best deep research APIs for agentic workflows covers how to combine search and extraction into research pipelines. If you are specifically evaluating Exa alternatives, this comparison post covers the tradeoffs in more detail. For MCP-based setups, best web search MCP servers covers how to wire news search directly into Claude and other agents.
Frequently Asked Questions
What is a news API?
A news API is a service that lets developers programmatically fetch news articles, headlines, and metadata from across the web. They typically return structured data like title, source, publication date, and article content, making it easier to build news monitoring tools, research pipelines, and AI agents without scraping sites manually.
What is the best news API for real-time results?
For real-time news with full article content, Firecrawl's search API and Exa's news vertical are both strong choices. Firecrawl returns live results with full markdown content in a single call. Exa uses semantic search for more nuanced queries. NewsAPI.ai is best when you need historical archives alongside real-time coverage.
Are news APIs free?
Most news APIs offer a free tier or trial credits. Firecrawl offers 500 free credits (no card required), Exa gives 1,000 free requests per month, ScrapingBee gives 1,000 free credits, and NewsAPI.ai has a free sandbox. Paid plans start at $19/month for Firecrawl and $49/month for ScrapingBee.
Can I use a news API inside an AI agent?
Yes. Firecrawl and Exa are both designed with agent use cases in mind and return clean, structured data that LLMs can consume directly. Firecrawl's search endpoint returns markdown content alongside results, which is particularly useful for RAG pipelines and research agents.
How do news APIs differ from web scraping?
News APIs handle the infrastructure of finding, fetching, and parsing news articles so you don't have to build and maintain that pipeline yourself. They manage source discovery, freshness, rate limiting, and structured output. Web scraping gives you more flexibility but requires significantly more maintenance.
What is NewsAPI.ai?
NewsAPI.ai (formerly Event Registry) is a news intelligence platform that indexes 150,000+ publishers in 60+ languages and offers entity recognition, sentiment analysis, event clustering, and a historical archive back to 2014. It is best suited for enterprise media monitoring and research workflows that need enriched metadata alongside raw content.
Can Firecrawl extract news articles from the web?
Yes. Firecrawl's search endpoint supports a news source type that returns news-focused results with full article content in a single API call. You can pass sources: ["news"] to filter results to news publishers, and add scrapeOptions to get the full markdown of each article alongside the search results. This makes it well suited for agents and apps that need to read and reason over news content, not just retrieve links.
Do news APIs support filtering by date or topic?
Yes. All the APIs in this list support date or time-based filtering. Exa supports native start/end date parameters on every request. NewsAPI.ai supports filtering by topic, entity, sentiment, and source alongside date ranges. Firecrawl's tbs parameter applies to web results; for news results, freshness is controlled via the limit parameter and native news ranking.
