April 14, 2025

Eric Ciarla

Introducing Change Tracking: Launch Week III - Day 1

Welcome to Launch Week III, Day 1! Today we’re excited to announce Change Tracking — an enhanced Firecrawl feature that automatically detects and details changes on websites, now available in beta for all users.

What is Change Tracking?

Change Tracking allows you to monitor website changes by comparing the current scrapes and crawls to previous versions, clearly indicating if content is new, unchanged, modified, or removed.

Each Change Tracking response includes:

Field | Description
previousScrapeAt | Timestamp of the last scrape (or null if no previous scrape)
changeStatus | new, same, changed, or removed
visibility | visible (found through crawling) or hidden (no longer found through crawling, but known from a previous scrape)
diff (optional) | Git-style diff of changes (when enabled)
json (optional) | Structured JSON comparison of specific fields (when enabled)
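
As a rough illustration of how these fields might be consumed, the sketch below maps a changeTracking object to a one-line summary. The describeChange helper and its wording are hypothetical and not part of the Firecrawl SDK; the sample values are taken from the response examples in this post.

```javascript
// Hypothetical helper (not part of the Firecrawl SDK): turns a changeTracking
// object into a short, human-readable summary.
function describeChange(changeTracking) {
  // Fall back to a neutral message when no tracking data is present.
  if (!changeTracking) return "no change tracking data";

  const { previousScrapeAt, changeStatus, visibility } = changeTracking;
  switch (changeStatus) {
    case "new":
      return "first time this page has been seen";
    case "same":
      return `unchanged since ${previousScrapeAt}`;
    case "changed":
      return `modified since ${previousScrapeAt} (${visibility})`;
    case "removed":
      return `removed; last scraped at ${previousScrapeAt}`;
    default:
      return `unrecognized status: ${changeStatus}`;
  }
}

console.log(describeChange({
  previousScrapeAt: "2025-04-10T12:00:00Z",
  changeStatus: "changed",
  visibility: "visible",
}));
```

Keeping the default branch means an unexpected status value is reported rather than silently ignored, which seems prudent while the feature is in beta.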

Simple Integration

Change Tracking works with both of Firecrawl's request methods, scrape and crawl. Because comparisons are made on the markdown output, you must request the markdown format in addition to changeTracking:

Scrape Request Example:

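import FirecrawlApp from '@mendable/firecrawl-js';
// Replace the placeholder with your own API key
const app = new FirecrawlApp({ apiKey: 'fc-YOUR_API_KEY' });

const scrapeResponse = await app.scrapeUrl('https://firecrawl.dev', {
  formats: ['markdown', 'changeTracking']
});
console.log(scrapeResponse);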

Scrape Response:

{
  "url": "https://firecrawl.dev",
  "markdown": "# AI Agents for great customer experiences\n\nChatbots that delight your users...",
  "changeTracking": {
    "previousScrapeAt": "2025-04-10T12:00:00Z",
    "changeStatus": "changed",
    "visibility": "visible"
  }
}

Crawl Request Example:

const crawlResponse = await app.crawlUrl('https://firecrawl.dev', {
  scrapeOptions: { formats: ['markdown', 'changeTracking'] }
});
console.log(crawlResponse);

Crawl Response:

{
  "success": true,
  "status": "completed",
  "completed": 2,
  "total": 2,
  "creditsUsed": 2,
  "expiresAt": "2025-04-14T18:44:13.000Z",
  "data": [
    {
      "markdown": "# Turn websites into LLM-ready data\n\nPower your AI apps with clean data crawled from any website...",
      "metadata": {},
      "changeTracking": {
        "previousScrapeAt": "2025-04-10T12:00:00Z",
        "changeStatus": "changed",
        "visibility": "visible"
      }
    },
    {
      "markdown": "## Flexible Pricing\n\nStart for free, then scale as you grow...",
      "metadata": {},
      "changeTracking": {
        "previousScrapeAt": "2025-04-10T12:00:00Z",
        "changeStatus": "changed",
        "visibility": "visible"
      }
    }
  ]
}
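
A crawl returns one changeTracking object per page, so a common next step is to tally pages by status. The following is a minimal sketch that assumes the response shape shown above; countByStatus and the "unknown" bucket are hypothetical names introduced here, and the sample data is a trimmed copy of the example response.

```javascript
// Statuses listed in the Change Tracking response description.
const KNOWN_STATUSES = ["new", "same", "changed", "removed"];

// Hypothetical helper: counts crawled pages per changeStatus.
// Pages without a usable changeTracking object land in "unknown".
function countByStatus(crawlResponse) {
  const counts = { new: 0, same: 0, changed: 0, removed: 0, unknown: 0 };
  for (const page of crawlResponse.data ?? []) {
    const status = page.changeTracking?.changeStatus;
    const key = KNOWN_STATUSES.includes(status) ? status : "unknown";
    counts[key] += 1;
  }
  return counts;
}

// Trimmed copy of the crawl response shown above.
const sampleCrawl = {
  success: true,
  status: "completed",
  data: [
    {
      markdown: "# Turn websites into LLM-ready data",
      changeTracking: {
        previousScrapeAt: "2025-04-10T12:00:00Z",
        changeStatus: "changed",
        visibility: "visible",
      },
    },
    {
      markdown: "## Flexible Pricing",
      changeTracking: {
        previousScrapeAt: "2025-04-10T12:00:00Z",
        changeStatus: "changed",
        visibility: "visible",
      },
    },
  ],
};

// Both sample pages are reported as changed.
console.log(countByStatus(sampleCrawl));
```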

Advanced Change Tracking Modes

Change Tracking supports multiple advanced modes to suit different monitoring needs:

  • Git-Diff Mode: Provides detailed, Git-style line-by-line diffs, perfect for content updates and edits.
  • JSON Mode: Offers structured comparisons using a custom schema to track specific data changes, ideal for monitoring product details, pricing, or key text changes.

Advanced Change Tracking Request Example:

const result = await app.scrapeUrl("http://www.whattimeisit.com", {
  formats: ["markdown", "changeTracking"],
  changeTrackingOptions: {
    modes: ["git-diff", "json"], // Enable specific change tracking modes
    schema: {
      type: "object",
      properties: {
        time: { type: "string" },
      },
    }, // Schema for structured JSON comparison
    prompt: "Get the time", // Optional custom prompt
  },
});

// Access git-diff format changes; optional chaining guards against a
// missing changeTracking object, which can occur during the beta
if (result.changeTracking?.diff) {
  console.log(result.changeTracking.diff.text); // Git-style diff text
  console.log(result.changeTracking.diff.json); // Structured diff data
}

// Access JSON comparison changes
if (result.changeTracking?.json) {
  console.log(result.changeTracking.json); // Previous and current values
}

Git-Diff Results Example:

 **April, 13 2025**
 
-**05:55:05 PM**
+**05:58:57 PM**

...
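
If you only need the added and removed lines from the diff text, a small parser is usually enough. The sketch below assumes the text follows the usual unified-diff conventions ("+" for additions, "-" for removals, a leading space for context); splitDiff is a hypothetical helper, and the structured diff data returned alongside the text may be a better fit for anything more involved.

```javascript
// Hypothetical helper: separates added and removed lines in Git-style diff text.
function splitDiff(diffText) {
  const added = [];
  const removed = [];
  for (const line of diffText.split("\n")) {
    // Skip file headers such as "--- a/file" and "+++ b/file".
    if (line.startsWith("+++ ") || line.startsWith("--- ")) continue;
    if (line.startsWith("+")) added.push(line.slice(1));
    else if (line.startsWith("-")) removed.push(line.slice(1));
    // Everything else (context lines, "@@" hunk headers) is ignored.
  }
  return { added, removed };
}

// The diff lines from the example above.
const sampleDiff = [
  " **April, 13 2025**",
  " ",
  "-**05:55:05 PM**",
  "+**05:58:57 PM**",
].join("\n");

const { added, removed } = splitDiff(sampleDiff);
console.log("added:", added);
console.log("removed:", removed);
```

One limitation of this simple header check: a removed line whose own content starts with "-- " would be mistaken for a file header and skipped.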

JSON Comparison Results Example:

{
  "time": { 
    "previous": "2025-04-13T17:54:32Z", 
    "current": "2025-04-13T17:55:05Z" 
  }
}
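
To act on a JSON comparison, you typically want just the fields whose values differ. This sketch assumes each field carries a previous and a current value, as in the example above; changedFields is a hypothetical helper, and it compares with strict equality, so it suits primitive values such as strings and numbers rather than nested objects.

```javascript
// Hypothetical helper: lists the fields whose previous and current values differ.
function changedFields(comparison) {
  return Object.entries(comparison)
    .filter(([, value]) => value.previous !== value.current)
    .map(([field]) => field);
}

// The comparison result from the example above.
const sampleComparison = {
  time: {
    previous: "2025-04-13T17:54:32Z",
    current: "2025-04-13T17:55:05Z",
  },
};

// Only "time" is present, and its two values differ.
console.log(changedFields(sampleComparison));
```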

How Change Tracking Works

When enabled, Firecrawl finds the previous version to compare against by matching on URL, team ID, and the markdown format, then compares the two:

  • Comparison is resilient to whitespace and content order changes.
  • Iframe source URLs are ignored to avoid false positives caused by captchas or antibots.
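
To make the whitespace and ordering property concrete, here is one way such a comparison could be approximated on your own data. This is purely an illustrative assumption and not Firecrawl's actual algorithm, which this post does not describe; normalizeMarkdown and looksUnchanged are hypothetical names, and iframe handling is left out.

```javascript
// Illustrative only: NOT Firecrawl's implementation.
// Normalizes markdown so that whitespace and line order do not matter.
function normalizeMarkdown(markdown) {
  return markdown
    .split("\n")
    .map((line) => line.trim().replace(/\s+/g, " ")) // collapse whitespace runs
    .filter((line) => line.length > 0)               // drop blank lines
    .sort()                                          // ignore line order
    .join("\n");
}

// Two documents count as unchanged when their normalized forms match.
function looksUnchanged(previousMarkdown, currentMarkdown) {
  return normalizeMarkdown(previousMarkdown) === normalizeMarkdown(currentMarkdown);
}

// Same lines, different spacing and order: treated as unchanged.
console.log(looksUnchanged("# Title\n\nHello   world", "Hello world\n# Title\n"));
// Different text: treated as changed.
console.log(looksUnchanged("# Title", "# New title"));
```

Sorting lines is a blunt instrument, since it also hides reordering that a reader might care about; it is used here only to show the trade-off that order-insensitive comparison implies.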

Important Considerations and Limitations

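  • URL Consistency: The previous scrape is matched by URL, so use exactly the same URL on every run.
  • Scrape Option Consistency: Scrape options affect the markdown output, so changing them between runs can produce differences unrelated to the page itself.
  • Team Scoping: Tracking is scoped per team; your team's first scrape of a URL always shows as new.
  • Beta Monitoring: While the feature is in beta, check the warning field and handle responses where the changeTracking object is missing, which can happen if a database lookup times out.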
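
The beta caveat can be handled with a small defensive wrapper. The sketch below assumes the warning field sits at the top level of the scraped document next to changeTracking; readChangeTracking and the "unknown" status label are hypothetical and introduced only for illustration.

```javascript
// Hypothetical helper: reads change tracking data defensively.
// Assumes `warning` is a top-level field on the scraped document.
function readChangeTracking(doc) {
  const warning = doc.warning ?? null;

  // changeTracking may be absent, so report "unknown" (our own label)
  // instead of throwing on a property access.
  if (!doc.changeTracking) {
    return { status: "unknown", previousScrapeAt: null, warning };
  }

  const { changeStatus, previousScrapeAt } = doc.changeTracking;
  return { status: changeStatus, previousScrapeAt: previousScrapeAt ?? null, warning };
}

// A document that came back without a changeTracking object.
console.log(readChangeTracking({ url: "https://firecrawl.dev", markdown: "# Hello" }));

// A document with tracking data, using values from the scrape example.
console.log(readChangeTracking({
  url: "https://firecrawl.dev",
  changeTracking: {
    previousScrapeAt: "2025-04-10T12:00:00Z",
    changeStatus: "changed",
    visibility: "visible",
  },
}));
```

Returning a fixed shape in both cases lets calling code branch on status without repeating null checks.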

Pricing

  • Basic tracking and Git-diff mode: Free
  • JSON mode: 5 credits per page scrape due to additional processing requirements.

Get Started Today

Change Tracking is live in beta for all users:

  1. Try it now: Add changeTracking to your scrape or crawl formats.
  2. Learn more: Read the docs for /scrape and the docs for /crawl.
  3. Get help: Join our Discord community or contact help@firecrawl.com.

Ready to track detailed content changes? Sign up for Firecrawl and start today.

About the Author

Eric Ciarla (@ericciarla)

Eric Ciarla is the Chief Operating Officer (COO) of Firecrawl and leads marketing. He also worked on Mendable.ai and sold it to companies like Snapchat, Coinbase, and MongoDB. He previously worked at Ford and Fracta as a Data Scientist and co-founded SideGuide, a tool for learning code within VS Code with 50,000 users.
