Using OpenAI's Realtime API and Firecrawl to Talk with Any Website
Interacting with any website through a conversational agent in real time is now possible thanks to OpenAI's new Realtime API and Firecrawl. Together they let developers build low-latency, multimodal conversational experiences that can fetch and interact with live web content on the fly.
In this tutorial, we'll guide you through integrating Firecrawl's scraping and mapping tools into the OpenAI Realtime API Console Demo. By the end, you'll have a real-time conversational agent capable of talking with any website.
Prerequisites
Before you begin, make sure you have the following:
- Node.js and npm installed on your machine.
- An OpenAI API key with access to the Realtime API.
- A Firecrawl API key.
- Basic understanding of React and TypeScript.
Step 1: Clone the OpenAI Realtime API Console Demo
First, clone the repository that contains the OpenAI Realtime API Console Demo integrated with Firecrawl.
git clone https://github.com/nickscamara/firecrawl-openai-realtime.git
cd firecrawl-openai-realtime
Step 2: Install Dependencies
Install the required npm packages:
npm install
Step 3: Set Up Environment Variables
Create a .env file in the root directory and add your OpenAI and Firecrawl API keys:
OPENAI_API_KEY=your-openai-api-key
FIRECRAWL_API_KEY=your-firecrawl-api-key
If you're running a local relay server, set the relay server URL:
REACT_APP_LOCAL_RELAY_SERVER_URL=http://localhost:8081
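A missing or empty key tends to show up later as an opaque API error, so it can help to check the values up front. The following is a minimal sketch and is not part of the demo repository: `requireEnv` and `exampleEnv` are hypothetical names, and the helper takes the environment as a plain object so it makes no assumption about how your build tool exposes variables.

```typescript
// Hypothetical helper: not part of the demo or of the Firecrawl SDK.
type Env = Record<string, string | undefined>;

// Returns the trimmed value of `name`, or throws if it is missing or blank.
function requireEnv(env: Env, name: string): string {
  const value = env[name]?.trim();
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Stand-in environment object used only for illustration.
const exampleEnv: Env = {
  OPENAI_API_KEY: 'your-openai-api-key',
  FIRECRAWL_API_KEY: '', // blank on purpose: requireEnv would throw for this one
};

const openAiKey = requireEnv(exampleEnv, 'OPENAI_API_KEY');
```

In the application you would pass the real environment object instead of `exampleEnv`.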
Step 4: Integrate Firecrawl Tools into the Realtime API Console Demo
Open the ConsolePage.tsx file located at src/pages/ConsolePage.tsx.
Import Firecrawl
At the top of the file, import the Firecrawl SDK:
import FirecrawlApp from '@mendable/firecrawl-js';
Add the 'scrape_data' Tool
Within the useEffect hook where tools are added to the client, add the scrape_data tool:
client.addTool(
  {
    // Tool definition the model sees: name, description, and JSON schema.
    name: 'scrape_data',
    description: 'Goes to or scrapes data from a given URL using Firecrawl.',
    parameters: {
      type: 'object',
      properties: {
        url: {
          type: 'string',
          description: 'URL to scrape data from',
        },
      },
      required: ['url'],
    },
  },
  // Handler invoked when the model calls the tool.
  async ({ url }: { url: string }) => {
    const firecrawl = new FirecrawlApp({
      apiKey: process.env.FIRECRAWL_API_KEY || '',
    });
    // Request both the page content (markdown) and a screenshot.
    const data = await firecrawl.scrapeUrl(url, {
      formats: ['markdown', 'screenshot'],
    });
    if (!data.success) {
      return 'Failed to scrape data from the given URL.';
    }
    // Show the screenshot in the UI and hand the markdown back to the model.
    setScreenshot(data.screenshot || '');
    return data.markdown;
  }
);
This tool allows the assistant to scrape data from any URL using Firecrawl.
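To make the handler's control flow easier to follow and to test, here is a minimal sketch that pulls the same logic into a standalone function with its dependencies passed in. Everything in it is illustrative: `scrapeForAssistant`, `ScrapeResult`, and `fakeScraper` are hypothetical names, and the result shape only covers the fields the tutorial's handler reads, not the SDK's full response type.

```typescript
// Assumed shape: only the fields the tutorial's handler reads.
interface ScrapeResult {
  success: boolean;
  markdown?: string;
  screenshot?: string;
}

type Scraper = (url: string) => Promise<ScrapeResult>;

// Hypothetical refactor of the handler body with injected dependencies.
async function scrapeForAssistant(
  url: string,
  scrape: Scraper,
  onScreenshot: (src: string) => void
): Promise<string> {
  const data = await scrape(url);
  if (!data.success) {
    return 'Failed to scrape data from the given URL.';
  }
  // Update the UI first, then return the text the model will read.
  onScreenshot(data.screenshot ?? '');
  return data.markdown ?? '';
}

// Stand-in scraper so the sketch runs without a network call or API key.
const fakeScraper: Scraper = async (url) => ({
  success: url.startsWith('https://'),
  markdown: `# Content of ${url}`,
  screenshot: 'https://example.com/screenshot.png',
});
```

In the real tool, the `scrape` argument would wrap `firecrawl.scrapeUrl` and `onScreenshot` would be `setScreenshot`.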
Add the 'map_website' Tool
Next, add the map_website tool to enable searching for pages with specific keywords on a website:
client.addTool(
  {
    name: 'map_website',
    description: 'Searches a website for pages containing specific keywords using Firecrawl.',
    parameters: {
      type: 'object',
      properties: {
        url: {
          type: 'string',
          description: 'URL of the website to search',
        },
        search: {
          type: 'string',
          description: 'Keywords to search for (2-3 max)',
        },
      },
      required: ['url', 'search'],
    },
  },
  async ({ url, search }: { url: string; search: string }) => {
    const firecrawl = new FirecrawlApp({
      apiKey: process.env.FIRECRAWL_API_KEY || '',
    });
    // Step 1: map the site, filtered by the search keywords.
    const mapData = await firecrawl.mapUrl(url, { search });
    if (!mapData.success || !mapData.links?.length) {
      return 'No pages found with the specified keywords.';
    }
    // Step 2: scrape the first link returned by the map call.
    const topLink = mapData.links[0];
    const scrapeData = await firecrawl.scrapeUrl(topLink, {
      formats: ['markdown', 'screenshot'],
    });
    if (!scrapeData.success) {
      return 'Failed to retrieve data from the found page.';
    }
    setScreenshot(scrapeData.screenshot || '');
    return scrapeData.markdown;
  }
);
This tool allows the assistant to search a website for specific content and retrieve it.
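The tool works in two stages: map the site to find candidate pages, then scrape the first match. The sketch below shows that flow in isolation, under stated assumptions: `searchSite`, `SearchDeps`, and the in-memory `fakeSite` are hypothetical, and the substring filter is only a stand-in for whatever matching Firecrawl's map endpoint actually performs.

```typescript
interface MapResult {
  success: boolean;
  links?: string[];
}

interface PageResult {
  success: boolean;
  markdown?: string;
}

// Hypothetical dependency bundle so the flow can run without the SDK.
interface SearchDeps {
  map: (url: string, search: string) => Promise<MapResult>;
  scrape: (url: string) => Promise<PageResult>;
}

async function searchSite(url: string, search: string, deps: SearchDeps): Promise<string> {
  // Stage 1: find candidate pages.
  const mapData = await deps.map(url, search);
  const links = mapData.links ?? [];
  if (!mapData.success || links.length === 0) {
    return 'No pages found with the specified keywords.';
  }
  // Stage 2: scrape only the first candidate.
  const page = await deps.scrape(links[0]);
  if (!page.success) {
    return 'Failed to retrieve data from the found page.';
  }
  return page.markdown ?? '';
}

// In-memory stand-in for a website: link -> page content.
const fakeSite: Record<string, string> = {
  'https://example.com/pricing': '# Pricing',
  'https://example.com/blog': '# Blog',
};

const fakeDeps: SearchDeps = {
  map: async (_url, search) => ({
    success: true,
    links: Object.keys(fakeSite).filter((link) => link.includes(search)),
  }),
  scrape: async (url) => ({ success: url in fakeSite, markdown: fakeSite[url] }),
};
```

Because only the first link is scraped, the quality of the answer depends on how well the keywords single out the right page, which is one reason the tool description asks for two or three keywords at most.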
Manage Screenshot State
At the top of your ConsolePage component, add state management for the screenshot:
const [screenshot, setScreenshot] = useState<string>('');
Display the Screenshot in the UI
In the UI, display the screenshot by adding the following within the appropriate JSX:
{ screenshot && <img src={screenshot} alt="Website Screenshot" /> }
Step 5: Run the Application
In a new terminal window, start the React application:
npm start
Open your browser and navigate to http://localhost:3000 to interact with your real-time conversational agent.
Testing the Agent
Now, you can test your agent by initiating a conversation. For example, ask:
User: "Can you get the latest blog post from https://mendable.ai?"
The assistant will use the scrape_data tool to fetch content from the specified URL and present it to you.
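Conceptually, the request above turns into a tool call: the model picks a tool name and supplies its arguments as a JSON string, and the client looks up the matching handler and runs it. The sketch below models only that lookup step. It is not the Realtime client's actual implementation; `ToolRegistry` and its methods are hypothetical names used for illustration.

```typescript
type ToolHandler = (args: Record<string, unknown>) => Promise<string>;

// Hypothetical, simplified model of name -> handler dispatch.
class ToolRegistry {
  private handlers = new Map<string, ToolHandler>();

  add(name: string, handler: ToolHandler): void {
    this.handlers.set(name, handler);
  }

  // `rawArguments` is the JSON-encoded argument string from the model.
  async call(name: string, rawArguments: string): Promise<string> {
    const handler = this.handlers.get(name);
    if (!handler) {
      return `Unknown tool: ${name}`;
    }
    const args = JSON.parse(rawArguments) as Record<string, unknown>;
    return handler(args);
  }
}

const registry = new ToolRegistry();

// Stand-in handler; the real one would call Firecrawl.
registry.add('scrape_data', async (args) => `Scraped ${String(args.url)}`);
```

With this model, the example question would correspond to something like `registry.call('scrape_data', '{"url":"https://mendable.ai"}')`, and the handler's return value is what the assistant reads before answering.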
Conclusion
By integrating Firecrawl's scraping and mapping tools into the OpenAI Realtime API Console Demo, you've created a conversational agent that can fetch, search, and discuss the content of any website in real time. The same pattern, registering a tool and returning its result to the model, extends to other applications that need live web content on demand.
About the Author
Nicolas Camara is the Chief Technology Officer (CTO) at Firecrawl. He previously built and scaled Mendable, one of the pioneering "chat with your documents" apps, which had major Fortune 500 customers like Snapchat, Coinbase, and MongoDB. Prior to that, Nicolas built SideGuide, the first code-learning tool inside VS Code, and grew a community of 50,000 users. Nicolas studied Computer Science and has over 10 years of experience in building software.