
O4 Mini Web Crawler

O4 Mini Web Crawler intelligently maps and scrapes websites using Firecrawl API, then applies OpenAI's o4-mini model to identify and extract relevant information based on user objectives.


A simple web crawler that uses Firecrawl and OpenAI’s o4-mini model to search websites based on user objectives.

Features

Maps websites to find relevant URLs
Uses AI to rank URLs by relevance to the objective
Scrapes content and analyzes it with o4-mini
Returns structured data when objectives are met

Prerequisites

Python 3.6+
Firecrawl API key
OpenAI API key

Installation

Clone this repository
Install the required packages:
```
pip install -r requirements.txt
```
Copy .env.example to .env and fill in your API keys:
```
cp .env.example .env
```
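Before running the script, it can help to confirm that both keys are actually set. The following is a minimal sketch using only the standard library; the variable names `FIRECRAWL_API_KEY` and `OPENAI_API_KEY` are assumptions, so check `.env.example` for the names the script really expects.

```python
# Assumed variable names; the real ones are listed in .env.example.
REQUIRED_KEYS = ("FIRECRAWL_API_KEY", "OPENAI_API_KEY")


def parse_env_file(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding quotes, which .env files often allow.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]
```

To use it, read the contents of `.env`, pass them to `parse_env_file`, and fill in whatever `missing_keys` reports before starting the crawler.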

Usage

Run the script:

python o4-mini-web-crawler.py

You will be prompted to:

Enter a website URL to crawl
Define your objective (what information you’re looking for)

The crawler will then:

Map the website to find relevant URLs
Rank the most relevant pages
Scrape and analyze the content
Return structured data if the objective is met
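The four steps above can be sketched as a small control loop. This is an illustration of the flow only, not the script's actual code: `map_site`, `rank_urls`, `scrape_page`, and `analyze` are hypothetical callables that you would back with Firecrawl and o4-mini calls, and `top_n` is an assumed cutoff.

```python
from typing import Callable, Iterable, List, Optional


def crawl_for_objective(
    site_url: str,
    objective: str,
    map_site: Callable[[str], Iterable[str]],           # hypothetical: site URL -> page URLs
    rank_urls: Callable[[List[str], str], List[str]],   # hypothetical: most relevant first
    scrape_page: Callable[[str], str],                  # hypothetical: page URL -> page text
    analyze: Callable[[str, str], Optional[dict]],      # hypothetical: None if objective not met
    top_n: int = 3,                                     # assumed cutoff, not from the README
) -> Optional[dict]:
    """Map, rank, scrape, and analyze until one page satisfies the objective."""
    urls = list(map_site(site_url))
    if not urls:
        return None

    # Only the highest-ranked pages are scraped, to limit API calls.
    for url in rank_urls(urls, objective)[:top_n]:
        content = scrape_page(url)
        result = analyze(content, objective)
        if result is not None:
            return {"source_url": url, "data": result}

    # No candidate page met the objective.
    return None
```

Passing the four stages in as functions keeps the loop testable with stubs, so the control flow can be checked without spending API credits.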

Example

Enter the website to crawl: https://example.com
Enter your objective: Find the company's headquarters address

The crawler will search for pages likely to contain this information, analyze them, and return the address in a structured format.
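One way to treat the model's reply as structured data is to accept it only when it parses as a JSON object. The sketch below assumes the model is prompted to answer either with JSON or with a fixed sentinel string; the sentinel `Objective not met` and the field name in the usage note are hypothetical placeholders, not output from the real script.

```python
import json
from typing import Optional

# Hypothetical sentinel the model is asked to return when nothing is found.
NOT_MET = "Objective not met"


def parse_model_reply(reply: str) -> Optional[dict]:
    """Return the reply as a dict if it is a JSON object, otherwise None."""
    text = reply.strip()
    if not text or text == NOT_MET:
        return None
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        # Free-form prose or malformed JSON counts as "not met".
        return None
    # Lists, strings, and numbers are valid JSON but not the structure we want.
    return data if isinstance(data, dict) else None
```

For instance, a reply of `{"headquarters_address": "123 Example St"}` (a made-up value) would come back as a dictionary, while the sentinel or plain prose would yield `None`.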

License

MIT
