Description
AMA (Ask Me Anything) AI - Chrome Extension
A Chrome extension that lets you ask questions about any website you're visiting, powered by Firecrawl and GPT-4o-mini. This extension crawls web pages, processes their content, and enables natural language interactions about the site's content.
Install from Chrome Web Store
Features
- Ask questions about any website's content in natural language
- Smart web crawling that respects site structure and robots.txt
- Chat-like interface with streaming responses
- Context-aware responses with references to specific pages
- Maintains conversation history per domain
- Real-time streaming responses from GPT-4o-mini
- Highly configurable crawling parameters
- Secure API key management
- Markdown formatting support in responses
- Automatic link references in answers
How It Works
Crawling Process
- Initial Scan: When you click "Start Crawl", the extension:
  - Validates your API keys
  - Checks the current domain
  - Initializes a new conversation history
  - Starts the Firecrawl process
- Progress Tracking:
  - Shows a progress bar with real-time updates
  - Displays the number of pages crawled
  - Indicates crawling status and completion
- Content Processing:
  - Converts HTML to clean markdown format
  - Extracts page titles and URLs
  - Maintains source references for citations
  - Truncates content to stay within API limits
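The content processing step above can be sketched as follows. This is a minimal illustration, not the extension's actual code: the `CrawledPage` shape and the `buildContext` helper are hypothetical names introduced here. It shows one way to keep each page's title and URL next to its markdown so answers can cite the source, while truncating the combined text to a character budget.

```typescript
// Hypothetical shape of one crawled page; the field names are assumptions.
interface CrawledPage {
  title: string;
  url: string;
  markdown: string;
}

// Join pages into one context string, stopping once maxChars is reached.
// Each page is prefixed with its title and URL so answers can cite it.
function buildContext(pages: CrawledPage[], maxChars: number): string {
  let context = "";
  for (const page of pages) {
    const remaining = maxChars - context.length;
    if (remaining <= 0) break;
    const section = `# ${page.title}\nSource: ${page.url}\n\n${page.markdown}\n\n`;
    // Cut the final section short rather than dropping it entirely.
    context += section.slice(0, remaining);
  }
  return context;
}
```

Truncating at a character count is the simplest option; a token-based budget would track model limits more closely, at the cost of a tokenizer dependency.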
Question-Answering Flow
- Content Preparation:
  - Organizes crawled content by relevance
  - Maintains page structure and hierarchy
  - Preserves source URLs for citations
- AI Processing:
  - Streams requests to GPT-4o-mini
  - Maintains conversation context
  - Generates responses with source citations
- Response Handling:
  - Real-time streaming of answers
  - Markdown formatting for readability
  - Automatic link insertion
  - Source reference preservation
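The flow above can be approximated with two small helpers. This is a sketch under assumptions, not the extension's implementation: `buildChatRequest` and `parseStreamChunk` are hypothetical names, and the system prompt wording is invented for illustration. The request body follows the OpenAI chat completions format (`model`, `messages`, `stream`), and the parser reads the `data:` lines that a streaming response sends, ending at `data: [DONE]`.

```typescript
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

interface ChatRequest {
  model: string;
  stream: boolean;
  messages: ChatMessage[];
}

// Combine crawled context, prior turns, and the new question into one request body.
function buildChatRequest(
  context: string,
  history: ChatMessage[],
  question: string,
  model: string = "gpt-4o-mini"
): ChatRequest {
  const system: ChatMessage = {
    role: "system",
    // Illustrative prompt; the real extension's prompt is not shown in this README.
    content:
      "Answer using only the crawled content below. Cite sources as markdown links.\n\n" +
      context,
  };
  const user: ChatMessage = { role: "user", content: question };
  return { model, stream: true, messages: [system, ...history, user] };
}

// Extract the text deltas from one chunk of a server-sent event stream.
// Assumes the chunk ends on a line boundary; production code should buffer partial lines.
function parseStreamChunk(chunk: string): string {
  let text = "";
  for (const line of chunk.split("\n")) {
    const trimmed = line.trim();
    if (trimmed.indexOf("data:") !== 0) continue;
    const payload = trimmed.slice("data:".length).trim();
    if (payload === "[DONE]") break;
    const parsed = JSON.parse(payload);
    text += parsed.choices?.[0]?.delta?.content ?? "";
  }
  return text;
}
```

Appending each parsed delta to the chat bubble as it arrives is what produces the real-time effect; the full answer is then stored in the conversation history once the stream ends.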
Detailed Configuration Options
Firecrawl Settings
Basic Configuration
- API Key: Your Firecrawl authentication key
- Max Depth (default: 3):
  - Controls how many links deep the crawler will go
  - Higher values explore more nested pages
  - Recommended range: 1-5 for optimal performance
Crawling Parameters
- Page Limit (default: 50):
  - Maximum number of pages to crawl
  - Higher limits allow more comprehensive coverage
  - Consider API usage when adjusting
  - Range: 1-1000 pages
- Allow Backward Links (default: true):
  - When enabled, the crawler follows links to previously visited domains
  - Useful for sites with cross-referenced content
  - Disable to stay within a single domain
Performance Settings
- Timeout (default: 20000ms):
  - Maximum time to wait for each page
  - Prevents hanging on slow-loading pages
  - Recommended range: 5000-30000ms
- Wait For (default: 2000ms):
  - Delay between page requests
  - Helps respect server rate limits
  - Adjust based on the site's robustness
  - Range: 0-5000ms
Content Processing Settings
- Max Content Length (default: 250,000 characters):
  - Maximum characters sent to GPT-4o-mini
  - Balances comprehensiveness with API limits
  - Range: 1-500,000 characters
  - Higher values may increase API costs
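The defaults and ranges listed above could be captured in a single settings object. The sketch below is illustrative only: the `CrawlSettings` property names and the `normalizeSettings` helper are assumptions, not the extension's real storage schema. It also treats the recommended ranges for depth and timeout as hard limits, which is a design choice rather than something the settings above require.

```typescript
// Hypothetical settings shape; property names are assumptions for illustration.
interface CrawlSettings {
  maxDepth: number;
  limit: number;
  allowBackwardLinks: boolean;
  timeout: number; // milliseconds
  waitFor: number; // milliseconds
  maxContentLength: number; // characters
}

// Defaults taken from the configuration options documented above.
const DEFAULT_SETTINGS: CrawlSettings = {
  maxDepth: 3,
  limit: 50,
  allowBackwardLinks: true,
  timeout: 20000,
  waitFor: 2000,
  maxContentLength: 250000,
};

// Restrict a number to the inclusive range [min, max].
function clamp(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, value));
}

// Merge user overrides onto the defaults and clamp each value to its documented range.
function normalizeSettings(overrides: Partial<CrawlSettings> = {}): CrawlSettings {
  const merged: CrawlSettings = { ...DEFAULT_SETTINGS, ...overrides };
  return {
    ...merged,
    maxDepth: clamp(merged.maxDepth, 1, 5),
    limit: clamp(merged.limit, 1, 1000),
    timeout: clamp(merged.timeout, 5000, 30000),
    waitFor: clamp(merged.waitFor, 0, 5000),
    maxContentLength: clamp(merged.maxContentLength, 1, 500000),
  };
}
```

For example, `normalizeSettings({ limit: 5000 })` keeps every other default and caps the page limit at 1000.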
Model Settings
- Model Selection:
  - gpt-4o-mini: Faster, more concise responses
  - gpt-4o: More detailed, nuanced answers, but much more expensive
  - Choose based on your needs for speed vs. detail and cost
Advanced Usage
Conversation Management
The extension maintains separate conversation histories for each domain:
{
"example.com": [
{"role": "user", "content": "What is this site about?"},
{"role": "assistant", "content": "Based on [Home Page](https://example.com), this site..."},
// Additional messages...
]
}
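One way to maintain a structure like the one above is to key the history by the hostname of the current tab. The following is a minimal sketch, assuming an in-memory store; `HistoryStore`, `domainOf`, and `appendMessage` are hypothetical names, and the real extension may persist this differently (for example in extension storage).

```typescript
type Role = "user" | "assistant";

interface Message {
  role: Role;
  content: string;
}

// Maps a domain such as "example.com" to its list of messages.
type HistoryStore = Record<string, Message[]>;

// Derive the history key from a full page URL.
function domainOf(pageUrl: string): string {
  return new URL(pageUrl).hostname;
}

// Return a new store with the message appended to the page's domain history.
// The input store is left unchanged, which keeps updates easy to reason about.
function appendMessage(store: HistoryStore, pageUrl: string, message: Message): HistoryStore {
  const domain = domainOf(pageUrl);
  const existing = store[domain] ?? [];
  return { ...store, [domain]: [...existing, message] };
}
```

Because the key is the hostname, questions asked on `https://example.com/about` and `https://example.com/pricing` share one conversation, while a different site starts with an empty history.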
Content Formatting
The extension processes content in multiple stages:
- HTML Processing:
  // Example of content processing
  {
    "title": "Page Title",
    "url": "https://example.com/page",
    "content": "Processed markdown content...",
    "metadata": {
      "sourceURL": "https://example.com/page",
      "crawlTime": "2024-01-01T00:00:00Z"
    }
  }
- Content Organization:
  - Groups related content
  - Maintains hierarchy
  - Preserves source references
API Response Handling
Responses are streamed in real-time:
// Example streaming response format
{
"role": "assistant",
"content": "According to [About Page](https://example.com/about)...",
"references": [
{"title": "About Page", "url": "https://example.com/about"},
// Additional references...
]
}
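A `references` list like the one above can be derived from the markdown links in the answer text. The sketch below is one possible approach, not the extension's confirmed method; `Reference` and `extractReferences` are hypothetical names. It collects each `[title](url)` link once, in order of first appearance.

```typescript
interface Reference {
  title: string;
  url: string;
}

// Collect unique markdown links of the form [title](http...) from a response.
function extractReferences(content: string): Reference[] {
  // Group 1 is the link text, group 2 is the URL up to the closing parenthesis.
  const linkPattern = /\[([^\]]+)\]\((https?:\/\/[^\s)]+)\)/g;
  const seen = new Set<string>();
  const references: Reference[] = [];
  let match: RegExpExecArray | null;
  while ((match = linkPattern.exec(content)) !== null) {
    const title = match[1];
    const url = match[2];
    if (title === undefined || url === undefined) continue;
    if (seen.has(url)) continue; // Skip links already recorded.
    seen.add(url);
    references.push({ title, url });
  }
  return references;
}
```

This simple pattern does not handle URLs that themselves contain parentheses; a full markdown parser would be the safer choice if such links are expected.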