Generate LLMs.txt files with CLI

Generate LLMs.txt files from any website using the CLI + Firecrawl

Description

generate-llmstxt

A simple NPX package that generates LLMs.txt files using the Firecrawl API. Specify the URL you want and it creates two files in your specified output directory (defaults to the 'public' folder):

  • llms.txt: Contains a summary of the LLM-related content
  • llms-full.txt: Contains the full text content

Usage

You can run this package using NPX without installing it. There are two ways to provide your Firecrawl API key:

1. Using Command Line Arguments

npx generate-llmstxt --api-key YOUR_FIRECRAWL_API_KEY

2. Using Environment Variables

Create a .env file in your project root and add your API key:

FIRECRAWL_API_KEY=your_api_key_here

Then run the command without the --api-key option:

npx generate-llmstxt
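
The .env setup described above can be scripted. The following is a minimal sketch, assuming a POSIX shell and a project tracked with git; the .gitignore step is an extra precaution that is not part of the original instructions, and the key value is a placeholder.

```shell
# Write the key to .env in the project root.
# Note: '>' overwrites any existing .env; "your_api_key_here" is a placeholder.
printf 'FIRECRAWL_API_KEY=%s\n' "your_api_key_here" > .env

# Assumption: the project uses git. Keep the key out of version control
# by adding .env to .gitignore if it is not already listed there.
grep -qxF '.env' .gitignore 2>/dev/null || echo '.env' >> .gitignore
```

After this, `npx generate-llmstxt` can be run from the same directory without the --api-key option.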

Options

  • -k, --api-key <key> (optional if set in .env): Your Firecrawl API key
  • -u, --url <url> (optional): URL to analyze (default: https://example.com)
  • -m, --max-urls <number> (optional): Maximum number of URLs to analyze (default: 50)
  • -o, --output-dir <path> (optional): Output directory path (default: 'public')

Examples

# Using command line argument with default output directory
npx generate-llmstxt -k your_api_key -u https://your-website.com -m 20

# Using .env file with default output directory
npx generate-llmstxt -u https://your-website.com -m 20

# Specifying a custom output directory
npx generate-llmstxt -k your_api_key -u https://your-website.com -o custom/path/to/output

# Using .env file and custom output directory
npx generate-llmstxt -u https://your-website.com -o content/llms

Requirements

  • Node.js 14 or higher
  • A valid Firecrawl API key (via command line or .env file)
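
The Node.js requirement can be checked before running the package. This is a rough sketch, assuming a POSIX shell; `node_version_ok` is a hypothetical helper name, and it relies only on the `vMAJOR.MINOR.PATCH` format that `node -v` prints.

```shell
# Hypothetical helper: returns 0 when the given Node.js version string
# (e.g. "v18.17.0") has a major version of 14 or higher.
node_version_ok() {
  version="${1#v}"         # strip the leading "v"
  major="${version%%.*}"   # keep only the major component
  [ "$major" -ge 14 ] 2>/dev/null
}

# Usage: report on the locally installed Node.js, if there is one.
if command -v node >/dev/null 2>&1; then
  if node_version_ok "$(node -v)"; then
    echo "Node.js $(node -v) is supported"
  else
    echo "Node.js $(node -v) is too old; 14 or higher is required"
  fi
fi
```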

Output

The package will create two files in your specified output directory (defaults to 'public'):

  1. llms.txt: Contains a summary of the LLM-related content
  2. llms-full.txt: Contains the full text content
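
To confirm that a run produced both files, a small check along these lines may help. It is a sketch under the assumption of a POSIX shell; `check_llms_output` is a hypothetical helper, not part of the package, and it only tests that the two documented files exist and are non-empty.

```shell
# Hypothetical helper: returns 0 when llms.txt and llms-full.txt both
# exist and are non-empty in the given directory.
check_llms_output() {
  dir="${1:-public}"   # 'public' is the documented default output directory
  for f in llms.txt llms-full.txt; do
    if [ ! -s "$dir/$f" ]; then
      echo "missing or empty: $dir/$f" >&2
      return 1
    fi
  done
  echo "ok: $dir/llms.txt and $dir/llms-full.txt"
}
```

For example, after `npx generate-llmstxt -u https://your-website.com -o content/llms`, calling `check_llms_output content/llms` reports whether both files were written.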
