Generating Images from Website Content with Imagen 4 & Gemini 2.5 Flash

Generative AI offers powerful new ways to process and repurpose digital content. This tutorial explores a practical application: transforming the content of any website into a unique, descriptive image. The project also highlights a key way our users leverage Firecrawl for constrained content generation, a process where structured web data fuels the creation of artifacts like images, PowerPoints, or even other websites. We’ll demonstrate how to build a Next.js application that orchestrates Firecrawl for content extraction, Google’s Gemini 2.5 Flash for intelligent prompt generation, and Google’s Imagen 4 (via Fal.ai) for image synthesis.

At Firecrawl, we’re focused on execution and turning ideas into production quickly. This project exemplifies how modern AI tools can be combined to build innovative features in days, not months. We’re sharing our process, building in public as we go.

This guide provides a technical walkthrough, focusing on the substance of integrating these cutting-edge AI models.

Use Cases for Website-to-Image Generation

Converting website content into images has several practical applications:

Automated Visuals for Content: Generate unique header images for blog posts or social media updates based on the article’s content.
Brand Exploration: Create visual interpretations of a website’s core message for branding exercises.
Enhanced Content Previews: Offer a visual summary or “snapshot” of a webpage’s purpose.
Developer Tooling: Integrate into CMS or marketing platforms for quick visual asset creation.

This approach leverages the semantic understanding of language models and the creative capabilities of image generation models.

Technology Stack

This project utilizes the following services and libraries. We chose these for their specific strengths and ease of integration, allowing for rapid development:

Firecrawl: For reliable website content extraction. Firecrawl efficiently handles JavaScript rendering, SPAs, and dynamic content, providing clean, structured Markdown – essential LLM-ready data.
Google Gemini 2.5 Flash: Selected for its speed and advanced reasoning. It processes the extracted Markdown to generate concise and effective image prompts. We’ll also show how its “thinking steps” feature can offer insight into the generation process.
Google Imagen 4 (via Fal.ai): A high-fidelity text-to-image model. Fal.ai provides a straightforward API for accessing Imagen 4, simplifying the integration.
Next.js: A React framework for building full-stack web applications, used here for both frontend UI and backend API routes.
Shadcn/ui & Tailwind CSS: For the user interface components.

Step-by-Step Implementation

Let’s outline the architecture and build process.

Prerequisites

Node.js (v18 or later) with npm, yarn, or pnpm.
API Keys:
- Firecrawl
- Google AI Studio (for Gemini 2.5 Flash)
- Fal.ai (for Imagen 4)

Project Setup

Initialize a Next.js project or use our provided example:

# To use our template:
git clone https://github.com/mendableai/firecrawl-app-examples.git
cd firecrawl-app-examples/url-to-image-imagen4-gemini-flash

If starting fresh:

npx create-next-app@latest url-to-image-generator
cd url-to-image-generator

Install dependencies:

npm install @mendable/firecrawl-js @ai-sdk/google @fal-ai/client ai

Backend API Routes

Three API routes in app/api/ handle the core logic:

/api/scrape (Firecrawl Content Extraction): This endpoint accepts a URL and returns its content as Markdown.

// app/api/scrape/route.ts
import { NextRequest, NextResponse } from 'next/server';
import FirecrawlApp from '@mendable/firecrawl-js';

export async function POST(request: NextRequest) {
  const { url } = await request.json();

  if (!url) {
    return NextResponse.json({ error: 'URL is required.' }, { status: 400 });
  }

  try {
    const app = new FirecrawlApp({ apiKey: process.env.FIRECRAWL_API_KEY });
    // Request Markdown output (option shape of the v1 Firecrawl SDK).
    const scrapeResult = await app.scrapeUrl(url, { formats: ['markdown'] });

    // In the v1 SDK the Markdown is returned at the top level of the response.
    if (scrapeResult.success && scrapeResult.markdown) {
      return NextResponse.json({ markdown: scrapeResult.markdown });
    } else {
      return NextResponse.json({ error: 'Failed to scrape URL.' }, { status: 500 });
    }
  } catch (error: any) {
    return NextResponse.json({ error: `Error: ${error.message}` }, { status: 500 });
  }
}
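
The route above only checks that a URL is present. As a minimal sketch of stricter input handling, the helper below normalizes and validates the user's input before it reaches Firecrawl. `normalizeUrl` is a hypothetical name and is not part of the original example; defaulting to `https://` when the scheme is missing is an assumption.

```typescript
// Hypothetical helper (not in the original route): normalize and validate
// user input before passing it to the scraper.
function normalizeUrl(input: string): string | null {
  const trimmed = input.trim();
  if (trimmed === "") return null;

  // Assumption: default to https when the user omits the scheme,
  // e.g. "firecrawl.dev" becomes "https://firecrawl.dev".
  const hasScheme = /^[a-zA-Z][a-zA-Z0-9+.-]*:\/\//.test(trimmed);
  const candidate = hasScheme ? trimmed : `https://${trimmed}`;

  try {
    const parsed = new URL(candidate);
    // Only web pages make sense here; reject ftp:, file:, and so on.
    if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
      return null;
    }
    return parsed.toString();
  } catch {
    return null; // not parseable as a URL
  }
}
```

In the route, a `null` result could map to the existing 400 response.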

/api/gemini (Gemini 2.5 Flash Prompt Generation): This endpoint takes the extracted Markdown and uses Gemini 2.5 Flash to generate an image prompt.

// app/api/gemini/route.ts
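import { createGoogleGenerativeAI } from '@ai-sdk/google';
import { streamText } from 'ai';

// The default `google` export reads GOOGLE_GENERATIVE_AI_API_KEY. This project
// stores the key as GOOGLE_API_KEY, so the provider is created explicitly.
const google = createGoogleGenerativeAI({ apiKey: process.env.GOOGLE_API_KEY });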

export async function POST(req: Request) {
  const { prompt: websiteContent } = await req.json(); 

  const fullPrompt = `Analyze the following website content and generate a single, coherent image prompt sentence (15-25 words maximum). The prompt must capture the main product/service, incorporate the website's actual tagline or headline if discernible and concise (in quotes), mention a key visual element, and be a direct instruction for an image generator. Example: "Create an image of a sleek coffee subscription box with a ceramic mug, featuring the tagline 'Morning Brew Delivered'." Website Content: ${websiteContent}`;
  
  try {
    const result = await streamText({
      model: google('gemini-2.5-flash-preview-05-20'),
      prompt: fullPrompt,
    });
    
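    // Note: the streaming response helper is version-dependent; AI SDK 4.x
    // names it toDataStreamResponse() instead of toAIStreamResponse().
    return result.toAIStreamResponse();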
  } catch (error: any) {
    return new Response(JSON.stringify({ error: `Failed to generate prompt: ${error.message}` }), { status: 500 });
  }
}
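
Scraped pages can be long, and the route above interpolates the full Markdown into the prompt. One option, offered here as a sketch rather than something the original example does, is to cap the content first. `truncateForPrompt` and the 8000-character default are hypothetical choices.

```typescript
// Hypothetical helper: cap the Markdown before it is placed in the prompt.
// The default limit is an arbitrary illustrative value, not a Gemini limit.
function truncateForPrompt(markdown: string, maxChars = 8000): string {
  const text = markdown.trim();
  if (text.length <= maxChars) return text;

  const slice = text.slice(0, maxChars);
  // Prefer to cut at the last blank line so the text does not end mid-paragraph.
  const lastBreak = slice.lastIndexOf("\n\n");
  return (lastBreak > 0 ? slice.slice(0, lastBreak) : slice).trimEnd();
}
```

The top of a page usually carries the headline and tagline the prompt asks for, which is why this sketch keeps the beginning and drops the tail.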

/api/imagen4 (Imagen 4 Image Generation via Fal.ai): This endpoint receives the generated prompt and calls Fal.ai’s Imagen 4 endpoint.

// app/api/imagen4/route.ts
import { NextRequest, NextResponse } from 'next/server';
import { fal } from '@fal-ai/client';

fal.config({ credentials: process.env.FAL_KEY });

export async function POST(request: NextRequest) {
  const { prompt } = await request.json();

  if (!prompt) {
    return NextResponse.json({ error: 'Prompt is required.' }, { status: 400 });
  }

  try {
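    const result: any = await fal.subscribe("fal-ai/imagen4/preview", {
      input: { prompt },
    });

    // Newer @fal-ai/client versions wrap the model output in `data`;
    // fall back to the top level for clients that return it directly.
    const image = (result?.data ?? result)?.images?.[0];

    if (image?.url) {
      // Download the hosted image and return it inline as base64.
      const imageResponse = await fetch(image.url);
      if (!imageResponse.ok) {
        return NextResponse.json({ error: 'Failed to download generated image.' }, { status: 502 });
      }
      const imageBuffer = await imageResponse.arrayBuffer();
      const imageBase64 = Buffer.from(imageBuffer).toString('base64');

      return NextResponse.json({
        imageBase64,
        contentType: image.content_type || 'image/png',
      });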
    } else {
      return NextResponse.json({ error: 'Image generation failed.' }, { status: 500 });
    }
  } catch (error: any) {
    return NextResponse.json({ error: `Error: ${error.message}` }, { status: 500 });
  }
}
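
Because the route returns the image as base64 plus a content type, the frontend can turn the response into a data URL for display. The sketch below assumes only the two response fields shown above; `toDataUrl` and the interface name are hypothetical.

```typescript
// Shape of the JSON returned by /api/imagen4 in the route above.
interface ImagenRouteResponse {
  imageBase64: string;
  contentType: string;
}

// Hypothetical helper: build a data URL usable as an <img> src or download href.
function toDataUrl(res: ImagenRouteResponse): string {
  return `data:${res.contentType};base64,${res.imageBase64}`;
}
```

In a React component this could be used as `<img src={toDataUrl(data)} alt="Generated image" />`.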

Frontend Implementation Overview

The frontend guides the user through a multi-step process:

Step 1: Enter URL: User inputs the target website URL.
Step 2: Select Style: User chooses a predefined artistic style. These are prompt fragments appended to the content-derived prompt.
Step 3: Website Content Processing: Firecrawl extracts content. UI indicates “Scraping with Firecrawl 🔥…”
Step 4: Generate & Edit Prompt:
- Calls /api/gemini.
- Displays “Thinking Steps” from Gemini 2.5 Flash.
- Shows AI-generated “Content Prompt” and selected “Style Prompt” in editable areas.
- A preview of the final combined prompt is shown.
Step 5: Generate Image: Loading indicator: “Generating image with Imagen 4…”
Step 6: View Image: The generated image is displayed with download/regenerate options. The full prompt is shown.
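
Steps 2 and 4 describe style prompts as fragments appended to the content-derived prompt. A minimal sketch of that combination is shown below. The style names and their wording are invented for illustration, and `combinePrompts` is a hypothetical helper, not the example app's actual code.

```typescript
// Hypothetical style presets; the real app defines its own list.
const STYLE_PROMPTS: Record<string, string> = {
  watercolor: "Soft watercolor illustration with pastel tones.",
  isometric: "Clean isometric 3D render with studio lighting.",
};

// Append the style fragment to the content prompt, adding a period
// between them when the content prompt does not already end a sentence.
function combinePrompts(contentPrompt: string, stylePrompt: string): string {
  const content = contentPrompt.trim();
  const style = stylePrompt.trim();
  if (content === "") return style;
  if (style === "") return content;

  const needsPeriod = !/[.!?]["']?$/.test(content);
  return `${content}${needsPeriod ? "." : ""} ${style}`;
}
```

Keeping the two parts separate until the final step is what lets the UI show both in editable areas and preview the combined result.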

Connecting Frontend to Backend

Standard fetch API calls are used from React components to interact with these backend routes. React hooks (useState, useEffect) manage data flow, loading states, and error handling.

// Example: Initiating content scraping in a React component
const handleScrapeAndProceed = async () => {
  setLoadingMessage("Scraping with Firecrawl 🔥..."); 
  try {
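    // Typed as a string map so the optional API key header can be added below.
    const headers: Record<string, string> = { 'Content-Type': 'application/json' };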
    if (sessionFirecrawlApiKey) { 
      headers['X-Firecrawl-API-Key'] = sessionFirecrawlApiKey;
    }

    const response = await fetch('/api/scrape', {
      method: 'POST',
      headers,
      body: JSON.stringify({ url: enteredUrl }),
    });
    const data = await response.json();

    if (!response.ok || data.error) throw new Error(data.error || 'Scraping failed');
    
    setWebsiteMarkdown(data.markdown); 
    setCurrentStep(nextStep); 
  } catch (err) {
    // Handle error appropriately
  }
};
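
The `catch` block above is left as a placeholder. Since a caught value in TypeScript is not guaranteed to be an `Error`, one way to fill it in is to convert whatever was thrown into a message the UI can display. `toErrorMessage` is a hypothetical helper, sketched under that assumption.

```typescript
// Hypothetical helper: turn any thrown value into a user-facing message.
function toErrorMessage(err: unknown, fallback = "Something went wrong."): string {
  if (err instanceof Error && err.message) return err.message;
  if (typeof err === "string" && err.trim() !== "") return err;
  return fallback;
}
```

Inside the handler this might look like `setErrorMessage(toErrorMessage(err))`, where `setErrorMessage` is an assumed state setter.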

Technical Considerations & Learnings

Prompt Engineering for Gemini 2.5 Flash: The effectiveness of the generated image heavily relies on the prompt. For Gemini, clear instructions are vital. Define the desired output, specify focus (main product/service, actual website tagline, key visual elements from extracted content), and mandate format (direct instruction for an image generator). The backend prompt construction is key. Continuous iteration on this prompt can yield better results.
Utilizing Imagen 4 Parameters: The appended style prompts effectively guide Imagen 4’s aesthetic. Fal.ai’s endpoint for Imagen 4 may support additional parameters (e.g., aspect_ratio, negative_prompt, seed) which can be integrated into the UI for more fine-grained control.
API Key Management:
- Server-Side (Recommended for Production): Store API keys as environment variables.
- Client-Side Input (Demo): Our example allows users to input API keys for the current session. For production applications, API keys should always be handled and stored securely on the server-side.
Error Handling & User Feedback: Implement comprehensive error handling for API calls. Provide clear feedback to the user during long operations using loading states and messages. The “thinking steps” from Gemini also contribute to a better user experience.
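
The optional Imagen 4 parameters mentioned above could be assembled into the `input` object roughly as follows. This is a sketch: the parameter names `aspect_ratio`, `negative_prompt`, and `seed` come from the text, which only says the endpoint may support them, so confirm them against Fal.ai's documentation. `buildImagenInput` is a hypothetical helper.

```typescript
// UI-level options; all are optional.
interface ImagenOptions {
  aspectRatio?: string;
  negativePrompt?: string;
  seed?: number;
}

// Hypothetical helper: include only the options the user actually set,
// so unset fields fall back to the endpoint's defaults.
function buildImagenInput(
  prompt: string,
  options: ImagenOptions = {},
): Record<string, string | number> {
  const input: Record<string, string | number> = { prompt: prompt.trim() };

  if (options.aspectRatio) {
    input["aspect_ratio"] = options.aspectRatio;
  }
  const negative = options.negativePrompt?.trim();
  if (negative) {
    input["negative_prompt"] = negative;
  }
  if (typeof options.seed === "number" && Number.isInteger(options.seed)) {
    input["seed"] = options.seed;
  }
  return input;
}
```

The result would replace the `{ prompt }` object passed as `input` in the `/api/imagen4` route.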

Access the Code & Experiment

The complete source code for this “URL-to-Image Generator” application is available on GitHub:

➡️ Firecrawl App Examples - URL to Image Generator

To run the project locally:

Clone the Firecrawl repository: git clone https://github.com/mendableai/firecrawl-app-examples.git
Navigate to the example directory: cd firecrawl-app-examples/url-to-image-imagen4-gemini-flash

Create a .env.local file in this directory and add your API keys:

FIRECRAWL_API_KEY=fc-your-firecrawl-api-key
GOOGLE_API_KEY=your-google-ai-studio-api-key # For Gemini
FAL_KEY=your-fal-ai-api-key # For Imagen 4
# Optional UPSTASH_REDIS_REST_URL / UPSTASH_REDIS_REST_TOKEN for rate limiting

Install dependencies: npm install
Run the development server: npm run dev

The application should be accessible at http://localhost:3000/url-to-image.
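
If a route returns an authentication error, a missing key in .env.local is a likely cause. The sketch below checks for the three required variable names listed above; `findMissingEnvVars` is a hypothetical helper and is not part of the example repository.

```typescript
// Variable names taken from the .env.local listing above.
const REQUIRED_ENV_VARS = ["FIRECRAWL_API_KEY", "GOOGLE_API_KEY", "FAL_KEY"] as const;

// Hypothetical helper: return the names of required variables that are
// unset or blank in the given environment object.
function findMissingEnvVars(env: Record<string, string | undefined>): string[] {
  return REQUIRED_ENV_VARS.filter((name) => (env[name] ?? "").trim() === "");
}
```

On the server this could be called as `findMissingEnvVars(process.env)` and the result logged before handling requests.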

Conclusion: Practical Integration of Advanced AI Models

This project demonstrates a practical workflow for integrating multiple advanced AI services—Firecrawl for data extraction, Gemini 2.5 Flash for language understanding and prompt generation, and Imagen 4 for image synthesis. By orchestrating these tools, developers can build sophisticated applications that offer novel ways to interact with and repurpose web content.

Our approach is always to try, learn, and iterate. We believe in sharing these experiments to help others build.

For further details on the individual services, see the official documentation for Firecrawl, Google Gemini, and Fal.ai.

Stay tuned for more updates and explorations from the Firecrawl team! 🔥