Getting Started with OpenAI's Predicted Outputs for Faster LLM Responses

Eric Ciarla

Nov 05, 2024

Getting the most out of Large Language Models (LLMs) often means trading response accuracy against latency. OpenAI's new Predicted Outputs feature can significantly reduce response times when you tell the model in advance what most of the output will look like.

In this article, we'll explore how to use Predicted Outputs with the GPT-4o and GPT-4o-mini models to make your AI applications super fast 🚀. We'll also walk through a practical example: adding internal links to a blog post to improve its SEO, a task well suited to this feature.

What Are Predicted Outputs?

Predicted Outputs let you pass the LLM an anticipated output along with your request. This is especially useful when most of the response is known ahead of time: for tasks like rewriting text with minor modifications, it can drastically reduce the time the model takes to generate the result.

Why Use Predicted Outputs?

By supplying the model with a prediction of the output, you:

Reduce Latency: The model can generate the response faster because it doesn't need to produce the entire output from scratch.
Enhance Efficiency: The gain is largest when you can reasonably assume that large portions of the output will remain unchanged.
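
One way to check the latency claim for your own workload is to time the same request with and without a prediction. The helper below is a minimal sketch: `time_call` is a hypothetical name, not part of the OpenAI SDK, and the commented usage assumes that `client`, `messages`, and `draft` already exist in your code.

```python
import time
from typing import Any, Callable, Tuple


def time_call(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Run fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Hypothetical usage, assuming `client`, `messages`, and `draft` are defined:
#   _, baseline = time_call(client.chat.completions.create,
#                           model="gpt-4o-mini", messages=messages)
#   _, predicted = time_call(client.chat.completions.create,
#                            model="gpt-4o-mini", messages=messages,
#                            prediction={"type": "content", "content": draft})
#   print(f"baseline {baseline:.2f}s, with prediction {predicted:.2f}s")
```

A single measurement is noisy because network conditions vary, so it is worth repeating each call a few times before drawing conclusions.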

Limitations to Keep in Mind

While Predicted Outputs are powerful, there are some limitations:

At the time of writing, supported only with the GPT-4o and GPT-4o-mini models.
Certain API parameters are not supported alongside a prediction, including n values greater than 1, logprobs, and presence_penalty greater than 0.
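
A small pre-flight check can catch these conflicts before a request is sent. The sketch below covers only the three parameters named above; `find_unsupported_params` is a hypothetical helper, and the API rejects other combinations that it does not know about.

```python
from typing import Any, Dict, List


def find_unsupported_params(request: Dict[str, Any]) -> List[str]:
    """Return the names of parameters that conflict with a prediction.

    Covers only the three cases named in this article, so treat it as a
    partial check rather than a full validation of the request.
    """
    if "prediction" not in request:
        return []  # nothing to check when no prediction is sent

    problems = []
    if (request.get("n") or 1) > 1:
        problems.append("n")
    if request.get("logprobs"):
        problems.append("logprobs")
    if (request.get("presence_penalty") or 0) > 0:
        problems.append("presence_penalty")
    return problems


request = {
    "model": "gpt-4o-mini",
    "prediction": {"type": "content", "content": "draft text"},
    "n": 2,
}
print(find_unsupported_params(request))
```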

How to Use Predicted Outputs

Let's dive into how you can implement Predicted Outputs in your application. We'll walk through an example where we optimize a blog post by adding internal links to relevant pages within the same website.

Prerequisites

Make sure you have the following installed:

pip install firecrawl-py openai python-dotenv

Step 1: Set Up Your Environment

Initialize the necessary libraries and load your API keys.

import os
import json
from firecrawl import FirecrawlApp
from dotenv import load_dotenv
from openai import OpenAI
 
# Load environment variables
load_dotenv()
 
# Retrieve API keys from environment variables
firecrawl_api_key = os.getenv("FIRECRAWL_API_KEY")
openai_api_key = os.getenv("OPENAI_API_KEY")
 
# Initialize the FirecrawlApp and OpenAI client
app = FirecrawlApp(api_key=firecrawl_api_key)
client = OpenAI(api_key=openai_api_key)
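
`os.getenv` returns `None` when a variable is missing, which tends to surface later as a confusing authentication error. As an optional safeguard, a helper like the one below fails early with a clear message. `require_env` is a hypothetical name introduced here for illustration.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Missing environment variable: {name}")
    return value


# Hypothetical usage in place of the os.getenv calls above:
#   firecrawl_api_key = require_env("FIRECRAWL_API_KEY")
#   openai_api_key = require_env("OPENAI_API_KEY")
```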

Step 2: Scrape the Blog Content

We'll start by scraping the content of a blog post that we want to optimize.

# Get the blog URL (you can input your own)
blog_url = "https://www.firecrawl.dev/blog/how-to-use-openai-o1-reasoning-models-in-applications"
 
# Scrape the blog content in markdown format
blog_scrape_result = app.scrape_url(blog_url, params={'formats': ['markdown']})
blog_content = blog_scrape_result.get('markdown', '')

Step 3: Map the Website for Internal Links

Next, we'll get a list of other pages on the website to which we can add internal links.

# Extract the site's base URL (scheme and host), e.g. https://www.firecrawl.dev
top_level_domain = '/'.join(blog_url.split('/')[:3])
 
# Map the website to get all internal links
site_map = app.map_url(top_level_domain)
site_links = site_map.get('links', [])
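
A site map can contain duplicates, external URLs, and the post itself, all of which add noise to the prompt. The sketch below shows one way to tidy the list with the standard library before using it. Both helper names and the `limit` default are assumptions made for illustration, not part of the Firecrawl SDK.

```python
from typing import Iterable, List
from urllib.parse import urlsplit


def site_origin(url: str) -> str:
    """Return the scheme and host of a URL, e.g. 'https://www.firecrawl.dev'."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


def select_internal_links(links: Iterable[str], page_url: str, limit: int = 50) -> List[str]:
    """Keep unique links on the same host as page_url, excluding the page itself."""
    host = urlsplit(page_url).netloc
    current = page_url.rstrip("/")
    seen = set()
    selected = []
    for link in links:
        normalized = link.rstrip("/")
        if urlsplit(link).netloc != host:
            continue  # skip external links
        if normalized == current or normalized in seen:
            continue  # skip the post itself and duplicates
        seen.add(normalized)
        selected.append(link)
        if len(selected) >= limit:
            break  # cap the list so the prompt stays compact
    return selected


# Hypothetical usage with the variables from this step:
#   site_links = select_internal_links(site_links, blog_url)
```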

Step 4: Prepare the Prompt and Prediction

We'll create a prompt instructing the model to add internal links to the blog post and provide the original content as a prediction.

prompt = f"""
You are an AI assistant helping to improve a blog post.
 
Here is the original blog post content:
 
{blog_content}
 
Here is a list of other pages on the website:
 
{json.dumps(site_links, indent=2)}
 
Please revise the blog post to include internal links to some of these pages where appropriate. Make sure the internal links are relevant and enhance the content.
 
Only return the revised blog post in markdown format.
"""

Step 5: Use Predicted Outputs with the OpenAI API

Now, we'll call the OpenAI API using the prediction parameter to provide the existing content.

completion = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "user",
            "content": prompt
        }
    ],
    prediction={
        "type": "content",
        "content": blog_content
    }
)
revised_blog_post = completion.choices[0].message.content
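
To see how much of the prediction was actually used, you can inspect the usage details on the response. The OpenAI API reports `accepted_prediction_tokens` and `rejected_prediction_tokens` under `usage.completion_tokens_details`. The helper below is a sketch with a hypothetical name, `prediction_stats`, and it runs against a stand-in object so that no API call is needed.

```python
from types import SimpleNamespace
from typing import Any, Dict


def prediction_stats(usage: Any) -> Dict[str, float]:
    """Summarize how many predicted tokens the model accepted or rejected."""
    details = getattr(usage, "completion_tokens_details", None)
    accepted = getattr(details, "accepted_prediction_tokens", 0) or 0
    rejected = getattr(details, "rejected_prediction_tokens", 0) or 0
    total = accepted + rejected
    return {
        "accepted": accepted,
        "rejected": rejected,
        "acceptance_rate": accepted / total if total else 0.0,
    }


# Stand-in for `completion.usage`, so the sketch runs without an API call.
sample_usage = SimpleNamespace(
    completion_tokens_details=SimpleNamespace(
        accepted_prediction_tokens=900,
        rejected_prediction_tokens=100,
    )
)
print(prediction_stats(sample_usage))
# In the tutorial code you would call: prediction_stats(completion.usage)
```

A low acceptance rate suggests the prediction diverges too much from the final output to help. According to OpenAI's documentation at the time of writing, rejected prediction tokens are still billed as completion tokens, so a poor prediction can raise cost as well as fail to reduce latency.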

Step 6: Compare the Original and Revised Content

Finally, we'll compare the number of links in the original and revised blog posts to confirm that the model added new ones.

import re
 
def count_links(markdown_content):
    return len(re.findall(r'\[.*?\]\(.*?\)', markdown_content))
 
original_links_count = count_links(blog_content)
revised_links_count = count_links(revised_blog_post)
 
print(f"Number of links in the original blog post: {original_links_count}")
print(f"Number of links in the revised blog post: {revised_links_count}")
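
A raw count also matches markdown images and does not show which links are new. As a complementary sketch, the helpers below list the link targets that appear only in the revised post. The names `extract_link_urls` and `added_links` are hypothetical, and the pattern is a simplification that does not handle nested brackets or URLs containing parentheses.

```python
import re
from typing import List

# Match markdown links but not images: the lookbehind skips "![alt](src)".
LINK_PATTERN = re.compile(r'(?<!!)\[[^\]]*\]\(([^)\s]+)[^)]*\)')


def extract_link_urls(markdown_content: str) -> List[str]:
    """Return the target URL of every markdown link, in document order."""
    return LINK_PATTERN.findall(markdown_content)


def added_links(original: str, revised: str) -> List[str]:
    """Return link URLs that appear in revised but not in original."""
    before = set(extract_link_urls(original))
    return [url for url in extract_link_urls(revised) if url not in before]


# Hypothetical usage with the variables from the steps above:
#   for url in added_links(blog_content, revised_blog_post):
#       print(url)
```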

Conclusion

By using Predicted Outputs, you can significantly speed up tasks where most of the output is known in advance, such as content reformatting or minor edits. This makes the feature a practical option for developers who want lower latency without compromising on the quality of the output.

That's it! In this article, we've shown you how to get started with Predicted Outputs using OpenAI's GPT-4o models. Whether you're transforming content, correcting errors, or making minor adjustments, Predicted Outputs can make your AI applications faster and more efficient.

Eric Ciarla @ericciarla

Cofounder and CMO of Firecrawl

About the Author

Eric Ciarla is a co-founder of Firecrawl. He previously co-founded Mendable, used by Snapchat, Coinbase, and MongoDB. He's been building products in the AI and data space since 2022.
