GPT 5.4

GPT 5.4 is the standard tier of the GPT-5.4 model family, extending the agentic and reasoning capabilities of GPT-5.3 Codex to all domains including knowledge work, multi-step workflows, and analysis.

Reasoning, Tool Use, Vision (Image), File Input, Implicit Caching, Web Search

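import { streamText } from 'ai'

// The 'openai/gpt-5.4' model slug routes the request through AI Gateway.
const result = streamText({
  model: 'openai/gpt-5.4',
  prompt: 'Why is the sky blue?'
})

// Print each text chunk as it arrives (top-level await requires an ES module).
for await (const textPart of result.textStream) {
  process.stdout.write(textPart)
}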

Playground

Try out GPT 5.4 by OpenAI. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.

About GPT 5.4

GPT 5.4 became available on March 5, 2026 on AI Gateway as the standard tier of the GPT-5.4 model family. It extends the agentic and reasoning leaps introduced in GPT-5.3 Codex to all domains, including knowledge work like reports, spreadsheets, presentations, and analysis.

The model handles complex multi-step workflows more reliably than previous generations, including tasks that involve tools, research, and pulling from multiple sources. It's faster and more token-efficient than GPT-5.2, delivering better results at lower cost per task.

With the context window of 1.1M tokens and the full API feature set, GPT 5.4 supports text, image, and mixed-modality inputs. If you're starting a new project or upgrading from an earlier GPT-5.x model, it's the default starting point for general-purpose work.

Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.
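
As a rough sketch of what setting a provider preference can look like, the snippet below builds call settings with a `providerOptions.gateway.order` list. The slugs `'azure'` and `'openai'` are placeholders, not values taken from this page, and the `withProviderOrder` helper is hypothetical; copy the real slugs from the Providers table and check the docs for the options your SDK version accepts.

```typescript
// Gateway routing preference, passed under `providerOptions.gateway`.
interface GatewayRoutingOptions {
  order: string[] // provider slugs, in order of preference
}

interface CallSettings {
  model: string
  prompt: string
  providerOptions: { gateway: GatewayRoutingOptions }
}

// Hypothetical helper: build call settings that express a provider preference.
function withProviderOrder(prompt: string, order: string[]): CallSettings {
  if (order.length === 0) {
    throw new Error('order must list at least one provider slug')
  }
  return {
    model: 'openai/gpt-5.4',
    prompt,
    providerOptions: { gateway: { order: [...order] } },
  }
}

// 'azure' and 'openai' are placeholder slugs (assumption).
const settings = withProviderOrder('Why is the sky blue?', ['azure', 'openai'])
console.log(JSON.stringify(settings.providerOptions))
```

With the AI SDK installed, an object of this shape can be passed to `streamText(settings)`.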

Provider	Legal	Context	Latency	Throughput	Input	Output	Cache Read	Cache Write	Web Search	Per Query	Release Date
(name not captured)	Terms • Privacy	1.1M	0.8s	57tps	$2.50/M	$15.00/M	$0.25/M	—	$10.00/K + input costs	—	03/05/2026
(name not captured)	Terms • Privacy	(not listed)	0.6s	59tps	$2.50/M	$15.00/M	$0.25/M	—	$14/K + input costs	—	03/05/2026

Values for the Capabilities, ZDR, and No Training columns were not captured.
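
To make the listed rates concrete, here is a minimal cost estimate using the prices shown above ($2.50/M input, $15.00/M output, $0.25/M cache read). It assumes cached input tokens are billed at the cache-read rate instead of the input rate, which this page does not state explicitly; the `estimateCostUSD` function and its parameter names are illustrative only, and web search charges are left out.

```typescript
// Per-million-token rates from the Providers table (USD).
const RATES = {
  inputPerM: 2.5,
  outputPerM: 15.0,
  cacheReadPerM: 0.25,
}

interface Usage {
  inputTokens: number       // total input tokens, including cached ones
  cachedInputTokens: number // portion of input served from the cache
  outputTokens: number
}

// Estimate the cost of one request in USD.
// Assumption: cached tokens are billed at the cache-read rate only.
function estimateCostUSD(usage: Usage): number {
  const { inputTokens, cachedInputTokens, outputTokens } = usage
  if (cachedInputTokens > inputTokens) {
    throw new Error('cachedInputTokens cannot exceed inputTokens')
  }
  const uncached = inputTokens - cachedInputTokens
  return (
    (uncached * RATES.inputPerM) / 1_000_000 +
    (cachedInputTokens * RATES.cacheReadPerM) / 1_000_000 +
    (outputTokens * RATES.outputPerM) / 1_000_000
  )
}

// 60,000 uncached + 40,000 cached input tokens, 2,000 output tokens.
const cost = estimateCostUSD({
  inputTokens: 100000,
  cachedInputTokens: 40000,
  outputTokens: 2000,
})
console.log(`$${cost.toFixed(4)}`)
```

In this example the uncached input contributes $0.15, the cached input $0.01, and the output $0.03.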

More models by OpenAI

Model	Context	Latency	Throughput	Input	Output	Cache Read	Cache Write	Web Search	Per Query	Release Date
(name not captured)	(not listed)	1.0s	60tps	$5.00/M	$30.00/M	$0.5/M	—	$10.00/K + input costs	—	04/24/2026
(name not captured)	400K	2.8s	253tps	$0.75/M	$4.50/M	$0.07/M	—	$10.00/K + input costs	—	03/17/2026
(name not captured)	400K	0.6s	55tps	$0.20/M	$1.25/M	$0.02/M	—	$10.00/K + input costs	—	03/17/2026
(name not captured)	128K	0.6s	98tps	$1.25/M	$10.00/M	$0.13/M	—	$10.00/K + input costs	—	11/12/2025
(name not captured)	400K	3.7s	113tps	$0.25/M	$2.00/M	$0.03/M	—	$14/K + input costs	—	08/07/2025
(name not captured)	131K	0.1s	1442tps	$0.35/M	$0.75/M	$0.25/M	—	(not listed)	—	08/05/2025

Values for the Capabilities, Providers, ZDR, and No Training columns were not captured.

What To Consider When Choosing a Provider

Configuration: GPT 5.4 brings the agentic and reasoning leaps from GPT-5.3 Codex into all domains, not just coding.
Configuration: It's faster and more token-efficient than GPT-5.2, meaning lower costs and shorter latencies on comparable tasks.
Zero Data Retention: AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.
Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
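
A minimal sketch of the authentication point: the request carries only the AI Gateway key as a bearer token, with no OpenAI credential anywhere in the application. The endpoint URL below (an OpenAI-compatible chat completions route) and the `buildChatRequest` helper are assumptions for illustration; confirm the base URL and request shape in the documentation before relying on them.

```typescript
// Assumed OpenAI-compatible endpoint for AI Gateway (verify in the docs).
const GATEWAY_URL = 'https://ai-gateway.vercel.sh/v1/chat/completions'

interface GatewayRequest {
  url: string
  init: {
    method: 'POST'
    headers: Record<string, string>
    body: string
  }
}

// Hypothetical helper: build a request authenticated with the gateway key only.
function buildChatRequest(apiKey: string, prompt: string): GatewayRequest {
  if (!apiKey) {
    throw new Error('missing AI Gateway API key')
  }
  return {
    url: GATEWAY_URL,
    init: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        model: 'openai/gpt-5.4',
        messages: [{ role: 'user', content: prompt }],
      }),
    },
  }
}

// The key here is a placeholder; load the real one from your environment.
const request = buildChatRequest('gateway-key-placeholder', 'Why is the sky blue?')
console.log(request.init.headers.Authorization.startsWith('Bearer '))
// To send it: const response = await fetch(request.url, request.init)
```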

When to Use GPT 5.4

Best For

Complex multi-step workflows: Tasks involving tools, research, and pulling from multiple sources
Knowledge work: Reports, spreadsheets, presentations, and analysis across business domains
Advanced code generation: Strong coding performance, backed by the reasoning improvements of the GPT-5.4 generation
Agentic applications: Autonomous agents that coordinate tools, research, and multi-source workflows
General-purpose AI: New projects that benefit from GPT-5.4-generation capabilities

Consider Alternatives When

Cost optimization: GPT-5.4 mini for production workloads where cost efficiency matters
High-volume lightweight tasks: GPT-5.4 nano for classification, routing, and sub-agent workflows
Extended reasoning: GPT-5.4 pro for maximum performance on the most complex tasks
Pure chain-of-thought: o3 for mathematical and scientific reasoning tasks
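
The guidance in the two lists above can be sketched as a small lookup that picks a model per workload. Only `openai/gpt-5.4` appears on this page; the other slugs are guesses at the naming pattern and should be replaced with the slugs shown on each model's own page. The workload categories and the `pickModel` function are likewise illustrative.

```typescript
type Workload =
  | 'general'
  | 'cost-sensitive'
  | 'high-volume-lightweight'
  | 'extended-reasoning'
  | 'math-science'

// Only 'openai/gpt-5.4' is confirmed by this page; the rest are assumed slugs.
const MODEL_BY_WORKLOAD: Record<Workload, string> = {
  'general': 'openai/gpt-5.4',
  'cost-sensitive': 'openai/gpt-5.4-mini',          // assumed slug
  'high-volume-lightweight': 'openai/gpt-5.4-nano', // assumed slug
  'extended-reasoning': 'openai/gpt-5.4-pro',       // assumed slug
  'math-science': 'openai/o3',                      // assumed slug
}

// Return the model slug suggested for a given kind of workload.
function pickModel(workload: Workload): string {
  return MODEL_BY_WORKLOAD[workload]
}

console.log(pickModel('general'))
console.log(pickModel('high-volume-lightweight'))
```

Keeping the mapping in one table makes it easy to re-test a workload against a different tier without touching call sites.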

Conclusion

GPT 5.4 brings agentic reasoning to all domains with improved speed and efficiency over GPT-5.2, available through AI Gateway. It is the standard tier of the GPT-5.4 family.

Frequently Asked Questions

How does GPT 5.4 improve over GPT-5.2?
It extends the agentic and reasoning leaps from GPT-5.3 Codex to all domains. It's also faster and more token-efficient, reducing cost per task.

What context window does GPT 5.4 support?
1.1M tokens, supporting extensive document and codebase processing.

What types of workflows does GPT 5.4 handle well?
Complex multi-step workflows involving tools, research, and pulling from multiple sources. It also handles knowledge work like reports, presentations, and analysis.

Does GPT 5.4 support function calling and structured outputs?
Yes. It supports the full API feature set including function calling, structured outputs, vision, and system messages.

Should I migrate from GPT-5.2?
If your application benefits from improved agentic reasoning and you want better speed and token efficiency, yes. Test with your specific workloads to confirm the improvement.

How does AI Gateway handle authentication for GPT 5.4?
AI Gateway accepts a single API key or OIDC token for all requests. You don't embed OpenAI credentials in your application; AI Gateway routes and authenticates on your behalf.

What are typical latency characteristics?
This page shows live throughput and time-to-first-token metrics measured across real AI Gateway traffic.
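
As an illustration of the function-calling answer above, the sketch below declares one tool in the OpenAI chat completions `tools` format and handles a tool call of the shape that API returns. The `lookup_order` tool, its stubbed result, and the sample call are invented for the example, and whether your client sends this exact body depends on the SDK or endpoint you use.

```typescript
interface ToolDefinition {
  type: 'function'
  function: {
    name: string
    description: string
    parameters: Record<string, unknown> // JSON Schema for the arguments
  }
}

interface ToolCall {
  id: string
  type: 'function'
  function: { name: string; arguments: string } // arguments is a JSON string
}

// Hypothetical tool: look up an order by its ID.
const lookupOrderTool: ToolDefinition = {
  type: 'function',
  function: {
    name: 'lookup_order',
    description: 'Look up the status of an order by its ID.',
    parameters: {
      type: 'object',
      properties: { orderId: { type: 'string' } },
      required: ['orderId'],
      additionalProperties: false,
    },
  },
}

// Request body that advertises the tool to the model.
const requestBody = {
  model: 'openai/gpt-5.4',
  messages: [{ role: 'user', content: 'Where is order A-123?' }],
  tools: [lookupOrderTool],
}

// Validate and execute a tool call; the result is stubbed for illustration.
function handleToolCall(call: ToolCall): string {
  if (call.function.name !== 'lookup_order') {
    throw new Error(`unknown tool: ${call.function.name}`)
  }
  const args = JSON.parse(call.function.arguments) as { orderId?: unknown }
  if (typeof args.orderId !== 'string') {
    throw new Error('orderId must be a string')
  }
  return JSON.stringify({ orderId: args.orderId, status: 'shipped' })
}

// A sample tool call of the kind the model might return (invented data).
const sampleCall: ToolCall = {
  id: 'call_1',
  type: 'function',
  function: { name: 'lookup_order', arguments: '{"orderId":"A-123"}' },
}
console.log(handleToolCall(sampleCall))
```

The string returned by `handleToolCall` would be sent back to the model as a `tool` role message so it can compose the final answer.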
