GPT 5.4 Nano
GPT 5.4 Nano is the smallest and most affordable model in the GPT-5.4 family. It performs close to GPT-5.4 Mini in evaluations at a lower price point, and it is built for high-volume sub-agent workflows.
import { streamText } from 'ai';

const result = streamText({ model: 'openai/gpt-5.4-nano', prompt: 'Why is the sky blue?' });

What To Consider When Choosing a Provider
- Cost and performance: GPT 5.4 Nano performs close to GPT-5.4 Mini in evaluations at a lower price point. Choose it when cost scales with the number of parallel calls.
- Configuration: Like GPT-5.4 Mini, it supports verbosity and reasoning level parameters, giving you control over response detail and reasoning depth per request.
- Zero Data Retention: AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.
- Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
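The verbosity and reasoning level parameters above can be set per request. A minimal sketch of tuning them down for cheap, terse sub-agent calls, assuming the AI SDK's `providerOptions` passthrough to OpenAI; the option names (`reasoningEffort`, `textVerbosity`) are assumptions here, so confirm them against the gateway documentation:

```typescript
// Hypothetical helper mapping a task profile to per-request options.
// Option names are assumptions about how AI Gateway forwards OpenAI
// provider options; verify against the current docs before relying on them.
type Effort = 'minimal' | 'low' | 'medium' | 'high';
type Verbosity = 'low' | 'medium' | 'high';

function nanoOptions(effort: Effort, verbosity: Verbosity) {
  return { openai: { reasoningEffort: effort, textVerbosity: verbosity } };
}

// Usage (network call sketched, not executed here):
// const { text } = await generateText({
//   model: 'openai/gpt-5.4-nano',
//   prompt: 'Triage this support ticket: ...',
//   providerOptions: nanoOptions('minimal', 'low'),
// });
```

Keeping reasoning minimal and verbosity low is usually the right default for routing and classification calls, where you only need a short label back.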
When to Use GPT 5.4 Nano
Best For
- High-volume sub-agent workflows: Parallel calls where cost scales with the number of agents
- Classification and routing: Sentiment analysis, intent detection, and request triage at high volume
- Lightweight code tasks: Simple code checks, unused import detection, and quick validations
- Cost-sensitive batch processing: Large-scale inference where per-call cost is the primary constraint
- Pipeline preprocessing: Fast filtering and extraction steps that feed into larger model calls
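The fan-out pattern behind most of these workloads is simple: one cheap call per input, run in parallel, so total cost scales linearly with the number of inputs. A sketch with the model call left as a pluggable function (swap in a `generateText` call against 'openai/gpt-5.4-nano' in practice):

```typescript
// Fan out one sub-agent call per input and await them in parallel.
// Cost scales with inputs.length, which is where a nano-tier price matters.
async function fanOut<T>(
  inputs: string[],
  callModel: (input: string) => Promise<T>,
): Promise<T[]> {
  return Promise.all(inputs.map((input) => callModel(input)));
}

// Usage with a real gateway call (hypothetical classify helper):
// const labels = await fanOut(tickets, (t) => classifyWithNano(t));
```

Because every input gets its own call, per-call price dominates total spend, which is the scenario the Best For list describes.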
Consider Alternatives When
- Higher capability needed: GPT-5.4 Mini for agentic tasks that require more reliable multi-step completion
- Maximum quality: GPT-5.4 or GPT-5.4 Pro for complex reasoning and analysis
- Specialized coding: GPT-5.3 Codex for autonomous software engineering
- Deep deliberation: o3 for chain-of-thought reasoning on hard problems
Conclusion
GPT 5.4 Nano brings GPT-5.4 generation quality to the most affordable tier. For high-volume sub-agent workflows, classification, and batch processing through AI Gateway, it provides near-mini performance at a fraction of the cost.
Frequently Asked Questions
How does GPT 5.4 Nano compare to GPT-5.4 Mini?
It performs close to GPT-5.4 Mini in evaluations at a lower price point. Choose it when cost scales with the number of parallel calls.
What context window does GPT 5.4 Nano support?
400K tokens, which is substantial for a model at this price tier.
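Even with a 400K-token window, a cheap preflight check helps before batching large inputs. A rough sketch using the common ~4-characters-per-token heuristic (an approximation for sizing only, not a real tokenizer):

```typescript
// Rough preflight: estimate tokens as chars / 4 and reserve headroom for
// the response. This is a heuristic, not a tokenizer; use a real tokenizer
// when accuracy matters.
const CONTEXT_WINDOW = 400_000;

function fitsInContext(prompt: string, reservedForOutput = 8_000): boolean {
  const estimatedTokens = Math.ceil(prompt.length / 4);
  return estimatedTokens + reservedForOutput <= CONTEXT_WINDOW;
}
```

Inputs that fail the check can be chunked and fanned out across multiple nano calls instead of truncated.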
Does GPT 5.4 Nano support the verbosity parameter?
Yes. It supports verbosity and reasoning level parameters, giving you control over response detail and reasoning depth per request.
What tasks is GPT 5.4 Nano designed for?
High-volume sub-agent workflows, classification, routing, lightweight code checks, and batch processing where per-call cost is the dominant concern.
Can GPT 5.4 Nano handle complex reasoning?
For complex multi-step reasoning, GPT-5.4 Mini or the full GPT-5.4 will produce better results. GPT 5.4 Nano is optimized for simpler tasks at high volume.
How does AI Gateway handle authentication for GPT 5.4 Nano?
AI Gateway accepts a single API key or OIDC token for all requests. You don't embed OpenAI credentials in your application; AI Gateway routes and authenticates on your behalf.
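In practice that means authentication is a single environment variable. A sketch assuming the AI SDK's gateway provider reads `AI_GATEWAY_API_KEY` for local development (check the AI Gateway docs if your setup differs):

```shell
# Local development: the gateway provider picks up this key automatically.
# On Vercel deployments, an OIDC token can stand in for the key, so no
# OpenAI credentials are embedded in the application either way.
export AI_GATEWAY_API_KEY="your-gateway-api-key"
```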
What are typical latency characteristics?
Latency varies with load and prompt size. Live throughput and time-to-first-token metrics, measured across real AI Gateway traffic, are published on this model's page rather than quoted as fixed numbers here.