Claude Opus 4.6

Claude Opus 4.6 is the first Opus model with a 1M-token context window. It introduces adaptive thinking, which lets the model decide its own reasoning depth and performs on par with or better than fixed extended-thinking budgets, and it supports interleaved thinking and tool calls in a single response across programming, analysis, and creative tasks.

Tool Use, Reasoning, Vision (Image), File Input, Explicit Caching, Web Search

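import { streamText } from 'ai'

// Stream a response from Claude Opus 4.6 through AI Gateway.
const result = streamText({
  model: 'anthropic/claude-opus-4.6',
  prompt: 'Why is the sky blue?',
  providerOptions: {
    // Options forwarded to the Anthropic provider.
    anthropic: {
      speed: 'fast',
    },
    // Restrict gateway routing to the listed provider slug.
    gateway: {
      only: ['anthropic'],
    },
  },
})

// Print the text as it arrives.
for await (const textPart of result.textStream) {
  process.stdout.write(textPart)
}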

Playground

Try out Claude Opus 4.6 by Anthropic. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.

About Claude Opus 4.6

Claude Opus 4.6 launched on AI Gateway on February 5, 2026, bringing two significant advances to the Opus tier. The first is a 1M-token context window. No earlier Opus model supported this context size, which matches what Sonnet 4.5 gained. For teams that need both Opus-level intelligence and massive context capacity, this closes a key capability gap.

Second, adaptive thinking: a new thinking type parameter (set as thinking: { type: 'adaptive' }) that lets the model decide when and how much to reason, rather than requiring you to specify a fixed thinking budget. Opus 4.6 also supports interleaved thinking and tool calls within a single response. The model can reason, call a tool, reason further about the result, and call another tool, all in one response rather than requiring separate turns. This matters for complex agentic workflows where the best next tool call depends on reasoning about previous results.

Anthropic described the model as built to power agents that handle real-world work, with strength across the entire development lifecycle. To use it, set the model to anthropic/claude-opus-4.6 in the AI SDK, Chat Completions API, Responses API, Messages API, or other API formats. When using the AI SDK, configure providerOptions.anthropic with the thinking and effort parameters as needed.
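As a rough illustration of the configuration described above, the sketch below builds the options object for one request. The `thinking.type` value, the `effort` key, the `max` level, and the model ID come from this page; the `Effort` union (in particular `low` and `medium`) and the `buildOpusRequest` helper are assumptions made for the example, not part of any documented API.

```typescript
// Sketch only: option names mirror the ones this page mentions
// (thinking.type = 'adaptive', effort). The Effort union is an assumption.
type Effort = 'low' | 'medium' | 'high' | 'max'

interface OpusRequestOptions {
  model: string
  prompt: string
  providerOptions: {
    anthropic: {
      thinking: { type: 'adaptive' }
      effort: Effort
    }
  }
}

// Hypothetical helper: builds the options object for a single request.
function buildOpusRequest(prompt: string, effort: Effort = 'high'): OpusRequestOptions {
  return {
    model: 'anthropic/claude-opus-4.6',
    prompt,
    providerOptions: {
      anthropic: {
        thinking: { type: 'adaptive' }, // the model decides when and how much to reason
        effort, // set separately from the thinking configuration
      },
    },
  }
}

const request = buildOpusRequest('Summarize the open issues in this repository.', 'max')
console.log(JSON.stringify(request.providerOptions, null, 2))
```

An object shaped like this could then be passed to `streamText` from the `ai` package, in the same way as the sample request near the top of this page.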

Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.
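A copied provider slug might be turned into routing options roughly as follows. The `only` key appears in this page's own sample request; `order` is assumed here to be the gateway option for a soft preference, and `gatewayRouting` is a hypothetical helper written for this sketch.

```typescript
// Sketch only: `only` is taken from this page's sample request;
// `order` is an assumed option name for expressing a provider preference.
interface GatewayRouting {
  order?: string[]
  only?: string[]
}

// Hypothetical helper: turn a list of provider slugs into gateway options.
// strict = true restricts routing to the slugs; false only expresses a preference.
function gatewayRouting(slugs: string[], strict: boolean): GatewayRouting {
  if (slugs.length === 0) return {} // no preference: leave routing to the gateway
  return strict ? { only: slugs } : { order: slugs }
}

const providerOptions = {
  gateway: gatewayRouting(['anthropic'], true),
}
console.log(JSON.stringify(providerOptions))
```

Keeping the slugs as plain strings means a value copied from the provider table can be pasted in without further mapping.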

Provider	Context	Latency	Throughput	Input	Output	Cache	Web Search	Per Query	Capabilities	ZDR	No Training	Release Date
Legal: Terms • Privacy		1.4s	52tps	$5.00/M	$25.00/M	Read: $0.5/M, Write: $6.25/M	$10/K + input costs	—				02/05/2026
Legal: Terms • Privacy		1.6s	61tps	$5.00/M	$25.00/M	Read: $0.5/M, Write: $6.25/M		—				02/05/2026
Legal: Terms • Privacy		0.8s	51tps	$5.00/M	$25.00/M	Read: $0.5/M, Write: $6.25/M	$10/K + input costs	—				02/05/2026

More models by Anthropic

Model	Context	Latency	Throughput	Input	Output	Cache	Web Search	Per Query	Capabilities	Providers	ZDR	No Training	Release Date
		0.7s	82tps	$5.00/M	$25.00/M	Read: $0.5/M, Write: $6.25/M	$10/K + input costs	—					04/16/2026
		1.1s	55tps	$3.00/M	$15.00/M	Read: $0.3/M, Write: $3.75/M	$10/K + input costs	—					02/17/2026
	200K	0.4s	112tps	$1.00/M	$5.00/M	Read: $0.1/M, Write: $1.25/M	$10.00/K + input costs	—					10/15/2025
		0.9s	57tps	$3.00/M	$15.00/M	Read: $0.3/M, Write: $3.75/M	$10.00/K + input costs	—					09/29/2025
		0.6s	71tps	$3.00/M	$15.00/M	Read: $0.3/M, Write: $3.75/M	$10.00/K + input costs	—					05/22/2025
	200K	0.6s	50tps	$5.00/M	$25.00/M	Read: $0.5/M, Write: $6.25/M	$10.00/K + input costs	—					11/24/2024

What To Consider When Choosing a Provider

Configuration: Requests that use the 1M-token context window carry far more tokens per request than typical API calls, so AI Gateway's per-request cost tracking is especially useful for budgeting large-context workloads.
Zero Data Retention: AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.
Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
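To make the budgeting point concrete, here is a back-of-the-envelope estimate using the per-million-token rates listed in the provider table on this page. It assumes flat rates regardless of request size and ignores web search fees; actual billing may differ. The `estimateCostUsd` helper and the token counts are illustrative only.

```typescript
// Rates in USD per million tokens, copied from the provider table on this page.
// Assumption: flat per-token rates; actual billing may differ.
interface Rates {
  input: number
  output: number
  cacheRead: number
  cacheWrite: number
}

const OPUS_4_6_RATES: Rates = { input: 5.0, output: 25.0, cacheRead: 0.5, cacheWrite: 6.25 }

interface Usage {
  inputTokens: number
  outputTokens: number
  cacheReadTokens?: number
  cacheWriteTokens?: number
}

// Hypothetical helper: estimated cost of one request in USD.
function estimateCostUsd(usage: Usage, rates: Rates = OPUS_4_6_RATES): number {
  const perMillion = (tokens: number, rate: number): number => (tokens * rate) / 1_000_000
  return (
    perMillion(usage.inputTokens, rates.input) +
    perMillion(usage.outputTokens, rates.output) +
    perMillion(usage.cacheReadTokens ?? 0, rates.cacheRead) +
    perMillion(usage.cacheWriteTokens ?? 0, rates.cacheWrite)
  )
}

// A request that fills most of the context window versus a typical short call.
const large = estimateCostUsd({ inputTokens: 800_000, outputTokens: 4_000 })
const small = estimateCostUsd({ inputTokens: 2_000, outputTokens: 500 })
console.log(large.toFixed(2), small.toFixed(4)) // prints "4.10 0.0225"
```

Under these assumptions a single large-context request costs about as much as a few hundred short ones, which is why tracking cost per request rather than per day can be more informative for this kind of workload.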

When to Use Claude Opus 4.6

Best For

Tasks requiring both Opus intelligence and massive context: Entire large codebases, extensive document sets, or long conversation histories where prior Opus models hit context limits
Agentic workflows with interleaved reasoning and tool use: Complex multi-step pipelines where reasoning about tool results before the next call improves outcomes
Workloads where adaptive thinking efficiency matters: Letting the model calibrate its own reasoning depth avoids over-spending thinking tokens on simpler requests in mixed workloads
Programming, analysis, and creative tasks: At Opus depth, the model was specifically described as excelling across all three categories
Agents handling real-world development lifecycle tasks: The announced strength area for this checkpoint

Consider Alternatives When

Cost is the primary constraint: Sonnet 4.6 approaches Opus-level intelligence at a lower cost per token
A smaller context is sufficient: Earlier Opus versions remain available when the 1M-token context window isn't needed
Maximum speed is required: Sonnet and Haiku variants serve latency-sensitive use cases better
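The trade-offs above could be encoded as a small selection rule. This is one possible reading, not a recommendation from the source: only the Opus model ID appears on this page, the Sonnet and Haiku IDs are assumptions formed by analogy, and `pickModel` is a hypothetical helper.

```typescript
// Sketch only. The Opus ID comes from this page; the other two IDs are
// assumptions and should be checked against the gateway's model list.
const MODELS = {
  opus: 'anthropic/claude-opus-4.6',
  sonnet: 'anthropic/claude-sonnet-4.6', // assumed ID
  haiku: 'anthropic/claude-haiku-4.5', // assumed ID
} as const

interface Workload {
  costSensitive: boolean
  latencySensitive: boolean
}

// Hypothetical rule: latency first, then cost, otherwise default to Opus.
function pickModel(w: Workload): string {
  if (w.latencySensitive) return w.costSensitive ? MODELS.haiku : MODELS.sonnet
  if (w.costSensitive) return MODELS.sonnet
  return MODELS.opus
}

console.log(pickModel({ costSensitive: false, latencySensitive: false }))
```

A real router would likely also weigh context size and task difficulty; the two flags here are kept deliberately coarse.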

Conclusion

Claude Opus 4.6 closes the context window gap between the Opus and Sonnet tiers while introducing adaptive thinking, a more efficient approach to reasoning that performs on par with or better than fixed thinking budgets. For teams that need Opus-level intelligence at massive context scale, this is the model.

Frequently Asked Questions

What is adaptive thinking and how does it differ from extended thinking?
Adaptive thinking (thinking: { type: 'adaptive' }) lets the model decide when and how much to reason internally. Unlike extended thinking, you don't set a fixed thinking token budget. The model calibrates reasoning depth per request.
How is the context window of 1M tokens significant for Opus 4.6?
This is the first Opus model to support 1M tokens. Previous Opus models had a 200K context limit. This enables Opus-level analysis of entire large codebases, extensive document collections, or long agent histories in a single request.
Can Opus 4.6 interleave thinking and tool calls in one response?
Yes. Claude Opus 4.6 can interleave thinking and tool calls within a single response, reasoning about a problem, calling a tool, reasoning about the result, and calling another tool, all in one turn.
How do I configure adaptive thinking and effort together in the AI SDK?
Under providerOptions.anthropic in the AI SDK, set thinking.type to adaptive and pass an effort level (for example, max). Set the model to anthropic/claude-opus-4.6.
Does the effort parameter work the same way in Opus 4.6 as in Opus 4.5?
Yes, the effort parameter controls overall token usage across all token types, defaults to a high level, and operates independently of the thinking configuration. Both parameters can be set separately.
What real-world work is Opus 4.6 specifically designed to handle?
Programming, analysis, and creative tasks across the development lifecycle. The hybrid reasoning architecture lets you use extended thinking for complex problems while keeping latency low for straightforward queries.
When was Opus 4.6 made available on AI Gateway?
February 5, 2026.
