DeepSeek V3.1

DeepSeek V3.1 is DeepSeek's August 21, 2025 model update, introducing hybrid inference with selectable thinking and non-thinking modes behind a single endpoint. It strengthens tool use and multi-step agent capabilities relative to DeepSeek-V3.

Reasoning · Tool Use
index.ts

import { streamText } from 'ai'

const result = streamText({
  model: 'deepseek/deepseek-v3.1',
  prompt: 'Why is the sky blue?',
})

What To Consider When Choosing a Provider

  • Configuration: Two usage modes share the same model. Test both thinking and non-thinking paths in your integration to confirm your application correctly interprets response structure under each mode.
  • Zero Data Retention: AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.
  • Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
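
Testing both paths can be sketched as follows, using only built-in fetch against DeepSeek's OpenAI-compatible endpoint; the endpoint path and payload shape are assumptions to verify against DeepSeek's current API reference, and the mode-to-model mapping follows the FAQ below (deepseek-reasoner for thinking, deepseek-chat for non-thinking):

```typescript
// Sketch only: endpoint path and payload shape are assumptions to check
// against DeepSeek's current API reference.
type Mode = 'thinking' | 'non-thinking'

// Map the desired inference mode to a model name. deepseek-reasoner
// selects thinking mode and deepseek-chat selects non-thinking mode on
// the same underlying weights.
function modelForMode(mode: Mode): string {
  return mode === 'thinking' ? 'deepseek-reasoner' : 'deepseek-chat'
}

// Exercise one mode; run this for both modes in integration tests so the
// application is known to handle each mode's response structure.
async function probe(mode: Mode, apiKey: string): Promise<unknown> {
  const res = await fetch('https://api.deepseek.com/chat/completions', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({
      model: modelForMode(mode),
      messages: [{ role: 'user', content: 'Why is the sky blue?' }],
    }),
  })
  return res.json()
}
```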

When to Use DeepSeek V3.1

Best For

  • Mixed agent pipelines: Combine reasoning-heavy steps (tool planning, code generation) with fast-response steps (parsing, classification) through a single endpoint
  • Software engineering automation: SWE-Bench and Terminal-Bench improvements translate to better code generation and execution performance
  • Anthropic API compatibility: Existing Anthropic-format integrations can route to DeepSeek V3.1 with minimal changes
  • Complex multi-step search: The thinking mode's improved efficiency reduces total response latency for multi-step workflows
  • Upgrading from DeepSeek-V3: Backward-compatible API routing plus optional thinking mode

Consider Alternatives When

  • Pure reasoning workloads: DeepSeek-R1 remains the dedicated reasoning specialist
  • Multilingual stability is critical: DeepSeek-V3.1 Terminus addresses reliability issues with Chinese-English code-switching and output consistency
  • Straightforward chat or completion: DeepSeek-V3 may be more cost-efficient for high-volume workloads without hybrid inference needs

Conclusion

DeepSeek V3.1 consolidates thinking and non-thinking modes into a single endpoint, simplifying deployment for reasoning-capable systems. It adds capability over DeepSeek-V3 for agentic and software engineering tasks.

Frequently Asked Questions

  • What does "hybrid inference" mean for DeepSeek V3.1?

    The same model weights support both a thinking mode (extended chain-of-thought) and a non-thinking mode (direct completion). Select the mode per request by calling deepseek-reasoner for thinking or deepseek-chat for non-thinking; no separate model deployment is needed.
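
In practice the two modes differ in response structure: thinking-mode responses carry the chain of thought in a separate reasoning_content field alongside content (field name as used by DeepSeek's reasoner responses; verify against current docs), while non-thinking responses omit it. A minimal parsing sketch:

```typescript
// Shape of the assistant message in a chat-completion choice; the
// reasoning_content field is an assumption based on DeepSeek's published
// reasoner response format.
interface ChoiceMessage {
  content: string
  reasoning_content?: string
}

// Separate the chain of thought (thinking mode only) from the final answer.
function splitAnswer(msg: ChoiceMessage): { reasoning: string | null; answer: string } {
  return {
    reasoning: msg.reasoning_content ?? null, // null in non-thinking mode
    answer: msg.content,
  }
}
```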

  • Is DeepSeek V3.1's thinking mode faster than DeepSeek-R1?

    Yes. DeepSeek-V3.1-Think reaches answers in less time than DeepSeek-R1-0528 on equivalent tasks.

  • Does DeepSeek V3.1 support the Anthropic API format?

    Yes. Existing Anthropic-format integrations can route to DeepSeek V3.1 without additional conversion.
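
A minimal sketch of what this routing looks like with built-in fetch; the base URL (api.deepseek.com/anthropic), the anthropic-version value, and the payload shape are assumptions to verify against DeepSeek's docs, and only the model name changes from a stock Anthropic Messages request:

```typescript
// Assumed Anthropic-compatible base URL; confirm in DeepSeek's docs.
const baseURL = 'https://api.deepseek.com/anthropic'

// An Anthropic Messages API payload, unchanged apart from the model name.
function anthropicPayload(prompt: string) {
  return {
    model: 'deepseek-chat',
    max_tokens: 1024,
    messages: [{ role: 'user', content: prompt }],
  }
}

async function send(prompt: string, apiKey: string): Promise<unknown> {
  const res = await fetch(`${baseURL}/v1/messages`, {
    method: 'POST',
    headers: {
      'content-type': 'application/json',
      'x-api-key': apiKey,
      'anthropic-version': '2023-06-01',
    },
    body: JSON.stringify(anthropicPayload(prompt)),
  })
  return res.json()
}
```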

  • What is strict function calling and is it available in DeepSeek V3.1?

    It's in beta for DeepSeek V3.1. Strict function calling requires tool call arguments to match the provided JSON schema exactly.
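
A sketch of what a strict tool definition might look like. The strict flag, the additionalProperties requirement, and the tool-definition shape below follow the OpenAI-style function-calling format and are assumptions to verify against DeepSeek's beta documentation:

```typescript
// Hypothetical strict tool definition: with strict enabled, the model's
// tool-call arguments must match this JSON schema exactly.
const getWeatherTool = {
  type: 'function',
  function: {
    name: 'get_weather',
    strict: true, // beta: enforce exact schema conformance
    parameters: {
      type: 'object',
      properties: {
        city: { type: 'string' },
        unit: { type: 'string', enum: ['celsius', 'fahrenheit'] },
      },
      required: ['city', 'unit'],
      additionalProperties: false, // strict schemas typically forbid extras
    },
  },
} as const
```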

  • What is the context window for DeepSeek V3.1?

    163,840 tokens (163.8K) in both thinking and non-thinking modes.