Seed 1.6
Seed 1.6 is a sparse Mixture-of-Experts (MoE) model with 23B active parameters out of 230B total, a context window of 256K tokens, and three reasoning modes including adaptive chain-of-thought (CoT) that calibrates thinking depth to question complexity.
```typescript
import { streamText } from 'ai'

const result = streamText({
  model: 'bytedance/seed-1.6',
  prompt: 'Why is the sky blue?',
})
```
What To Consider When Choosing a Provider
- Configuration: If your workload relies on extended thinking, confirm that your chosen provider exposes the full reasoning token budget without truncation. Compare per-token pricing across providers ($0.25 per million input tokens, $2 per million output tokens, where listed).
- Zero Data Retention: AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.
- Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
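At the rates quoted above ($0.25 per million input tokens, $2 per million output tokens), a back-of-envelope cost estimate per request is simple arithmetic. The sketch below assumes those listed rates; actual provider pricing may differ:

```typescript
// Estimate request cost from the listed per-million-token rates.
// These are the rates quoted above, not a pricing guarantee.
const INPUT_RATE_PER_M = 0.25 // USD per 1M input tokens
const OUTPUT_RATE_PER_M = 2.0 // USD per 1M output tokens

function requestCost(inputTokens: number, outputTokens: number): number {
  return (
    (inputTokens / 1_000_000) * INPUT_RATE_PER_M +
    (outputTokens / 1_000_000) * OUTPUT_RATE_PER_M
  )
}

// Example: a 10K-token prompt with a 2K-token response
console.log(requestCost(10_000, 2_000).toFixed(4)) // → "0.0065"
```

Note that thinking tokens from the reasoning modes are billed as output, so FullCoT requests can cost noticeably more than the visible answer length suggests.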
When to Use Seed 1.6
Best For
- Long-document analysis: The context window of 256K tokens lets you ingest entire contracts, research papers, or codebases in one prompt
- Mixed-difficulty workloads: AdaCoT calibrates compute spend automatically, routing complex queries to full reasoning and simple ones to direct response
- Multimodal pipelines: Integrated VLM capability combines text and visual data in the same request
- GUI-based interaction: Tasks that require understanding screenshots or interface layouts alongside natural language instructions
- Competitive academic domains: Mathematics, science, and humanities workloads where benchmark results confirm consistent performance
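For long-document work, it helps to check whether a document plausibly fits the 256K-token window before sending it. The sketch below uses the common ~4-characters-per-token heuristic, which is an approximation, not Seed 1.6's actual tokenizer:

```typescript
const CONTEXT_WINDOW = 256_000 // tokens
const CHARS_PER_TOKEN = 4      // rough heuristic, not the real tokenizer

// Reserve room for the response (the model's output cap is 32K tokens)
// and check the estimated prompt size against the window.
function fitsInContext(text: string, reservedForOutput = 32_000): boolean {
  const estimatedTokens = Math.ceil(text.length / CHARS_PER_TOKEN)
  return estimatedTokens + reservedForOutput <= CONTEXT_WINDOW
}

// A long contract (~400K characters ≈ 100K tokens) fits comfortably
console.log(fitsInContext('x'.repeat(400_000))) // → true
```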
Consider Alternatives When
- Pure text generation: A dense model may offer more predictable routing behavior when no visual input is involved
- Output length limits: Requests needing more than 32K tokens in a single response exceed the model's maximum output length
- Deterministic latency: Pipelines can't tolerate variable thinking token overhead from FullCoT or AdaCoT
- Cost-sensitive workloads: A smaller distilled model may meet your quality bar at lower cost per token
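One practical workaround for the 32K-token output cap is to split large generation tasks into multiple requests. The batch planner below is an illustrative sketch: the per-item token estimates are caller-supplied, and no API call is made here.

```typescript
const MAX_OUTPUT_TOKENS = 32_000

// Group work items into batches whose combined expected output stays
// under the per-response cap. An oversized single item still gets its
// own batch; handling that case upstream is left to the caller.
function planBatches(itemTokenEstimates: number[]): number[][] {
  const batches: number[][] = []
  let current: number[] = []
  let currentTotal = 0
  for (const tokens of itemTokenEstimates) {
    if (currentTotal + tokens > MAX_OUTPUT_TOKENS && current.length > 0) {
      batches.push(current)
      current = []
      currentTotal = 0
    }
    current.push(tokens)
    currentTotal += tokens
  }
  if (current.length > 0) batches.push(current)
  return batches
}

// Three sections estimated at 20K, 20K, and 5K output tokens → 2 requests
console.log(planBatches([20_000, 20_000, 5_000]).length) // → 2
```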
Conclusion
Seed 1.6 combines a context window of 256K tokens, sparse MoE efficiency, and three reasoning modes in one deployment. If you previously needed separate fast and slow models, you can consolidate onto a single endpoint and let AdaCoT manage the tradeoff at inference time.
Frequently Asked Questions
What are the three reasoning modes in Seed 1.6 and when should I use each?
The three modes are FullCoT, NoCoT, and AdaCoT. FullCoT generates an extended chain-of-thought trace before answering, best for complex multi-step problems. NoCoT skips the reasoning trace for direct, low-latency responses. AdaCoT selects between the two based on estimated question difficulty, making it the practical default for mixed workloads.
Does Seed 1.6 support image inputs?
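Conceptually, AdaCoT behaves like a difficulty-based router between the other two modes. The toy heuristic below only illustrates that idea; the real selection is learned inside the model, and the mode labels here are just strings, not API parameters:

```typescript
type ReasoningMode = 'FullCoT' | 'NoCoT'

// Toy stand-in for AdaCoT's learned difficulty estimate: route prompts
// that look multi-step to FullCoT and everything else to NoCoT.
function routeByDifficulty(prompt: string): ReasoningMode {
  const looksHard = /prove|derive|step[- ]by[- ]step|optimi[sz]e/i.test(prompt)
  return looksHard || prompt.length > 500 ? 'FullCoT' : 'NoCoT'
}

console.log(routeByDifficulty('What is the capital of France?'))         // → "NoCoT"
console.log(routeByDifficulty('Prove the triangle inequality step by step')) // → "FullCoT"
```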
Yes. Seed 1.6 incorporates Vision-Language Model (VLM) capabilities. You can process both text and visual data in the same request.
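A mixed text-and-image request in the AI SDK's message shape looks like the sketch below. The message is only constructed here, not sent; whether the Gateway route for this model accepts image parts exactly this way is an assumption to verify against the documentation, and the URL is a placeholder:

```typescript
// Build (but don't send) an AI SDK-style multimodal user message that
// pairs an instruction with a screenshot URL. The image URL is a
// placeholder, and acceptance of image parts by this model's Gateway
// route is an assumption to confirm in the docs.
const messages = [
  {
    role: 'user' as const,
    content: [
      { type: 'text' as const, text: 'Describe the chart in this screenshot.' },
      { type: 'image' as const, image: 'https://example.com/screenshot.png' },
    ],
  },
]

console.log(messages[0].content.length) // → 2
```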
What is parallel decoding in the context of Seed 1.6?
It's a training-free inference enhancement that generates additional thinking tokens without changing the base model weights. It deepens reasoning capacity at inference time, yielding an eight-point improvement on the BeyondAIME benchmark.
How does the sparse MoE architecture affect cost?
Only 23B of the 230B total parameters activate per forward pass. Inference compute stays closer to a 23B dense model than a 230B one. This lowers per-token cost while retaining the representational capacity of the full parameter count.
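The compute saving follows directly from the activation ratio: only 23B of 230B parameters run per token, so per-token FLOPs scale with the active count. A back-of-envelope comparison, using the common rule of thumb of roughly 2 FLOPs per active parameter per generated token (an approximation, not a measured figure):

```typescript
const TOTAL_PARAMS = 230e9  // full parameter count
const ACTIVE_PARAMS = 23e9  // parameters active per forward pass

// Rule of thumb: ~2 FLOPs per active parameter per generated token.
const flopsPerToken = (activeParams: number) => 2 * activeParams

const moeFlops = flopsPerToken(ACTIVE_PARAMS)  // sparse MoE path
const denseFlops = flopsPerToken(TOTAL_PARAMS) // hypothetical dense 230B model

console.log(moeFlops / denseFlops) // → 0.1 (≈10% of the dense compute)
```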
What academic benchmarks were used to evaluate Seed 1.6?
ByteDance evaluated Seed 1.6 on China's 2025 Gaokao (683/750 in humanities, ranked first; 648/750 in science, ranked second, rising to 676/750 with higher-resolution images) and India's JEE Advanced entrance exam (top-10 placement, 100% math accuracy across five sampling rounds). See https://console.byteplus.com/ark/region:ark+ap-southeast-1/model/detail?Id=seed-1-6 for the full tables.
Is Seed 1.6 available for commercial use through AI Gateway?
Yes. You can access Seed 1.6 from ByteDance via AI Gateway with an API key or OIDC token. You don't manage upstream provider credentials yourself.