
GLM 4.5

GLM 4.5 is Z.ai's full-scale model released July 28, 2025, unifying reasoning, coding, and agentic capabilities in a single endpoint. Available through AI Gateway with built-in observability and intelligent provider routing.

Reasoning · Tool Use · Implicit Caching
index.ts
import { streamText } from 'ai'

const result = streamText({
  model: 'zai/glm-4.5',
  prompt: 'Why is the sky blue?',
})

// Consume the stream so the tokens are actually printed
for await (const chunk of result.textStream) {
  process.stdout.write(chunk)
}

What To Consider When Choosing a Provider

  • Configuration: GLM 4.5 supports a context window of 131.1K tokens and accepts up to 131.1K tokens per request. For reasoning-heavy tasks with thinking enabled, budget extra output tokens for the chain-of-thought traces that precede the final answer.
  • Configuration: Test both thinking-enabled and thinking-disabled modes. Thinking mode improves accuracy on complex reasoning but increases latency and token usage. Disable it for straightforward generation tasks.
  • Configuration: When using AI Gateway, configure fallback providers to maintain availability. GLM 4.5 is available through the zai and novita providers.
  • Zero Data Retention: AI Gateway does not currently support Zero Data Retention for this model. See the documentation for models that support ZDR.
  • Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
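For the thinking and fallback settings above, here is a minimal sketch of a request builder. It assumes AI Gateway forwards Z.ai's `thinking` parameter through `providerOptions.zai` and honors a `gateway.order` list for provider fallback; verify both keys against the AI Gateway documentation, and note that `glmRequest` is a hypothetical helper, not part of the SDK.

```typescript
// Hypothetical helper: builds the options object for streamText/generateText.
// Assumes providerOptions.zai.thinking and providerOptions.gateway.order are
// forwarded by AI Gateway -- check the gateway docs before relying on them.
type ThinkingMode = 'enabled' | 'disabled'

function glmRequest(prompt: string, thinking: ThinkingMode) {
  return {
    model: 'zai/glm-4.5',
    prompt,
    providerOptions: {
      // Toggle chain-of-thought per request; disable for simple generation
      zai: { thinking: { type: thinking } },
      // Try zai first, fall back to novita for availability
      gateway: { order: ['zai', 'novita'] },
    },
  }
}

// A straightforward generation task: thinking off to cut latency and tokens
const options = glmRequest('Summarize this changelog in one sentence.', 'disabled')
```

Pass `options` to `streamText` from `'ai'`; flip the mode to `'enabled'` for reasoning-heavy prompts.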

When to Use GLM 4.5

Best For

  • General-purpose reasoning and coding: Unified capability across math, logic, and code generation reduces the need for task-specific models
  • Agentic workflows: Multi-step planning, tool use, and configurable thinking run within a single model
  • Long-document analysis: The context window of 131.1K tokens fits contracts, research papers, or large codebases
  • Production deployments: Built-in observability and automatic retries through AI Gateway reduce operational overhead
  • Cost-conscious teams: Compare listed rates against alternative providers when evaluating total spend
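The agentic pattern above can be sketched as a single tool-dispatch step. `getWeather` is a made-up illustrative tool; in a real application you would declare tools with the AI SDK's `tool()` helper and let the model decide when to call them during a multi-step run.

```typescript
// Minimal sketch of one step in an agentic tool-use loop.
// The model emits a tool call; the host app dispatches it and
// feeds the observation back for the next planning step.
type ToolCall = { name: string; args: Record<string, unknown> }

// Stub tool registry for illustration only
const tools: Record<string, (args: any) => string> = {
  getWeather: ({ city }) => `Sunny in ${city}`,
}

// Dispatch a tool call the model requested
function runToolCall(call: ToolCall): string {
  const fn = tools[call.name]
  if (!fn) throw new Error(`Unknown tool: ${call.name}`)
  return fn(call.args)
}

const observation = runToolCall({ name: 'getWeather', args: { city: 'Paris' } })
// → "Sunny in Paris"
```

The observation string would be appended to the conversation as a tool result, and GLM 4.5's configurable thinking lets you spend more or less reasoning on each planning step.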

Consider Alternatives When

  • Lightweight high-volume alternative: GLM-4.5-Air offers reduced latency and cost for less demanding workloads
  • Vision or multimodal input: GLM-4.5V adds image understanding on top of the GLM-4.5 foundation
  • Code generation focus: GLM-4.6 and later models include targeted coding improvements
  • Deeper reasoning and planning: GLM-5 introduces multiple thinking modes and improved long-range planning

Conclusion

GLM 4.5 is Z.ai's full-capability model in the GLM-4.5 generation, balancing reasoning depth, coding proficiency, and agentic flexibility. For teams that need a single model covering a broad range of tasks with configurable thinking, it's available through AI Gateway with unified billing and observability.

Frequently Asked Questions

  • What is the difference between GLM 4.5 and GLM-4.5-Air?

    GLM 4.5 is the full-scale model optimized for maximum capability across reasoning, coding, and agentic tasks. GLM-4.5-Air is a lighter variant designed for lower latency and cost on less demanding workloads.

  • Does GLM 4.5 support configurable thinking?

    Yes. You can enable or disable chain-of-thought reasoning per request. Thinking mode improves accuracy on complex tasks but increases output length and latency.

  • What is the context window for GLM 4.5?

    131.1K tokens, supporting long documents, extended conversations, and multi-file code analysis in a single request.

  • How much does GLM 4.5 cost through AI Gateway?

    Pricing appears on this page and updates as providers adjust their rates. AI Gateway routes traffic through the configured provider.

  • How do I authenticate with GLM 4.5 through AI Gateway?

    AI Gateway provides a unified API key. You don't need a separate Z.ai account. Configure your API key in your environment, then use the model identifier to route requests. BYOK is also supported if you have a direct provider account.
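    As a sketch of that setup, assuming the gateway reads its key from the `AI_GATEWAY_API_KEY` environment variable (confirm the variable name against the AI Gateway documentation):

    ```
    # .env -- the unified gateway key; no separate Z.ai credentials required
    AI_GATEWAY_API_KEY=your-gateway-api-key
    ```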

  • Can I use GLM 4.5 for agentic applications with tool use?

    Yes. GLM 4.5 supports agentic workflows with multi-step planning and tool use. The configurable thinking mode lets you control reasoning depth per step in your pipeline.

  • What providers serve GLM 4.5 through AI Gateway?

    GLM 4.5 is available through the zai and novita providers. AI Gateway handles intelligent routing and automatic retries across configured providers.