Chinese Open Source AI Models GLM-4.5 and GLM-4.5-Air Show Top Performance
Another week in the summer of 2025 has begun, and in a continuation of the trend from last week, more powerful Chinese open source AI models are arriving.
Little-known (to us in the West) Chinese startup Z.ai has introduced two new open source LLMs — GLM-4.5 and GLM-4.5-Air — positioning them as solutions for AI reasoning, agentic behavior, and coding.
According to Z.ai’s blog post, these models perform near the top of proprietary LLM benchmarks, matching or surpassing models like Claude 4 Sonnet, Claude 4 Opus, and Gemini 2.5 Pro.
Both models support reasoning, coding, and agentic capabilities. The flagship GLM-4.5 outperforms many leading models on evaluations such as BrowseComp, AIME24, and SWE-bench Verified, ranking third overall across multiple tests.
The AI Impact Series Returns to San Francisco - August 5
The next phase of AI is here - are you ready? Join leaders from Block, GSK, and SAP for an exclusive look at how autonomous agents are reshaping enterprise workflows—register here.
Its lighter-weight sibling, GLM-4.5-Air, also ranks highly, providing strong performance with lower resource needs.
Both models have dual modes: a thinking mode for complex tasks and a non-thinking mode for quick responses. They can generate complete PowerPoint presentations from prompts, enhancing meeting prep, education, and report creation.
They support creative writing, emotional copywriting, script generation for social media and web content, virtual character development, and turn-based dialogue systems for customer support, roleplaying, or storytelling.
GLM-4.5-Air is designed for teams needing a lighter model with faster inference and lower costs. It supports specialized versions like GLM-4.5-X, GLM-4.5-AirX for ultra-fast inference, and GLM-4.5-Flash for coding tasks.
Available on Z.ai and via the Z.ai API, the models’ code is also on HuggingFace and ModelScope, with support for inference through tools like vLLM and SGLang.
Licensing and API Pricing
Both models are released under Apache 2.0 license, allowing free use, modification, self-hosting, and redistribution for research and commercial use.
API prices are roughly:
- GLM-4.5: $0.60 / $2.20 per million tokens
- GLM-4.5-Air: $0.20 / $1.10 per million tokens
CTSNBC reports that costs are as low as $0.11 / $0.28 per million tokens for input/output, applicable up to 32,000 tokens input/output at a time.

The detailed pricing tables indicate costs vary with batch size, with lower costs for larger batches.

Note: As Z.ai is based in China, Western users should consider data sovereignty issues and local policies before using the API.
Performance on Benchmarks Approaching Leading Proprietary Models

GLM-4.5 ranks third among 12 industry benchmarks for reasoning, reasoning, and coding, trailing only GPT-4 and Grok 4. GLM-4.5-Air places sixth.
In agentic tasks, GLM-4.5 matches Claude 4 Sonnet and outperforms Claude 4 Opus, with notable scores on BrowseComp and other reasoning benchmarks. It also excels in coding success rates and tool-calling reliability, surpassing competitors.
Part of the Chinese Open-Source Wave
This release aligns with the surge of open-source LLMs from China, notably from Alibaba’s Qwen team, which launched four models recently, including Qwen3-235B-A22B-Thinking, outperforming some US models in reasoning.
Alibaba also released Wan 2.2, a new open source video model. Similar to Z.ai’s models, Alibaba’s are licensed under Apache 2.0, supporting commercial use and self-hosting.
This trend supports Chinese firms’ strategy to promote open-source AI as an alternative to US-based closed models, adding competitive pressure and fostering innovation.
Meanwhile, efforts from Meta and OpenAI reflect the broader global AI landscape, with Meta’s Llama 4 encountering mixed reviews and OpenAI’s new open model delayed.
Architecture and Training Insights
GLM-4.5 uses 355 billion total and 32 billion active parameters, with its lighter sibling GLM-4.5-Air at 106 billion total and 12 billion active parameters. Both employ Mixture-of-Experts architecture, optimized with advanced techniques like loss-free routing and sigmoid gating.
Pre-training involved 22 trillion tokens, with additional 1.1 trillion tokens added during mid-training from code, reasoning, and long-context sources. The training used reinforcement learning with In-house infrastructure, mixed-precision rollout, and adaptive curriculum learning.
According to CNBC, the models run efficiently on just eight Nvidia H20 GPUs, a hardware choice made to comply with export restrictions.
Interactive Demos
Examples include a Flappy Bird clone, Pokémon Pokédex web app, and structured document or web query-based slide decks. These are accessible via the Z.ai platform and API, allowing users to create and interact with AI-powered artifacts.

The models can be accessed seamlessly through the API and chat platform for interactive experiments.
About Z.ai
Founded in 2019 as Zhipu, Z.ai has grown into a leading Chinese AI startup, backed by over $1.5 billion from major investors including Alibaba, Tencent, and others. Its GLM-4.5 release coincided with the Shanghai AI Conference, highlighting China’s advances in AI technology. The company has been featured in reports on Chinese AI progress and is on the U.S. entity list, affecting its international business.
Implications for Enterprise Leaders
The availability of GLM-4.5 under an open-source license provides enterprise teams with a high-performance, controllable alternative to proprietary systems, facilitating deployments across cloud, private, or on-premises environments. Its advanced features, flexible interfaces, and competitive pricing make it a compelling choice for organizations aiming to innovate while maintaining control over their AI infrastructure.