Alibaba Launches Qwen 3.5: Open-Weight Model Claims to Beat GPT-5.2 and Claude on 80% of Benchmarks
Alibaba releases Qwen 3.5, a 397-billion-parameter mixture-of-experts model with visual agentic capabilities, claiming benchmark superiority over GPT-5.2, Claude Opus 4.5, and Gemini 3 Pro while costing 60% less to run than its predecessor.
On February 16, 2026, Alibaba released Qwen 3.5, a mixture-of-experts model with 397 billion total parameters and 17 billion active parameters per inference pass. Alibaba claims the model outperforms OpenAI's GPT-5.2, Anthropic's Claude Opus 4.5, and Google's Gemini 3 Pro on 80% of evaluated benchmarks — though these are self-reported results that have not been independently verified.
Architecture and Performance
Qwen 3.5 uses a mixture-of-experts (MoE) architecture, which routes each input token to a small subset of the model's parameters rather than activating all of them. With 17 billion active parameters out of 397 billion total, the model achieves a favorable trade-off between capability and compute cost: according to Alibaba, it delivers frontier-class performance while costing 60% less to run and processing large workloads 8x faster than its predecessor.
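The routing idea can be shown in a minimal sketch. This assumes a standard top-k gating scheme; the expert count, dimensions, and linear "experts" below are illustrative toys, not Qwen 3.5's actual configuration (real experts are full feed-forward blocks):

```python
import numpy as np

def moe_forward(x, experts, gate, k=2):
    """Route input x to the top-k experts by gate score.

    Only k experts run per token — the mechanism by which a model with
    397B total parameters can activate only ~17B on each forward pass.
    """
    scores = gate @ x                         # one score per expert
    top_k = np.argsort(scores)[-k:]           # indices of the k best experts
    probs = np.exp(scores[top_k])
    probs /= probs.sum()                      # softmax over the selected experts
    # Weighted sum of just the chosen experts' outputs.
    return sum(p * experts[i](x) for p, i in zip(probs, top_k))

rng = np.random.default_rng(0)
d, n_experts = 8, 16
weights = [rng.standard_normal((d, d)) for _ in range(n_experts)]
experts = [lambda x, W=W: W @ x for W in weights]   # toy linear experts
gate = rng.standard_normal((n_experts, d))

x = rng.standard_normal(d)
y = moe_forward(x, experts, gate, k=2)
print(y.shape)  # (8,)
```

With k=2 of 16 experts active, only 1/8 of the expert parameters are touched per token, while the gate still lets different tokens reach different specialists.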
The self-reported benchmark scores are competitive with the best Western closed-source models: 83.6 on LiveCodeBench v6 (coding), 91.3 on AIME26 (mathematical reasoning), and 88.4 on GPQA Diamond (graduate-level science). These scores place Qwen 3.5 in the same performance tier as GPT-5.2 and Claude Opus 4.5 on standardized evaluations, though benchmark performance does not always translate linearly to real-world application quality.
Visual Agentic Capabilities
Qwen 3.5 introduces what Alibaba calls "visual agentic capabilities" — the ability to autonomously interact with mobile and desktop applications by observing screen content and generating actions. This positions Qwen 3.5 alongside Anthropic's computer use feature and Google's Project Astra as a model that can operate software interfaces rather than just generate text about them. The practical applications include automated testing, workflow automation, and accessibility assistance.
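Models in this category typically run an observe-act loop: screenshot the interface, ask the model for the next action, execute it, repeat. The sketch below is a hypothetical harness in that pattern; `capture_screen`, the model callable, and the action names are illustrative stand-ins, not Alibaba's actual API:

```python
from dataclasses import dataclass

@dataclass
class Action:
    kind: str          # e.g. "click", "type", or "done"
    payload: str = ""

def run_agent(model, capture_screen, execute, max_steps=10):
    """Observe-act loop: screenshot -> model proposes an action -> execute,
    until the model signals completion or the step budget runs out."""
    for _ in range(max_steps):
        screenshot = capture_screen()
        action = model(screenshot)
        if action.kind == "done":
            return True
        execute(action)
    return False

# Toy stand-in model: clicks one button, then reports it is finished.
state = {"clicks": 0}
def toy_model(screenshot):
    return Action("click", "submit-button") if state["clicks"] == 0 else Action("done")
def toy_execute(action):
    state["clicks"] += 1

done = run_agent(toy_model, lambda: "pixels", toy_execute)
print(done, state["clicks"])  # True 1
```

The step budget matters in practice: an agent that misreads the screen can loop indefinitely, so production harnesses cap steps and log every action for review.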
Language Coverage
The model supports 201 languages and dialects, up from 82 in the previous Qwen generation. This expansion significantly broadens the model's addressable market in regions that are underserved by English-centric Western models, and is consistent with Alibaba's commercial interest in serving customers across Asia, the Middle East, Africa, and Latin America through its cloud and e-commerce platforms.
Open Weights and Market Impact
Qwen 3.5 is available as an open-weight model, meaning the trained parameters are publicly downloadable and can be run, fine-tuned, and deployed by any organization. This positions it as a direct alternative to Meta's Llama series and a challenge to the closed-source models from OpenAI and Anthropic that require API access and per-token pricing.
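What "open weights" means operationally can be shown in miniature: the parameters are files you download once, then load, modify, and serve locally with no API or per-token billing. The file name and tiny linear layer below are illustrative, not Qwen's release format (real releases ship sharded safetensors checkpoints plus a config and tokenizer):

```python
import numpy as np

# "Publish" the weights: in an open-weight release these are just files.
weights = {"w": np.full((4, 4), 0.5), "b": np.zeros(4)}
np.savez("toy_weights.npz", **weights)

# Any organization can download and load them...
loaded = np.load("toy_weights.npz")
w, b = loaded["w"].copy(), loaded["b"]

# ...fine-tune them locally (here, a trivial parameter update)...
w += 0.01

# ...and run inference on their own hardware.
x = np.ones(4)
y = w @ x + b
print(y)
```

Closed-source models expose only the inference endpoint, so the fine-tuning and local-deployment steps above are exactly what per-token API access rules out.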
The release timing — on the eve of the Chinese Lunar New Year — was strategic, landing ahead of an anticipated DeepSeek V4 release. The Chinese AI ecosystem is now producing multiple frontier-competitive models per quarter, compressing the capability gap between Chinese and American labs and intensifying the competitive pressure on pricing, performance, and openness across the global AI market.
Related Articles
NVIDIA GTC 2026 Keynote: Jensen Huang Unveils Vera Rubin Platform and Six New Chips
NVIDIA CEO Jensen Huang opened GTC 2026 in San Jose with the formal unveiling of the complete Vera Rubin GPU platform — six new chips featuring 288 GB of HBM4 memory, 336 billion transistors, and 50 PetaFLOPS of FP4 performance. Over 30,000 attendees from 190 countries gathered for the AI industry's most anticipated annual event.
OpenAI Acquires Promptfoo to Strengthen AI Agent Security and Red-Teaming
OpenAI has agreed to acquire Promptfoo, the open-source AI security and red-teaming platform used by over 25% of the Fortune 500, in a deal that will integrate the tool directly into OpenAI's enterprise agent platform. The acquisition signals OpenAI's growing focus on safety infrastructure as it pushes deeper into autonomous AI agent deployment.
NVIDIA Releases Nemotron 3 Super: Open 120B-Parameter Model Targets Enterprise Agentic AI
NVIDIA has released Nemotron 3 Super, a 120-billion-parameter open-weights model built on a hybrid Mamba-Transformer architecture with a one-million-token context window. The model delivers 5x throughput improvements over its predecessor and is designed specifically for enterprise agentic AI workflows.