OpenAI Unveils o3 Model with Breakthrough Reasoning Capabilities
OpenAI announced the o3 model family during its "12 Days of OpenAI" event, achieving 96.7% on AIME 2024 and 25.2% on the Frontier Math benchmark.
OpenAI announced the o3 model family on December 20, 2024, during its "12 Days of OpenAI" event. The new models demonstrate breakthrough performance on reasoning benchmarks, including a remarkable 25.2% score on the Frontier Math benchmark where no previous model exceeded 2%.
Benchmark Results
The o3 model achieves unprecedented scores across multiple benchmarks:
- AIME 2024: 96.7% (missing only one question on the American Invitational Mathematics Exam)
- GPQA Diamond: 87.7% on graduate-level science questions
- Frontier Math: 25.2% (previous best was under 2%)
- ARC-AGI: Significant improvements on abstract reasoning tasks
Model Family
Like its predecessor o1, the o3 release includes multiple variants:
- o3: Full-scale model for complex reasoning tasks
- o3-mini: Smaller, faster model optimized for specific use cases
Adjustable Reasoning Time
A key innovation in o3 is the ability to adjust reasoning compute. Users can set the model to low, medium, or high compute modes, trading off speed for accuracy based on task requirements.
Naming Convention
OpenAI skipped "o2" to avoid trademark conflicts with the O2 mobile carrier brand, jumping directly from o1 to o3.
Access and Availability
OpenAI initially opened applications for safety and security researchers to test o3 before January 10, 2025. The o3-mini model was released to all ChatGPT users on January 31, 2025, with the full o3 model following in April 2025.
Implications
The o3 results suggest significant progress in AI reasoning capabilities, particularly for mathematical and scientific problems. However, researchers note that benchmark performance doesn't always translate directly to real-world utility.
Related Articles
NVIDIA GTC 2026 Keynote: Jensen Huang Unveils Vera Rubin Platform and Six New Chips
NVIDIA CEO Jensen Huang opened GTC 2026 in San Jose with the formal unveiling of the complete Vera Rubin GPU platform — six new chips featuring 288 GB of HBM4 memory, 336 billion transistors, and 50 PetaFLOPS of FP4 performance. Over 30,000 attendees from 190 countries gathered for the AI industry's most anticipated annual event.
OpenAI Acquires Promptfoo to Strengthen AI Agent Security and Red-Teaming
OpenAI has agreed to acquire Promptfoo, the open-source AI security and red-teaming platform used by over 25% of the Fortune 500, in a deal that will integrate the tool directly into OpenAI's enterprise agent platform. The acquisition signals OpenAI's growing focus on safety infrastructure as it pushes deeper into autonomous AI agent deployment.
NVIDIA Releases Nemotron 3 Super: Open 120B-Parameter Model Targets Enterprise Agentic AI
NVIDIA has released Nemotron 3 Super, a 120-billion-parameter open-weights model built on a hybrid Mamba-Transformer architecture with a one-million-token context window. The model delivers 5x throughput improvements over its predecessor and is designed specifically for enterprise agentic AI workflows.