11 Oct 2025

GPT-5 Pro Outperforms Gemini 2.5 in Math as OpenAI Drops $7B on AI R&D

Model Benchmarks & Performance

GPT-5 Pro Surpasses Gemini 2.5 Deep Think in FrontierMath Tier 4: OpenAI’s GPT-5 Pro scored 13% on the FrontierMath Tier 4 benchmark, one percentage point ahead of Google’s Gemini 2.5 Deep Think (12%) and well ahead of Grok 4 Heavy, according to Epoch AI Research. The margin over Gemini is narrow, but the result puts GPT-5 Pro at the top of this tier.

GPT-5 Pro Tops FrontierMath Tier 4, Beating Gemini 2.5 Deep Think
- EpochAIResearch Tweet

AI Research & Development

OpenAI’s $7B Compute Spend in 2024: Heavy Focus on R&D: According to Epoch, OpenAI spent roughly $7 billion on compute in 2024 and put most of it toward research and development, including experimental runs and unreleased models. Only a small portion went to final training runs.

Epoch: OpenAI spent ~$7B on compute last year, mostly R&D

Open-Source Models & Hardware Optimization

OpenAI Releases Open-Weight LLMs (gpt-oss-20b & gpt-oss-120b): OpenAI introduced gpt-oss-20b and gpt-oss-120b, its first open-weight models since GPT-2. A team is developing "gpt-oss-amd," a C++ implementation of these models optimized for AMD GPUs that aims to maximize inference throughput without external dependencies.

GPT-OSS from Scratch on AMD GPUs
- GitHub: gpt-oss-amd

SVD Distillation Experiment on GLM-4.5-Air & GLM-4.6: A user applied Singular Value Decomposition (SVD) to distill GLM-4.5-Air into a smaller model, but the resulting model's output lacked coherence. The experiment illustrates how hard it is to compress a model while preserving its performance.

Real SVD GLM-4.5-Air-GLM-4.6-Distill
- Hugging Face: GLM-4.5-Air Distill
- Hugging Face: LoRA Variant
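
For readers unfamiliar with the technique, here is a minimal sketch of SVD-based compression of a single weight matrix. It is not the user's actual procedure, which the post does not describe. The matrix is random stand-in data rather than real GLM weights, and the function names `low_rank_factors` and `relative_error` are hypothetical.

```python
import numpy as np


def low_rank_factors(weight: np.ndarray, rank: int):
    """Return (a, b) such that a @ b is the best rank-`rank` approximation of `weight`."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    a = u[:, :rank] * s[:rank]  # fold the kept singular values into the left factor
    b = vt[:rank, :]
    return a, b


def relative_error(weight: np.ndarray, a: np.ndarray, b: np.ndarray) -> float:
    """Frobenius-norm error of the approximation, relative to the original matrix."""
    return float(np.linalg.norm(weight - a @ b) / np.linalg.norm(weight))


# Assumption: a random 64x48 matrix stands in for one layer's weights.
rng = np.random.default_rng(0)
w = rng.standard_normal((64, 48))

a, b = low_rank_factors(w, rank=8)
print("original parameters:", w.size)
print("factored parameters:", a.size + b.size)
print(f"relative error at rank 8: {relative_error(w, a, b):.3f}")
```

Truncation throws away the smallest singular values. When that discarded tail still carries a large share of the matrix's energy, the approximation error is large, which is one plausible way a compressed model can lose coherence.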

AI Tools & Developer Workflows

Preference-Aware Routing for Claude Code 2.0 via Arch Gateway: The Arch-Router team extended its multi-LLM routing system to Claude Code 2.0, so that different tasks (e.g., code generation, reviews, debugging) can each be assigned to a specific model within a single CLI agent.

Preference-aware routing for Claude Code 2.0
- GitHub: Arch Gateway
- Demo: Claude Code Router
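
The idea of task-specific model assignment can be sketched in a few lines. This is an illustration only: it does not use Arch Gateway's configuration format or API, the keyword matcher is a crude stand-in for a learned router, and the model names and helper functions are hypothetical placeholders.

```python
from typing import Optional

# Hypothetical preferences: task label -> model name (placeholders, not real models).
ROUTES = {
    "code generation": "model-a",
    "code review": "model-b",
    "debugging": "model-c",
}
DEFAULT_MODEL = "model-default"

# Assumption: simple keyword hints stand in for a real task classifier.
KEYWORDS = {
    "code generation": ("write", "implement", "generate"),
    "code review": ("review", "feedback"),
    "debugging": ("debug", "traceback", "error", "fix"),
}


def classify(prompt: str) -> Optional[str]:
    """Return the first task whose keywords appear in the prompt, or None."""
    text = prompt.lower()
    for task, words in KEYWORDS.items():
        if any(word in text for word in words):
            return task
    return None


def route(prompt: str) -> str:
    """Pick the preferred model for the prompt's task, falling back to a default."""
    return ROUTES.get(classify(prompt), DEFAULT_MODEL)


for prompt in ("Implement a binary search", "Please review this pull request"):
    print(prompt, "->", route(prompt))
```

Keeping the preferences in a plain mapping separates the question of which task a prompt belongs to from the question of which model should handle that task, so either can change without touching the other.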