1 min read

**GPT-5 Pro Outperforms Gemini 2.5 in Math as OpenAI Drops $7B on AI R&D**

Model Benchmarks & Performance

GPT-5 Pro Surpasses Gemini 2.5 Deep Think in FrontierMath Tier 4: OpenAI’s GPT-5 Pro achieved a 13% score in the FrontierMath Tier 4 benchmark, outperforming Google’s Gemini 2.5 Deep Think (12%) and significantly surpassing Grok 4 Heavy. The result highlights advancements in AI’s mathematical problem-solving capabilities.


AI Research & Development

OpenAI’s $7B Compute Spend in 2024: Heavy Focus on R&D: OpenAI allocated the majority of its $7 billion compute budget to research and development, including experimental runs and unreleased models, with only a small portion used for final training runs. This underscores the company’s aggressive investment in advancing AI capabilities.


Open-Source Models & Hardware Optimization

OpenAI Releases Open-Weight LLMs (gpt-oss-20b & gpt-oss-120b): OpenAI introduced its first open-weight models since GPT-2, with a team developing "gpt-oss-amd," a C++ implementation optimized for AMD GPUs to maximize inference throughput without external dependencies.


SVD Distillation Experiment on GLM-4.5-Air & GLM-4.6: A user tested Singular Value Decomposition (SVD) to distill GLM-4.5-Air into a smaller model, but the output lacked coherence. The experiment highlights challenges in model compression while preserving performance.


AI Tools & Developer Workflows

Preference-Aware Routing for Claude Code 2.0 via Arch Gateway: The Arch-Router team extended its multi-LLM routing system to Claude Code 2.0, enabling task-specific model assignments (e.g., code generation, reviews, debugging) within a single CLI agent for optimized coding workflows.