OpenAI’s $450B Server Push & World Model Race as LongCat Smashes SOTA Reasoning

New AI Models & Research Breakthroughs

LongCat-Flash-Thinking Achieves SOTA Performance in Reasoning Tasks: The new open-source model, LongCat-Flash-Thinking, achieves state-of-the-art results on logic, math, coding, and agent benchmarks while using 64.5% fewer tokens to reach top-tier accuracy on AIME25. Its asynchronous RL training infrastructure delivers a 3x speedup, and the model is optimized for agent-friendly applications.


Baidu Releases Qianfan-VL Multimodal Models (70B/8B/3B): Baidu’s new Qianfan-VL series supports OCR, document understanding, and chain-of-thought reasoning. The 70B/8B models use Llama 3.1 architecture, while the 3B variant is based on Qwen2.5-3B, all featuring an InternViT-based vision encoder.


OpenAI Developing Foundational World Model (Competing with Google’s Genie 3): OpenAI is actively researching a world model, a foundational AI system capable of simulating real-world environments, signaling a shift beyond traditional language models.


AI Infrastructure & Compute Investments

OpenAI Plans $450B Server Spending by 2030 (Including $100B for Backup Cloud): To relieve compute bottlenecks, OpenAI plans to spend $450B on server rentals through 2030, with $100B reserved for backup cloud capacity, aiming to accelerate research and product scaling.


Cost-Efficient AI Architectures

Major AI Labs May Release Ultra-Cheap Models (Inspired by Grok-4-fast): Labs such as Google, OpenAI, and Anthropic could adopt Jet-Nemotron-based architectures to slash model costs while maintaining performance, potentially enabling larger context windows at lower prices.


Developer Tools & Updates

Codex CLI Adds "/limits" Command for Rate Limit Monitoring: The upcoming Codex CLI update will introduce a "/limits" command for tracking hourly and weekly usage, with visual warnings when usage approaches its thresholds.