OpenAI’s $450B Server Push & World Model Race as LongCat Smashes SOTA Reasoning

New AI Models & Research Breakthroughs

LongCat-Flash-Thinking Achieves SOTA Performance in Reasoning Tasks: The new open-source model, LongCat-Flash-Thinking, achieves state-of-the-art results on logic, math, coding, and agent benchmarks while using 64.5% fewer tokens to reach top-tier accuracy on AIME25. Its asynchronous RL training infrastructure delivers a 3x speedup, and the model is optimized for agent-friendly applications.


Baidu Releases Qianfan-VL Multimodal Models (70B/8B/3B): Baidu’s new Qianfan-VL series supports OCR, document understanding, and chain-of-thought reasoning. The 70B/8B models use Llama 3.1 architecture, while the 3B variant is based on Qwen2.5-3B, all featuring an InternViT-based vision encoder.


OpenAI Developing Foundational World Model (Competing with Google’s Genie 3): OpenAI is actively researching a world model, a foundational AI system capable of simulating real-world environments, signaling a shift beyond traditional language models.


AI Infrastructure & Compute Investments

OpenAI Plans $450B Server Spending by 2030 (Including $100B for Backup Cloud): To relieve compute bottlenecks, OpenAI plans to spend $450B on server rentals through 2030, with $100B reserved for backup cloud capacity, aiming to accelerate research and product scaling.


Cost-Efficient AI Architectures

Major AI Labs May Release Ultra-Cheap Models (Inspired by Grok-4-fast): Labs such as Google, OpenAI, and Anthropic could adopt Jet-Nemotron-based architectures to slash model costs while maintaining performance, potentially enabling larger context windows at lower prices.


Developer Tools & Updates

Codex CLI Adds "/limits" Command for Rate Limit Monitoring: The upcoming Codex CLI update will introduce a "/limits" command for tracking hourly and weekly usage, with visual warnings when usage approaches its thresholds.