12 Feb 2026

GLM-5 Sets Open-Weights Record as Google Gemini Dominates New Math and Logic Benchmarks

Large Language Models & Releases

GLM-5 Official Release: GLM-5 has launched as the new leader among open-weights models, growing from the previous generation's 355B parameters to 744B and trained on 28.5T tokens. The model integrates DeepSeek Sparse Attention (DSA) to reduce deployment costs and is already available in GGUF format for local use via Unsloth.
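For a sense of what "local use" means at this scale, here is a rough back-of-envelope sketch of weight storage for the 744B parameter count quoted above. It assumes every parameter is stored at one uniform bit width, which is a simplification: real GGUF quantizations mix bit widths and add metadata, and the figures exclude KV cache and activations. The function name `weight_memory_gb` is purely illustrative.

```python
# Back-of-envelope weight storage for a model with a given parameter count.
# Assumption: all parameters share one bit width. Real GGUF files mix
# bit widths and carry metadata, so actual file sizes will differ.

def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


GLM5_PARAMS = 744e9  # total parameter count quoted in the release summary

if __name__ == "__main__":
    for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4), ("2-bit", 2)]:
        print(f"{label}: ~{weight_memory_gb(GLM5_PARAMS, bits):,.0f} GB")
```

Under these assumptions the weights alone come to roughly 1,488 GB at 16-bit and about 372 GB at 4-bit, which is why low-bit GGUF builds matter for anyone hoping to run the model outside a datacenter.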

Google Gemini Evolution and Aletheia: Google is reportedly preparing updates for the Gemini family, with sightings of a "Gemini 3.1 Pro Preview" and hints from leadership about a release "even better" than Gemini 3 Pro. Additionally, a specialized math version called Aletheia has been revealed, achieving a perfect score on the International Mathematical Olympiad (IMO).

Industry Announcements & Partnerships

NVIDIA Standardizes on OpenAI Codex: NVIDIA is rolling out OpenAI Codex to its 30,000 engineers to handle complex workflows and context management. The partnership includes customized cloud-managed admin controls and strict US-only processing fail-safes for the company-wide deployment.

NVIDIA appears to be standardizing on OpenAI Codex

Mistral AI Invests in European Infrastructure: Mistral AI's CEO has announced a €1.2 billion investment in a Swedish data center to strengthen Europe's AI infrastructure. Along with the investment, the company is calling for greater European unity to remain competitive in the global AI race.

Mistral boss calls for European unity in AI race, as pledges €1.2bn Swedish data centre investment
- https://tech.eu/2026/02/11/mistral-boss-calls-for-european-unity-in-ai-race-as-pledges-1-2bn-swedish-data-centre-investment

Benchmarks & Technical Advancements

The Car Wash Test for Logic: A new common-sense reasoning benchmark called the "Car Wash Test" has been introduced, challenging AI models with simple text logic. Currently, only Google's Gemini models (Pro and Fast) have passed the test, while ChatGPT 5.2 failed to solve the riddle on its first attempt.

The Car Wash Test: A new and simple benchmark for text logic. Only Gemini (pro and fast) solved the riddle.
- https://www.reddit.com/gallery/1r2ndfz

Samsung REAM Model Compression: Samsung has introduced REAM, which it presents as a more efficient alternative to the REAP method for compressing AI models. REAM aims to preserve more of the model's information and to reduce the quality loss, often called model "lobotomization," that aggressive compression tends to cause.

Lobotomy-less REAP by Samsung (REAM)