Anthropic Launches Claude Sonnet 4.6 as DeepSeek v4 Benchmarks Leak Online
Large Language Models & Updates
Anthropic Releases Claude Sonnet 4.6: Anthropic has launched Claude Sonnet 4.6, which introduces "Adaptive Thinking" to adjust reasoning depth to task complexity. The release offers a 1M token context window in beta and shows significant improvements in coding, computer use, and spatial reasoning.
- Sonnet 4.6 released !!
- Difference Between Sonnet 4.5 and Sonnet 4.6 on a Spatial Reasoning Benchmark (MineBench)
- Sonnet 4.6 released!! Wen gpt 5.3 ??
- Claude Sonnet 4.6 available now
DeepSeek v4 Release Rumors and Leaks: Leaked benchmark data has fueled rumors that DeepSeek v4 will be released soon. According to the leaks, the model scores 83.7% on SWE-Bench Verified, which would be a record if confirmed and could make it a leading model for coding tasks.
Open Source & Research
PrimeIntellect Launches INTELLECT-3.1 Reasoning Model: PrimeIntellect has released INTELLECT-3.1, a 106B parameter Mixture-of-Experts (MoE) model open-sourced under MIT and Apache 2.0 licenses. The model underwent specialized reinforcement learning for math, coding, and agentic tasks using the prime-rl framework.
GLM-5 Technical Report Unveiled: A new technical report details the development of GLM-5, highlighting its use of asynchronous RL infrastructure and agent-specific algorithms. The model aims for state-of-the-art performance in complex, real-world software engineering tasks.
Matmul-Free Model Trained on CPU: FlashLM-v3-13m, a 13.6M parameter language model, was successfully trained on a CPU in just 1.2 hours. By using ternary weights and eliminating matrix multiplications, the model achieves highly efficient inference using only addition and subtraction operations.
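To illustrate why ternary weights remove the need for multiplication, here is a minimal sketch of a matrix-vector product in which every weight is -1, 0, or +1. This is not FlashLM's actual code; the function name ternary_matvec and the list-of-lists weight layout are assumptions made for the example.

```python
from typing import List


def ternary_matvec(weights: List[List[int]], x: List[float]) -> List[float]:
    """Compute y = W x for a ternary matrix W using only addition and subtraction.

    Hypothetical helper for illustration; each entry of `weights` must be -1, 0, or +1.
    """
    out = []
    for row in weights:
        if len(row) != len(x):
            raise ValueError("row length must match input length")
        acc = 0.0
        for w, xi in zip(row, x):
            if w == 1:
                acc += xi      # weight +1: add the input
            elif w == -1:
                acc -= xi      # weight -1: subtract the input
            elif w != 0:
                raise ValueError("weights must be -1, 0, or +1")
            # weight 0: the input is skipped entirely
        out.append(acc)
    return out


if __name__ == "__main__":
    W = [[1, 0, -1],
         [-1, -1, 1]]
    x = [2.0, 3.0, 5.0]
    print(ternary_matvec(W, x))
```

Because each weight only selects "add", "subtract", or "skip", the inner loop never multiplies, which is the property the summary above attributes to the model's inference path. A practical implementation would pack the weights into bit fields and vectorize the accumulation rather than loop in pure Python.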
Industry & Business News
Mistral AI Acquires Cloud Startup Koyeb: Mistral AI has announced the acquisition of serverless cloud provider Koyeb to bolster its full-stack AI capabilities. Koyeb’s team of 16 engineers will join Mistral to focus on optimizing GPU usage and enhancing inference scaling for AI-native applications.