Claude Opus 4.7 Benchmarks Split as Alibaba Launches High-Efficiency Qwen3.6 MoE Model
Large Language Models & Performance
Claude Opus 4.7 Performance Benchmarks: Benchmarks for the new Claude Opus 4.7 reveal a significant regression in thematic generalization compared with version 4.6, with the model struggling to maintain constraints in complex reasoning tasks. Comparative data nonetheless suggests it remains highly competitive in other areas, frequently matching or exceeding GPT-5.4 and Gemini 3.1 Pro in agentic coding and search.
- Claude Opus 4.7 (high) unexpectedly performs significantly worse than Opus 4.6 (high) on the Thematic Generalization Benchmark: 80.6 → 72.8.
- Claude Opus 4.7 benchmarks
Open-Source & Local Models
Qwen3.6-35B-A3B Release and Updates: Alibaba has released Qwen3.6-35B-A3B, a sparse Mixture of Experts (MoE) model with 35B total and 3B active parameters that reportedly outperforms models with ten times as many active parameters. This release introduces a "preserve_thinking" option to improve cache efficiency, and an "Uncensored Aggressive" community variant is already available for users who want refusal-free output.
- Qwen3.6-35B-A3B released!
- Qwen3.6-35B-A3B Uncensored Aggressive is out with K_P quants!
- PSA: Qwen3.6 ships with preserve_thinking. Make sure you have it on.
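The cache argument behind preserve_thinking can be illustrated in isolation: a server's prefix cache covers exactly the text it has already processed, so stripping earlier thinking blocks from the next request's history makes the prompt diverge from the cached prefix partway through the conversation. The sketch below is illustrative only — the flag name comes from the linked PSA, while the rendering format and tags are invented for the example:

```python
def common_prefix_len(a: str, b: str) -> int:
    """Length of the shared prefix of two strings -- a stand-in for how
    much of a server-side KV-cache the next request could reuse."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

# Text the server has already processed and cached: the first turn plus
# the assistant's full reply, including its <think> block.
served = "user: What is 2+2?\nassistant: <think>2+2=4</think>It is 4."

# Next turn, keeping the thinking block: the cached text is an exact prefix.
next_keep = served + "\nuser: And 3+3?"

# Next turn, with the thinking block stripped client-side: the prompt
# diverges from the cache right where <think> used to start.
next_strip = served.replace("<think>2+2=4</think>", "") + "\nuser: And 3+3?"

hit_keep = common_prefix_len(served, next_keep)
hit_strip = common_prefix_len(served, next_strip)

assert hit_keep == len(served)   # full cache reuse
assert hit_strip < hit_keep      # cache invalidated mid-conversation
```

With the flag off, every new turn pays to re-prefill the whole history past the first stripped block, which is why the PSA recommends leaving it on.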
PrismML Releases Ternary Bonsai Models: PrismML has launched Ternary Bonsai, a family of 1.58-bit language models in 8B, 4B, and 1.7B parameter sizes. By using ternary weights, the models achieve a memory footprint roughly 9x smaller than standard 16-bit models while staying competitive for their size.
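The ~9x figure follows from the bit width: a ternary weight encodes log2(3) ≈ 1.58 bits versus 16 for fp16, an ideal ratio of about 10x, and keeping embeddings and norms in higher precision eats the remainder. PrismML has not detailed its quantizer here; the sketch below uses absmean ternarization, the scheme popularized by BitNet b1.58, purely as a representative stand-in:

```python
import math

def ternarize(weights):
    """Absmean ternarization (the BitNet b1.58 recipe, shown only as a
    representative scheme): scale by the mean |w|, then round each
    weight into {-1, 0, +1}."""
    scale = sum(abs(w) for w in weights) / len(weights) or 1.0
    return [max(-1, min(1, round(w / scale))) for w in weights], scale

q, s = ternarize([0.42, -1.3, 0.05, 0.9, -0.2])
# q is now [1, -1, 0, 1, 0]; at inference, W is approximated by s * q.

# Footprint ratio vs fp16: 16 bits / log2(3) bits ~ 10x in the ideal
# packed case; keeping embeddings and norms in 16-bit lands near 9x.
ratio = 16 / math.log2(3)
print(f"ideal compression: {ratio:.1f}x")
```

In practice packed ternary formats store values in fixed-width groups rather than at exactly 1.58 bits, which also shaves a little off the ideal ratio.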
Product & Feature Launches
Perplexity AI Launches "Personal Computer" Feature: Perplexity AI has introduced "Personal Computer," a new capability within its Mac application that allows for secure orchestration across local files, native apps, and browsers. The feature is currently rolling out to Perplexity Max subscribers and those on the waitlist to improve local workflow automation.