GLM-5.2 Surpasses GPT-5.5 in Benchmarks While Perplexity Launches AI "Brain" System

Large Language Models & Benchmarks

GLM-5.2 Achieves Benchmark Breakthroughs and Local Accessibility: GLM-5.2 has surpassed Opus 4.8 on coding tasks and GPT-5.5 on agentic knowledge-work evaluations. The model can now be run locally through llama.cpp and Unsloth Studio, and several Hugging Face inference providers recently offered it free of charge for a limited period.

Claude Fable 5 and Kimi 2.7 Debut on Software Engineering Benchmark: Claude Fable 5 and Kimi 2.7 Code have made their first appearance on DeepSWE, a software engineering benchmark. The benchmark evaluates how well models handle real-world coding scenarios and complex software development tasks.

AI Agents & Research

Perplexity Launches 'Brain in Computer' Memory System: Perplexity AI has introduced a continuously learning memory system that builds a context graph of a user's past projects and decisions to assist the company's Computer agent. The feature is currently in research preview for Max subscribers, and Perplexity claims it significantly improves answer correctness and recall while reducing task costs.

Ohio State University Open-Sources QUEST-35B Deep Research Agent: Ohio State University researchers have released QUEST-35B, an open-source deep research agent trained on synthetic samples using 32 H100 GPUs. The team has published the full training recipe, code, weights, and datasets, and reports performance competitive with proprietary frontier systems.

Industry & Regulation

Select Companies Retain Anthropic Mythos Access Despite Shutdown: Approximately 200 organizations, including major firms such as JPMorgan Chase and AWS, still have access to Anthropic's Mythos Preview through the "Project Glasswing" program. Access continues despite a US government order halting broader availability because the program focuses specifically on collaborative cybersecurity vulnerability research.