Claude Opus 4.5 Crushes Benchmarks as Microsoft Drops Real-Time TTS Model
New AI Models & Releases
Claude Opus 4.5 achieves 95% on CORE-Bench HARD: Anthropic’s latest model scores 95% on CORE-Bench HARD, a benchmark that tests an AI’s ability to reproduce the code and results of a published scientific paper. That puts it well ahead of GPT-5.1 Codex Max (40%) and previous Claude versions.
Microsoft releases VibeVoice-Realtime-0.5B: A lightweight real-time text-to-speech (TTS) model that accepts streaming text input and begins generating speech in about 300 ms, making it suitable for live applications such as narration or giving an LLM a voice.
Hermes 4.3 (36B) released by Nous Research: An uncensored, Apache 2.0-licensed model fine-tuned from Seed-OSS-36B-Base. The version trained in a distributed fashion outperforms its centrally trained counterpart. The model is aimed at local and open-source use.
AI Tools & Developer Resources
Deep Chat: Open-source Mistral-powered chat component: A feature-rich web component for adding AI chat to any website, compatible with Mistral, OpenAI, and other providers. It offers a customizable UI and API connectors, and is MIT-licensed.
Drive AI: Agentic file management workspace: An AI-powered tool for creating, organizing, and sharing files via natural language. Features include automatic file sorting, Gmail/Outlook integration, and multi-AI collaboration via an MCP server.
Maxim: Debugging & tracing platform for AI workflows: Addresses challenges in tracing multi-step AI workflows, debugging complex interactions, and integrating human review. Offers evaluation tools, real-time monitoring, and alerting for production reliability.
Security & Vulnerabilities
Critical Next.js vulnerability (CVE-2025-66478): A severe security flaw in Next.js could allow arbitrary code execution or data leaks. Developers are urged to upgrade to a patched Next.js release immediately.