DeepSeek’s Engram & Baichuan-M3-235B Push LLM Limits as GPT-5.2 Solves Spherical Packing

New AI Models & Research

DeepSeek Introduces Engram: Memory Lookup Module for Next-Gen LLMs
DeepSeek has open-sourced Engram, a memory-augmented module for LLMs that uses hashed N-gram embeddings for deterministic O(1) lookup, improving performance on knowledge, reasoning, code, and math tasks. The module decouples memory from compute, treating them as separate scaling axes, and delivers consistent gains across benchmarks.
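The core idea, hashing N-grams directly into an embedding table so retrieval is a constant-time index rather than a search, can be sketched as follows. This is a minimal illustration under assumed parameters (bucket count, embedding size, MD5 hashing); it is not DeepSeek's actual Engram design.

```python
import hashlib
import numpy as np

class HashedNGramMemory:
    """Toy hashed N-gram embedding table with deterministic O(1) lookup.
    Bucket count, dimension, and hash choice are illustrative assumptions,
    not the Engram implementation."""

    def __init__(self, num_buckets=100_000, dim=32, n=3, seed=0):
        rng = np.random.default_rng(seed)
        self.n = n
        self.num_buckets = num_buckets
        # One embedding row per hash bucket; collisions are simply accepted.
        self.table = rng.standard_normal((num_buckets, dim), dtype=np.float32)

    def _bucket(self, ngram):
        # Deterministic hash -> bucket index: no search, no training-time index.
        digest = hashlib.md5(" ".join(ngram).encode()).digest()
        return int.from_bytes(digest[:8], "little") % self.num_buckets

    def lookup(self, tokens):
        # Average the embeddings of all n-grams in the token window.
        ngrams = [tuple(tokens[i:i + self.n])
                  for i in range(len(tokens) - self.n + 1)]
        if not ngrams:
            return np.zeros(self.table.shape[1], dtype=np.float32)
        rows = [self.table[self._bucket(g)] for g in ngrams]
        return np.mean(rows, axis=0)

mem = HashedNGramMemory()
vec = mem.lookup(["the", "speed", "of", "light"])
```

Because the table is indexed by hash rather than learned retrieval, memory capacity (bucket count) can be scaled independently of model compute, which is the "separate scaling axes" point above.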


Baichuan-M3-235B: Medical LLM Surpassing GPT-5.2 in Benchmarks
Baichuan AI’s Baichuan-M3-235B is a next-gen medical LLM optimized for clinical decision-making, achieving lower hallucination rates and higher efficiency than GPT-5.2 via techniques like Fact-Aware RL and W4 quantization.


GPT-5.2 Pro Agent Breaks Spherical Packing Record
OpenAI’s GPT-5.2 Pro Agent set a new record in experimental mathematics by finding an improved spherical packing configuration, outperforming prior human and computational results.


NVIDIA Research: Real-Time Model Weight Updates for Continual Learning
NVIDIA’s new research enables real-time weight updates in AI models, addressing catastrophic forgetting and misalignment but requiring pruning to manage model growth.


AI Products & Services

Claude Cowork: AI-Assisted Task Automation (Research Preview)
Anthropic’s Cowork feature for Claude Max (macOS) allows AI to interact with local files—organizing folders, editing documents, and generating drafts from notes—via a research preview for subscribers.


Shopify CEO Builds MRI Viewer with Claude AI
Shopify’s Tobi Lutke used Claude AI to create a lightweight, HTML-based MRI viewer from raw scan data on a USB drive, demonstrating LLMs’ potential to replace specialized software tools.


SurfSense: Open-Source Alternative to Glean/NotebookLM
SurfSense is an OSS platform connecting LLMs to internal knowledge sources (drives, calendars, etc.) with deep agentic capabilities, role-based access, and support for 100+ models.


4B Text2SQL Model Matches 685B LLM Performance
A 4B-parameter SLM fine-tuned by Distil Labs matches the accuracy of a 685B-parameter LLM at converting English queries to SQL, while running locally at high speed.


Headroom: Tool Output Compression for AI Agents (60-90% Token Reduction)
Headroom is an open-source tool that compresses agent tool outputs, cutting token usage by 60-90%: it filters outputs down to the relevant data (errors, outliers, query matches) with minimal latency.
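The filtering idea, keep error lines, query matches, and a little leading context, drop the rest, can be sketched in a few lines. This is an illustrative heuristic with assumed names and thresholds, not Headroom's actual algorithm.

```python
import re

def compress_tool_output(output: str, query: str = "", keep_head: int = 3) -> str:
    """Toy compression of an agent tool result: keep error/warning lines,
    lines matching the query, and the first few lines for context; collapse
    everything dropped into a single ellipsis marker."""
    lines = output.splitlines()
    error_pat = re.compile(r"\b(error|fail|exception|warn)", re.IGNORECASE)
    kept = []
    for i, line in enumerate(lines):
        if (i < keep_head
                or error_pat.search(line)
                or (query and query.lower() in line.lower())):
            kept.append(line)
        elif kept and kept[-1] != "...":
            kept.append("...")  # one marker per dropped run, not per line
    return "\n".join(kept)
```

On a long log where only a handful of lines are errors or matches, this kind of filter yields the large token reductions the tool claims, since most of the output is boilerplate the agent never needs.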


GitNexus: Client-Side Code Intelligence Engine
GitNexus is an open-source, browser-based engine for deep codebase analysis (imports, calls, defines) with graph queries, semantic search, and integration with Claude Code/Cursor.


Major Partnerships & Announcements

Apple’s Siri to Integrate Google Gemini
Apple and Google announced Siri will use Gemini AI starting with iOS 26.4, enabling deeper personal context understanding and app control. Elon Musk criticized the deal for centralizing power.


AI Research & Insights

The Hidden Memory Problem in Coding Agents
The post explores challenges in memory management for coding agents, advocating for structured techniques like compressed memory, intent-driven retrieval, and strategic forgetting over brute-force context dumping.
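The structured techniques the post advocates can be sketched as a tiny memory store: compressed summaries instead of raw context, keyword-based intent retrieval, and eviction of the oldest entries when over budget ("strategic forgetting"). The class name, eviction policy, and matching logic here are illustrative assumptions, not taken from the post.

```python
from collections import OrderedDict

class AgentMemory:
    """Toy structured memory for a coding agent: stores short summaries
    rather than raw tool output, retrieves by intent keyword, and forgets
    the oldest entries once a fixed budget is exceeded."""

    def __init__(self, max_entries=4):
        self.entries = OrderedDict()
        self.max_entries = max_entries

    def remember(self, key, summary):
        # Store a compressed summary, not the full context dump.
        self.entries[key] = summary
        self.entries.move_to_end(key)
        while len(self.entries) > self.max_entries:
            self.entries.popitem(last=False)  # strategic forgetting: drop oldest

    def retrieve(self, intent):
        # Intent-driven retrieval: naive substring match on keys and summaries.
        needle = intent.lower()
        return [s for k, s in self.entries.items()
                if needle in k.lower() or needle in s.lower()]
```

The contrast with brute-force context dumping is that the agent's prompt only ever receives the handful of summaries `retrieve` returns, not the full history.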