
Poetiq’s ARC-AGI Breakthrough & Google’s Continual Learning Leap Redefine AI Limits

AI Benchmarks & Breakthroughs

Poetiq Achieves SOTA on ARC-AGI 2 Public Eval: Poetiq reported a new state-of-the-art (SOTA) score of 75% on the ARC-AGI 2 Public Eval benchmark, above the human average of 60%, using GPT-5.2 X-High at a cost of $8 per task. The approach uses multi-agent verification to reduce hallucinations, though it remains expensive and slow. Results on the private dataset are pending verification.


Continual Learning Breakthrough with Google’s Nested Learning: Google introduced Nested Learning, a new ML paradigm claimed to "solve" continual learning, the longstanding challenge of getting models to retain old knowledge while learning new tasks. The paper follows prior advancements like Q* and Strawberry, signaling rapid progress in adaptive AI systems.


AI Models & Releases

Uncensored Qwen3-Next-80B-Thinking: Multiverse Computing released a modified version of Qwen3-Next-80B with Chinese political censorship removed. The model uses steering vectors to disable refusal behavior on sensitive topics without supervised fine-tuning, while preserving both benchmark performance and robustness against jailbreaks.
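
To make the steering-vector idea concrete, here is a minimal sketch of one common variant: estimate a "refusal direction" as the difference between mean activations on refusing and complying prompts, then project that direction out of a hidden state. This is an assumption about the general technique, not Multiverse Computing's published method; the function names and the toy three-dimensional activations are hypothetical.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(v, v))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def refusal_direction(refusing_acts, complying_acts):
    """Unit vector from the mean complying activation to the mean refusing one."""
    mean_refusing = mean_vector(refusing_acts)
    mean_complying = mean_vector(complying_acts)
    return normalize([r - c for r, c in zip(mean_refusing, mean_complying)])


def ablate_direction(hidden, direction):
    """Remove the component of a hidden state that lies along a unit direction."""
    coefficient = dot(hidden, direction)
    return [h - coefficient * d for h, d in zip(hidden, direction)]


# Toy activations (hypothetical): only the first coordinate differs between groups.
refusing = [[2.0, 0.0, 1.0], [4.0, 0.0, 1.0]]
complying = [[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]

direction = refusal_direction(refusing, complying)
steered = ablate_direction([5.0, 2.0, 1.0], direction)
```

In a real model the same projection would be applied to residual-stream activations at chosen layers during inference, which is why no fine-tuning pass is needed.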


Maincoder-1B: High-Performance 1B-Parameter Coding Model: Australian startup Maincode released Maincoder-1B, a compact (1B parameter) open-source model scoring 76% on HumanEval, a strong result for its size. Optimized for low-latency inference and released under the Apache 2.0 license, it targets local/offline coding, batch refactoring, and program synthesis.


Plano-Orchestrator: Efficient LLMs for Agent Orchestration: Katanemo’s Plano-Orchestrator is a new LLM family designed to optimize multi-agent systems by dynamically routing tasks to specialized agents. Integrated into the Plano proxy/dataplane, it improves performance in chat, coding, and multi-turn conversations while reducing latency.
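
The routing step can be pictured with a toy dispatcher. The sketch below is only a stand-in: Plano-Orchestrator makes this decision with an LLM, whereas this example scores agents by keyword overlap purely to show the shape of a "request in, agent name out" router. The `Agent` class, the `route` function, and the agent names are all hypothetical and are not part of Plano's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    """A specialized agent described by the keywords it handles (hypothetical)."""
    name: str
    keywords: frozenset


def route(request, agents, default):
    """Return the name of the agent whose keywords best overlap the request.

    Falls back to `default` when no agent matches any word in the request.
    """
    words = set(request.lower().split())
    best = max(agents, key=lambda agent: len(agent.keywords & words))
    return best.name if best.keywords & words else default


AGENTS = [
    Agent("coding", frozenset({"python", "bug", "refactor", "code"})),
    Agent("billing", frozenset({"invoice", "refund", "payment"})),
]

chosen = route("please fix this python bug", AGENTS, default="chat")
fallback = route("hello there", AGENTS, default="chat")
```

A learned router replaces the overlap score with a model prediction, but the surrounding contract (a default route plus a set of named specialists) can stay the same.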


AI Tools & Frameworks

Mistral Vibe v1.3.0 Update: Mistral AI’s latest release adds Agent Skills (predefined instruction/script bundles) to improve task accuracy, along with native terminal themes, reasoning model support, and bug fixes. The focus is on usability for agent-based workflows.


Infrastructure & Hardware

Atlas Eon 100: Scalable DNA Data Storage: Atlas Data Storage unveiled the Atlas Eon 100, which it describes as the first scalable DNA-based storage system, offering 60 PB in 60 cubic inches (1,000x denser than tape) with millennia-long durability. Targeted at archiving AI training datasets, it requires no active power for long-term storage.
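
The density figures above can be checked with a few lines of arithmetic. This sketch uses only the numbers quoted in the item and assumes decimal units (1 PB = 1,000 TB); the tape figure it derives is the baseline implied by the "1,000x" claim, not a measured value for any specific tape product.

```python
TB_PER_PB = 1_000  # decimal units: 1 PB = 1,000 TB

capacity_pb = 60        # quoted capacity
volume_cubic_in = 60    # quoted volume
density_ratio = 1_000   # quoted advantage over tape

# Storage density of the DNA system in PB per cubic inch.
dna_pb_per_cubic_in = capacity_pb / volume_cubic_in

# Tape density implied by the quoted ratio, in TB per cubic inch.
implied_tape_tb_per_cubic_in = dna_pb_per_cubic_in * TB_PER_PB / density_ratio

print(f"DNA: {dna_pb_per_cubic_in:.1f} PB per cubic inch")
print(f"Implied tape baseline: {implied_tape_tb_per_cubic_in:.1f} TB per cubic inch")
```

The quoted figures work out to 1 PB per cubic inch, which implies a tape baseline of about 1 TB per cubic inch.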