12 Mar 2026

NVIDIA’s $26 Billion Open-Weight Bet and Anthropic’s Move Toward Self-Improving AI

Industry Strategy & Corporate Roadmaps

Anthropic Advances Toward Recursive Self-Improvement: Anthropic reports that Claude itself now generates 70% to 90% of the code for its upcoming models, signaling a move toward fully automated AI research within a year. CEO Dario Amodei is calling for government intervention and corporate transparency as the company anticipates rapid AI advances and potential job displacement between 2026 and 2030.

Claude 4.6 Experiment: "Can you use whatever resources you like, and python, to generate a short 'youtube poop' video and render it using ffmpeg? It should express what it's like to be a LLM."
- https://x.com/josephdviviano/status/2031196768424132881

Nvidia Commits $26 Billion to Open-Weight AI Models: Nvidia plans to invest $26 billion in developing open-weight AI models that can run on various hardware platforms. The investment is intended to maintain Nvidia's dominance in the hardware market by keeping its GPUs the preferred choice for running a wide range of accessible AI models.

Nvidia Will Spend $26 Billion to Build Open-Weight AI Models, Filings Show
- https://www.wired.com/story/nvidia-investing-26-billion-open-source-models/

Model Releases & Benchmarks

NVIDIA Releases Nemotron 3 Super: NVIDIA has launched Nemotron 3 Super, a 120B Mixture of Experts (MoE) model featuring a hybrid Mamba-Transformer architecture. The model is designed for agentic reasoning and is released alongside open datasets for pretraining and reinforcement learning to encourage developer customization.

Nemotron 3 Super Released
- https://developer.nvidia.com/blog/introducing-nemotron-3-super-an-open-hybrid-mamba-transformer-moe-for-agentic-reasoning/?nvid=nv-int-csfg-844859

Stealth Models Hunter Alpha and Healer Alpha Appear on OpenRouter: Two new unattributed AI models, Hunter Alpha and Healer Alpha, have appeared on OpenRouter and are suspected to be part of the DeepSeek V4 series. The models exhibit characteristics and political alignments typical of Chinese AI models, and rumors suggest they may be omnimodal with over 1 million parameters.

Two new Stealth models on OpenRouter: Hunter Alpha & Healer Alpha
- https://x.com/aibattle_/status/2031834303827681727?s=46
- https://www.reddit.com/gallery/1rr7433

Performance Benchmarking of Qwen3.5-397B on Blackwell GPUs: Extensive testing of the Qwen3.5-397B model on RTX PRO 6000 Blackwell GPUs revealed a maximum sustained decode speed of 50.5 tokens per second. The benchmarks also identified a specific bug in NVIDIA's CUTLASS kernels that currently prevents the use of native FP4 compute on SM120 hardware.

I spent 8+ hours benchmarking every MoE backend for Qwen3.5-397B NVFP4 on 4x RTX PRO 6000 (SM120). Here's what I found.

Consumer Products & Integration

Perplexity Launches Personal Computer Service: Perplexity AI announced "Personal Computer," a service that integrates its cloud-based "Perplexity Computer" digital worker with a local, always-on Mac mini. For $200 a month, users gain access to a persistent AI agent capable of handling complex research, design, and automation workflows across multiple top-tier models like Claude and Gemini.

OpenAI to Integrate Sora Video Generation into ChatGPT: OpenAI plans to add its Sora AI video generator to ChatGPT in an effort to revitalize its user base. The integration aims to enhance user engagement by providing high-end video creation tools directly within the chatbot interface.

OpenAI plans to include Sora AI video generator within ChatGPT to revive declining user base

Developer Tools & AI Architecture

llama.cpp Introduces Reasoning Budgets: A new update to llama.cpp lets users set a "reasoning budget" that caps the number of tokens a model can spend on its internal reasoning. The feature gives better control over computational resources and ensures that models conclude their reasoning within user-defined limits.

Llama.cpp now with a true reasoning budget!
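
To make the idea concrete, here is a minimal sketch of how a reasoning budget can be enforced in a generation loop. It is not llama.cpp's implementation: the `</think>` and `<eos>` markers, the `generate_with_budget` function, and the `toy_model` stand-in are all assumptions made up for illustration. The loop counts reasoning tokens and, once the budget is spent, forces the end-of-reasoning marker so the model moves on to its answer.

```python
from typing import Callable, List

THINK_END = "</think>"  # assumed end-of-reasoning marker (hypothetical)
EOS = "<eos>"           # assumed end-of-sequence marker (hypothetical)


def generate_with_budget(next_token: Callable[[List[str]], str],
                         reasoning_budget: int,
                         max_tokens: int = 64) -> List[str]:
    """Generate tokens, capping the reasoning phase at `reasoning_budget` tokens."""
    out: List[str] = []
    reasoning = True   # generation is assumed to start in the reasoning phase
    used = 0           # reasoning tokens emitted so far
    while len(out) < max_tokens:
        if reasoning and used >= reasoning_budget:
            tok = THINK_END        # budget exhausted: force reasoning to end
        else:
            tok = next_token(out)  # otherwise let the model choose
        if tok == EOS:
            break
        out.append(tok)
        if reasoning:
            if tok == THINK_END:
                reasoning = False
            else:
                used += 1
    return out


def toy_model(context: List[str]) -> str:
    """Stand-in model: 'thinks' until reasoning is closed, then answers in two tokens."""
    if THINK_END not in context:
        return "hmm"
    after = context[context.index(THINK_END) + 1:]
    answer = ["final", "answer"]
    return answer[len(after)] if len(after) < len(answer) else EOS


tokens = generate_with_budget(toy_model, reasoning_budget=3)
```

With a budget of 3, the toy model emits three reasoning tokens, is cut off by the forced marker, and then produces its answer. A budget of 0 skips reasoning entirely.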

New Architectural Approach for AI Agents: A former backend lead at Manus argues that AI agents should be given a single Unix-style command-line tool instead of traditional function calling. The approach leverages an LLM's existing knowledge of command-line interfaces to navigate complex file systems and apps more efficiently.

I was backend lead at Manus. After building agents for 2 years, I stopped using function calling entirely. Here's what I use instead.
- https://github.com/epiral/pinix
- https://github.com/epiral/agent-clip
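
As a rough illustration of the single-tool idea, the sketch below exposes one `run(command)` entry point that parses a shell-style string and dispatches it to a small command table. It is not taken from the linked repositories: the in-memory `FILES` store, the `ls`/`cat`/`grep` handlers, and the `run` function are hypothetical names chosen for this example. The agent would see only one tool and compose familiar commands as text.

```python
import shlex
from typing import Callable, Dict, List

# Hypothetical in-memory "file system" so the example has no side effects.
FILES: Dict[str, str] = {
    "notes/todo.txt": "ship release\nwrite docs\n",
    "notes/ideas.txt": "cli agents\n",
}


def cmd_ls(args: List[str]) -> str:
    """List paths, optionally restricted to a directory prefix."""
    prefix = args[0].rstrip("/") + "/" if args else ""
    return "\n".join(sorted(p for p in FILES if p.startswith(prefix)))


def cmd_cat(args: List[str]) -> str:
    """Return the contents of one file."""
    if not args:
        return "cat: missing file operand"
    return FILES.get(args[0], f"cat: {args[0]}: no such file")


def cmd_grep(args: List[str]) -> str:
    """Return the lines of a file that contain a literal pattern."""
    if len(args) != 2:
        return "usage: grep PATTERN FILE"
    pattern, path = args
    if path not in FILES:
        return f"grep: {path}: no such file"
    return "\n".join(l for l in FILES[path].splitlines() if pattern in l)


COMMANDS: Dict[str, Callable[[List[str]], str]] = {
    "ls": cmd_ls,
    "cat": cmd_cat,
    "grep": cmd_grep,
}


def run(command: str) -> str:
    """The single tool exposed to the agent: parse a command line and dispatch it."""
    try:
        argv = shlex.split(command)
    except ValueError as exc:  # e.g. an unbalanced quote
        return f"parse error: {exc}"
    if not argv:
        return "error: empty command"
    handler = COMMANDS.get(argv[0])
    if handler is None:
        available = ", ".join(sorted(COMMANDS))
        return f"{argv[0]}: command not found (available: {available})"
    return handler(argv[1:])
```

Errors come back as plain text rather than exceptions, so the model can read the message and retry with a corrected command, much as a person would at a terminal.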