Mistral AI Replaces Manual Coding With Agents While OpenAI Reshuffles Product Strategy
Industry News & Corporate Strategy
OpenAI Reorganizes Product Strategy Under Greg Brockman: Greg Brockman has taken permanent control of OpenAI's product strategy, consolidating ChatGPT and Codex into a unified experience focused on agentic capabilities. The move follows reports that OpenAI insiders are dissatisfied with the quality of Apple's ChatGPT integration.
- OpenAI feels “burned” by Apple’s crappy ChatGPT integration, insiders say
- Greg Brockman Officially Takes Control of OpenAI’s Products in Latest Shakeup
Mistral AI Shifts to Fully Automated Coding Workflow: Mistral AI's founder told the French Parliament that the company's engineers no longer write code by hand and instead supervise AI agents that generate code from specifications. The shift has produced significant individual productivity gains, though scaling those gains across whole teams remains an organizational challenge.
- Mistral AI founder to French Parliament: "Engineers at Mistral no longer write a single line of code"
Cybersecurity & AI Safety
Anthropic's Mythos AI Demonstrates Advanced Hacking Capabilities: Mythos AI succeeded on 18 of 41 n-day exploits in one comparison, versus 1 of 41 for the model the source post calls "5.5" and none for open-weights models. It also helped researchers build a kernel exploit against Apple's M5 security in just five days. The AI-assisted exploit bypassed Apple's Memory Integrity Enforcement (MIE) and was reported to Apple ahead of a full technical disclosure.
- Elite researchers teamed up with Anthropic’s Mythos AI to smash Apple’s multi-billion dollar M5 security and build a kernel exploit in just 5 days.
- More evidence of Mythos's strength in Cybersecurity/Hacking - compared to 5.5, it got 18/41 n-day exploits, vs 1/41. Open Source/Weights models get nothing
Model Developments & Benchmarks
Qwen3.6 and Qwen3.5 Models Excel in Agentic and Reasoning Benchmarks: The Qwen3.6-35B-A3B model has officially topped the Terminal-Bench 2.0 leaderboard, outperforming Gemini 2.5 Pro. Furthermore, a new method for dynamically allocating compute budget using Qwen-35B-A3B has shown results approaching the performance of GPT-5.4-xHigh on the HLE benchmark.
- Qwen3.6-35B-A3B and 9B are officially on the public Terminal-Bench 2.0 leaderboard!
- Dynamically allocating compute budget to hard set of problems and evolving the sections with Qwen-35B-A3B gets you near GPT-5.4-xHigh on HLE
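As a rough illustration of what "dynamically allocating compute budget" to harder problems can mean, the sketch below splits a fixed number of model attempts across problems in proportion to a difficulty score. It is not the method from the linked post: the function name `allocate_budget`, the difficulty scores, and the per-problem floor are all assumptions made for this example, and the "evolving the sections" step is not modelled.

```python
from typing import List


def allocate_budget(difficulty: List[float], total: int, floor: int = 1) -> List[int]:
    """Split `total` attempts across problems, giving harder problems more.

    Hypothetical helper: every problem gets `floor` attempts, and the rest is
    shared in proportion to `difficulty` using largest-remainder rounding.
    """
    n = len(difficulty)
    if n == 0:
        return []
    if total < floor * n:
        raise ValueError("total budget is too small for the per-problem floor")

    spare = total - floor * n
    weight_sum = sum(difficulty)
    if weight_sum <= 0:
        # No difficulty signal: fall back to an even split.
        shares = [spare / n] * n
    else:
        shares = [spare * d / weight_sum for d in difficulty]

    budget = [floor + int(s) for s in shares]

    # Hand out the attempts lost to rounding, largest fractional part first.
    leftover = total - sum(budget)
    by_fraction = sorted(range(n), key=lambda i: shares[i] - int(shares[i]), reverse=True)
    for i in by_fraction[:leftover]:
        budget[i] += 1
    return budget


if __name__ == "__main__":
    # Three problems with assumed difficulty scores 1, 3 and 6; 23 attempts in total.
    print(allocate_budget([1, 3, 6], total=23))
```

In practice the difficulty score would have to come from somewhere, for example the disagreement among a first round of cheap samples. That choice is left open here.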
Inference Optimization & Technical Innovation
New Architectures and Kernels Drastically Speed Up Local Inference: Orthrus-Qwen3-8B uses a diffusion attention module on a frozen Qwen3-8B backbone to produce up to 7.8x more tokens per forward pass, with an output distribution that is provably identical to the base model's. Meanwhile, projects such as Open-dLLM and the Luce Megakernel are pushing performance limits on NVIDIA GPUs, with one post asking whether a 5090 running Qwen3.6 can exceed 3,000 tokens per second.
- Orthrus-Qwen3-8B : up to 7.8×tokens/forward on Qwen3-8B, frozen backbone, provably identical output distribution
- Can a 5090 with qwen3.6 achieve > 3,000 tok/s ? bring your pitchforks (open-dllm)
- https://oval-shell-31c.notion.site/Open-dLLM-Open-Diffusion-Large-Language-Model-25e03bf6136480b7a4ebe3d53be9f68a
- https://arxiv.org/pdf/2605.07933v1
- https://x.com/Viacheslav91112/status/2054613430082957443?s=20
- https://github.com/scrya-com/Open-dLLM
- https://wandb.ai/snoozie/Qwen3.6-35B-A3B-LDLM?nw=nwusersnoozie
- http://vast.ai
- https://www.reddit.com/gallery/1tee5ms
- Luce Megakernal: Why nobody is taking about this?
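For readers wondering how a faster drafting mechanism can keep an output distribution identical to the base model, the sketch below shows the standard speculative-sampling accept/resample rule. This is a well-known general technique, not a description of Orthrus, whose exact mechanism is not given in the summary above. The function name and the toy distributions are assumptions for illustration.

```python
import random
from typing import Dict


def accept_or_resample(
    draft_token: str,
    p_target: Dict[str, float],
    q_draft: Dict[str, float],
    rng: random.Random,
) -> str:
    """Accept a drafted token or resample so the result follows p_target exactly.

    The token is kept with probability min(1, p/q). On rejection, a replacement
    is drawn from the normalised residual max(0, p - q).
    """
    p = p_target.get(draft_token, 0.0)
    q = q_draft[draft_token]
    if rng.random() < min(1.0, p / q):
        return draft_token

    # Rejection implies p < q for the drafted token, so the residual is non-zero.
    residual = {t: max(0.0, p_target[t] - q_draft.get(t, 0.0)) for t in p_target}
    threshold = rng.random() * sum(residual.values())
    running = 0.0
    chosen = draft_token
    for token, weight in residual.items():
        if weight <= 0.0:
            continue
        chosen = token
        running += weight
        if threshold < running:
            break
    return chosen


if __name__ == "__main__":
    rng = random.Random(0)
    target = {"a": 0.5, "b": 0.3, "c": 0.2}   # toy base-model distribution
    draft = {"a": 0.2, "b": 0.2, "c": 0.6}    # toy drafter distribution
    counts = {t: 0 for t in target}
    for _ in range(20000):
        proposal = rng.choices(list(draft), weights=list(draft.values()))[0]
        counts[accept_or_resample(proposal, target, draft, rng)] += 1
    print({t: round(c / 20000, 3) for t, c in counts.items()})
```

Although tokens are proposed from the drafter's distribution, the empirical frequencies of the returned tokens track the target distribution. Guarantees of "identical output distribution" in draft-and-verify schemes usually rest on this property.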
Hardware & Open Source Tools
Innovations in Local AI Agents and Edge Robotics: A developer has built "Sparky," a fully offline suitcase robot that runs Gemma 4 E4B on a Jetson Orin NX SUPER 16GB with 30+ sensors and no WiFi, Bluetooth, or cellular connectivity. Additionally, the new open-source, self-hosted Equibles MCP server gives local LLMs access to real financial data, such as SEC filings, insider and congressional trades, short data, and FRED.
- Built a fully offline suitcase robot around a Jetson Orin NX SUPER 16GB. Gemma 4 E4B, ~200ms cached TTFT, 30+ sensors, no WiFi/BT/cellular. He has opinions.
- I built a self-hosted open-source MCP server that gives any local LLM real financial data — SEC filings, 13F, insider & congressional trades, short data, FRED